Vertica meta-functions

Syntax

DROP_LICENSE( 'license-name' )

Parameters

license-name: The name of the license to drop. Use the name (or long license key) in the NAME column of system table LICENSES.

Privileges

Superuser

Examples

=> SELECT DROP_LICENSE('9b2d81e2-aab1-4cfb-bc07-fa9a696e8f5e');

1.2 - DUMP_CATALOG

Returns an internal representation of the Vertica catalog. This function is used for diagnostic purposes.

DUMP_CATALOG returns only the objects that are visible to the user.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DUMP_CATALOG()

Privileges

None

Examples

The following query obtains an internal representation of the Vertica catalog:

=> SELECT DUMP_CATALOG();

To write the output to a file, redirect vsql output with \o before running the query:

\o /tmp/catalog.txt
SELECT DUMP_CATALOG();
\o

1.3 - EXPORT_CATALOG

Note

This function and EXPORT_OBJECTS return equivalent output.

Generates a SQL script for recreating a physical schema design on another cluster.

The SQL script conforms to the following requirements:

Only includes objects to which the user has access.
Orders CREATE statements according to object dependencies so they can be recreated in the correct sequence. For example, if a table is in a non-PUBLIC schema, the required CREATE SCHEMA statement precedes the CREATE TABLE statement. Similarly, a table's CREATE ACCESS POLICY statement follows the table's CREATE TABLE statement.
If possible, creates projections with their KSAFE clause, if any, otherwise with their OFFSET clause.
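
The dependency ordering described in this list can be pictured as a topological sort. The following Python sketch only illustrates the idea and is not how Vertica implements the export; the object names (store, store.orders) and the deps mapping are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical catalog objects, each mapped to the statements it depends on.
deps = {
    "CREATE SCHEMA store": set(),
    "CREATE TABLE store.orders": {"CREATE SCHEMA store"},
    "CREATE ACCESS POLICY ON store.orders": {"CREATE TABLE store.orders"},
}

def ordered_statements(dependencies):
    """Return statements so that each one follows everything it depends on."""
    return list(TopologicalSorter(dependencies).static_order())

script = ordered_statements(deps)
for statement in script:
    print(statement)
```

Because every statement appears after its prerequisites, replaying the list from top to bottom never references an object that does not exist yet.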

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

EXPORT_CATALOG ( ['[destination]' [, '[scope]']] )

Parameters

Note

If you omit all parameters, this function exports to standard output all objects to which you have access.

destination

Specifies where to send output, one of the following:

An empty string ('') writes the script to standard output.
The path and name of a SQL output file. This option is valid only for superusers. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.

scope

Determines what to export. Within the specified scope, EXPORT_CATALOG exports all the objects to which you have access:

DESIGN (default): Exports all catalog objects: schemas, tables, constraints, views, access policies, projections, SQL macros, and stored procedures.
DESIGN_ALL: Deprecated
TABLES: Exports all tables and their access policies. See also EXPORT_TABLES.
DIRECTED_QUERIES: Exports all directed queries that are stored in the database. For details, see Managing directed queries.

Privileges

None

Examples

See Exporting the catalog.

1.4 - EXPORT_OBJECTS

Note

This function and EXPORT_CATALOG return equivalent output.

Generates a SQL script you can use to recreate non-virtual catalog objects on another cluster.

The SQL script conforms to the following requirements:

Only includes objects to which the user has access.
Orders CREATE statements according to object dependencies so they can be recreated in the correct sequence. For example, if a table is in a non-PUBLIC schema, the required CREATE SCHEMA statement precedes the CREATE TABLE statement. Similarly, a table's CREATE ACCESS POLICY statement follows the table's CREATE TABLE statement.
If possible, creates projections with their KSAFE clause, if any, otherwise with their OFFSET clause.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

EXPORT_OBJECTS( ['[destination]' [, '[scope]'] [, 'mark-ksafe']] )

Parameters

Note

If you omit all parameters, this function exports to standard output all objects to which you have access.

destination

Specifies where to send output, one of the following:

An empty string ('') writes the script to standard output.
The path and name of a SQL output file. This option is valid only for superusers. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.

scope

Specifies one or more objects to export as a comma-delimited list:

{ [database.]schema[.object] | [[database.]schema.]object }[,...]

If set to an empty string, Vertica exports all objects to which the user has access.
If you specify a schema only, Vertica exports all objects in that schema.
If you specify a database, it must be the current database.

mark-ksafe

Boolean argument that specifies whether the generated script calls the Vertica function MARK_DESIGN_KSAFE. If set to true (default), MARK_DESIGN_KSAFE uses the correct K-safe argument for the current database.

Privileges

None

Examples

See Exporting objects.

1.5 - EXPORT_TABLES

Generates a SQL script that can be used to recreate a logical schema—schemas, tables, constraints, and views—on another cluster. EXPORT_TABLES only exports objects to which the user has access.

The SQL script conforms to the following requirements:

Only includes objects to which the user has access.
Orders CREATE statements according to object dependencies so they can be recreated in the correct sequence. For example, if a table references a named sequence, a CREATE SEQUENCE statement precedes the CREATE TABLE statement. Similarly, a table's CREATE ACCESS POLICY statement follows the table's CREATE TABLE statement.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

EXPORT_TABLES( ['[destination]' [, '[scope]']] )

Note

If you omit all parameters, EXPORT_TABLES exports to standard output all tables to which you have access.

Parameters

destination

Specifies where to send output, one of the following:

An empty string ('') writes the script to standard output.
The path and name of a SQL output file. This option is valid only for superusers. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.

scope

Specifies one or more tables to export, as follows:

[database.]schema[.table][,...]

If set to an empty string, Vertica exports all non-virtual table objects to which you have access, including table schemas, sequences, and constraints.
If you specify a schema, Vertica exports all non-virtual table objects in that schema.
If you specify a database, it must be the current database.
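
As a rough illustration of how the destination and scope arguments fit together, the Python sketch below assembles an EXPORT_TABLES call as a SQL string. The helper names (sql_literal, export_tables_call) and the schema and table names are hypothetical, and the sketch only builds text; it does not connect to a database.

```python
def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"

def export_tables_call(destination="", tables=()):
    """Build a SELECT EXPORT_TABLES(...) statement from a destination and a table list."""
    scope = ",".join(tables)  # scope is passed as one comma-delimited string
    return f"SELECT EXPORT_TABLES({sql_literal(destination)}, {sql_literal(scope)});"

# Empty destination: standard output. Empty scope: all accessible tables.
all_tables = export_tables_call()
# Hypothetical names: one table plus every table in a schema.
some_tables = export_tables_call("", ["store.orders", "public"])

print(all_tables)
print(some_tables)
```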

Privileges

None

Examples

See Exporting tables.

1.6 - INSTALL_LICENSE

Installs the license key in the global catalog.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

INSTALL_LICENSE( 'filename' )

Parameters

filename: The absolute path name of a valid license file.

Privileges

Superuser

Examples

=> SELECT INSTALL_LICENSE('/tmp/vlicense.dat');

1.7 - MARK_DESIGN_KSAFE

Enables or disables high availability in your environment, in case of a failure. Before enabling recovery, MARK_DESIGN_KSAFE queries the catalog to determine whether a cluster's physical schema design meets the following requirements:

Small, unsegmented tables are replicated on all nodes.
Large table superprojections are segmented with each segment on a different node.
Each large table projection has at least one buddy projection for K-safety=1 (or two buddy projections for K-safety=2).

Buddy projections are also segmented across database nodes, but the distribution is modified so segments that contain the same data are distributed to different nodes. See High availability with projections.

MARK_DESIGN_KSAFE does not change the physical schema.
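
The buddy-projection requirement can be expressed as a simple read-only check. The Python sketch below is a simplified model rather than Vertica's actual catalog query: it assumes the buddy count per projection is already known, and the projection names and counts are hypothetical.

```python
def projections_lacking_buddies(buddy_counts, k):
    """Return the projections whose buddy count is below the requested K-safety level."""
    return sorted(name for name, buddies in buddy_counts.items() if buddies < k)

# Hypothetical buddy counts for three large-table projections.
buddy_counts = {"pp1": 0, "pp2": 0, "pp3": 1}

failing = projections_lacking_buddies(buddy_counts, k=1)
print("design is 1-safe" if not failing else f"not 1-safe: {failing}")
```

Like MARK_DESIGN_KSAFE itself, the check only inspects the design; it never modifies it.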

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

MARK_DESIGN_KSAFE ( k )

Parameters

k

Specifies the level of K-safety, one of the following:

2: Enables high availability if the schema design meets requirements for K-safety=2
1: Enables high availability if the schema design meets requirements for K-safety=1
0: Disables high availability

Privileges

Superuser

Return messages

If you specify a k value of 1 or 2, Vertica returns one of the following messages.

Success:

 Marked design n-safe

Failure:

 The schema does not meet requirements for K=n.
 Fact table projection projection-name
 has insufficient "buddy" projections.

where n is a K-safety setting.

Notes

The database's internal recovery state persists across database restarts, but it is not checked at startup time.
When one node fails on a system marked K-safe=1, the remaining nodes are available for DML operations.

Examples

=> SELECT MARK_DESIGN_KSAFE(1);
  mark_design_ksafe
----------------------
 Marked design 1-safe
(1 row)

If the physical schema design is not K-safe, messages indicate which projections do not have a buddy:

=> SELECT MARK_DESIGN_KSAFE(1);
The given K value is not correct;
the schema is 0-safe
Projection pp1 has 0 buddies,
which is smaller that the given K of 1
Projection pp2 has 0 buddies,
which is smaller that the given K of 1
.
.
.
(1 row)

1.8 - RELOAD_ADMINTOOLS_CONF

Updates the admintools.conf on each UP node in the cluster. Updates include:

IP addresses and catalog paths
Node names for all nodes in the current database

This function provides a manual method to instruct the server to update admintools.conf on all UP nodes. For example, if you restart a node, call this function to confirm its admintools.conf file is accurate.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RELOAD_ADMINTOOLS_CONF()

Privileges

Examples

Update admintools.conf on each UP node in the cluster:

=> SELECT RELOAD_ADMINTOOLS_CONF();
  RELOAD_ADMINTOOLS_CONF
--------------------------
 admintools.conf reloaded
(1 row)

2 - Client connection management functions

This section contains client connection management functions specific to Vertica.

2.1 - DESCRIBE_LOAD_BALANCE_DECISION

Evaluates if any load balancing routing rules apply to a given IP address and describes how the client connection would be handled. This function is useful when you are evaluating connection load balancing policies you have created, to ensure they work the way you expect them to.

You pass this function an IP address of a client connection, and it uses the load balancing routing rules to determine how the connection will be handled. The logic this function uses is the same logic used when Vertica load balances client connections, including determining which nodes are available to handle the client connection.
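
The rule evaluation can be approximated as testing the client address against each rule's source filter in turn. The Python sketch below is a simplified model, not Vertica's implementation: it assumes the first matching rule wins, it ignores node availability, and the rule table (including the group name group_etl) is hypothetical.

```python
import ipaddress

# Hypothetical routing rules: (rule name, source CIDR filter, load balance group).
RULES = [
    ("etl_rule", "10.20.100.0/24", "group_etl"),
    ("internal_clients", "192.168.1.0/24", "group_1"),
    ("subnet_192", "192.0.0.0/8", "group_all"),
]

def match_rule(client_ip, rules=RULES):
    """Return (rule, group) for the first rule whose filter contains client_ip, else None."""
    address = ipaddress.ip_address(client_ip)
    for name, cidr, group in rules:
        if address in ipaddress.ip_network(cidr):
            return name, group
    return None

for ip in ("192.168.1.25", "192.168.2.25", "1.2.3.4"):
    print(ip, "->", match_rule(ip))
```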

This function assumes the client connection has opted into being load balanced. If actual clients have not opted into load balancing, the connections will not be redirected. See Load balancing in ADO.NET, Load balancing in JDBC, and Load balancing in ODBC, for information on enabling load balancing on the client. For vsql, use the -C command-line option to enable load balancing.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESCRIBE_LOAD_BALANCE_DECISION('ip_address')

Arguments

'ip_address': An IP address of a client connection to be tested against the load balancing rules. This can be either an IPv4 or IPv6 address.

Return value

A step-by-step description of how the load balancing rules are being evaluated, including the final decision of which node in the database has been chosen to service the connection.

Privileges

None.

Examples

The following example demonstrates calling DESCRIBE_LOAD_BALANCE_DECISION with three different IP addresses, two of which are handled by different routing rules, and one of which is not handled by any rule.

=> SELECT describe_load_balance_decision('192.168.1.25');
                        describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248] port [5433]

(1 row)

=> SELECT describe_load_balance_decision('192.168.2.25');
                        describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [192.168.2.25]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
matches this rule
Matched to load balance group [group_all] the group has policy [ROUNDROBIN]
number of addresses [3]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
(2) LB Address: [10.20.100.249]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248] port [5433]

(1 row)

=> SELECT describe_load_balance_decision('1.2.3.4');
                         describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [1.2.3.4]
Load balance cache internal version id (node-local): [2]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address does not match source ip filter for this rule.
Considered rule [subnet_192] source ip filter [192.0.0.0/8]... input address
does not match source ip filter for this rule.
Routing table decision: No matching routing rules: input address does not match
any routing rule source filters. Details: [Tried some rules but no matching]
No rules matched. Falling back to classic load balancing.
Classic load balance decision: Classic load balancing considered, but either
the policy was NONE or no target was available. Details: [NONE or invalid]

(1 row)

The following example demonstrates calling DESCRIBE_LOAD_BALANCE_DECISION repeatedly with the same IP address. You can see that the load balance group's ROUNDROBIN load balance policy has it switch between the two nodes in the load balance group:

=> SELECT describe_load_balance_decision('192.168.1.25');
                       describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248]
port [5433]

(1 row)

=> SELECT describe_load_balance_decision('192.168.1.25');
                        describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [0]
Routing table decision: Success. Load balance redirect to: [10.20.100.247]
port [5433]

(1 row)

=> SELECT describe_load_balance_decision('192.168.1.25');
                         describe_load_balance_decision
--------------------------------------------------------------------------------
 Describing load balance decision for address [192.168.1.25]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_rule] source ip filter [10.20.100.0/24]... input address
does not match source ip filter for this rule.
Considered rule [internal_clients] source ip filter [192.168.1.0/24]... input
address matches this rule
Matched to load balance group [group_1] the group has policy [ROUNDROBIN]
number of addresses [2]
(0) LB Address: [10.20.100.247]:5433
(1) LB Address: [10.20.100.248]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.20.100.248]
port [5433]

(1 row)

2.2 - GET_CLIENT_LABEL

Returns the client connection label for the current session.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_CLIENT_LABEL()

Privileges

None

Examples

Return the current client connection label:

=> SELECT GET_CLIENT_LABEL();
   GET_CLIENT_LABEL
-----------------------
 data_load_application
(1 row)

2.3 - RESET_LOAD_BALANCE_POLICY

Resets the counter that each host in the cluster maintains to track which host it will refer a client to when the native connection load balancing scheme is set to ROUNDROBIN. To reset the counter, run this function on all cluster nodes.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RESET_LOAD_BALANCE_POLICY()

Privileges

Superuser

Examples

=> SELECT RESET_LOAD_BALANCE_POLICY();

                        RESET_LOAD_BALANCE_POLICY
-------------------------------------------------------------------------
Successfully reset stateful client load balance policies: "roundrobin".
(1 row)

2.4 - SET_CLIENT_LABEL

Assigns a label to a client connection for the current session. You can use this label to distinguish client connections.

Labels appear in the v_monitor.sessions table. However, only certain Data collector tables show new client labels set by SET_CLIENT_LABEL. For example, DC_REQUESTS_ISSUED reflects changes by SET_CLIENT_LABEL, while DC_SESSION_STARTS, which collects login data before SET_CLIENT_LABEL can be run, does not.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_CLIENT_LABEL('label-name')

Parameters

label-name: VARCHAR name assigned to the client connection label.

Privileges

None

Examples

Assign label data_load_application to the current client connection:

=> SELECT SET_CLIENT_LABEL('data_load_application');
             SET_CLIENT_LABEL
-------------------------------------------
 client_label set to data_load_application
(1 row)

2.5 - SET_LOAD_BALANCE_POLICY

Sets how native connection load balancing chooses a host to handle a client connection.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_LOAD_BALANCE_POLICY('policy')

Parameters

policy

The name of the load balancing policy to use, one of the following:

NONE (default): Disables native connection load balancing.
ROUNDROBIN: Chooses the next host from a circular list of hosts in the cluster that are up—for example, in a three-node cluster, iterates over node1, node2, and node3, then wraps back to node1. Each host in the cluster maintains its own pointer to the next host in the circular list, rather than there being a single cluster-wide state.
RANDOM: Randomly chooses a host from among all hosts in the cluster that are up.
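
The per-host pointer behavior of ROUNDROBIN can be modeled in a few lines. The Python sketch below is illustrative only: the HostPointer class and the node names are hypothetical, and it assumes the list of hosts that are up does not change.

```python
from itertools import cycle

class HostPointer:
    """One host's private pointer into the circular list of hosts that are up."""

    def __init__(self, up_hosts):
        self._cycle = cycle(list(up_hosts))

    def next_host(self):
        """Return the next host in the circular list and advance the pointer."""
        return next(self._cycle)

up_hosts = ["node1", "node2", "node3"]
# Each host keeps its own pointer, so the sequences advance independently.
pointer_on_node1 = HostPointer(up_hosts)
pointer_on_node2 = HostPointer(up_hosts)

first_four = [pointer_on_node1.next_host() for _ in range(4)]
other_first = pointer_on_node2.next_host()
print(first_four)
print(other_first)
```

Because no state is shared, two clients that connect through different hosts can be referred to the same node even under ROUNDROBIN.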

Note

Even if the load balancing policy is set on the server to something other than NONE, clients must indicate they want their connections to be load balanced by setting a connection property.

Privileges

Superuser

Examples

The following example demonstrates enabling native connection load balancing on the server by setting the load balancing scheme to ROUNDROBIN:

=> SELECT SET_LOAD_BALANCE_POLICY('ROUNDROBIN');
                  SET_LOAD_BALANCE_POLICY
--------------------------------------------------------------------------------
Successfully changed the client initiator load balancing policy to: roundrobin
(1 row)

3 - Cloud management functions

This section contains functions for managing cloud integrations. See also Hadoop functions for HDFS and AWS library functions for the S3 Export UDx.

3.1 - AZURE_TOKEN_CACHE_CLEAR

Clears the cached access token for Azure. Call this function after changing the configuration of Azure managed identities.

An Azure object store can support and manage multiple identities. If multiple identities are in use, Vertica looks for an Azure tag with a key of VerticaManagedIdentityClientId, the value of which must be the client_id attribute of the managed identity to be used. If the Azure configuration changes, use this function to clear the cache.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

AZURE_TOKEN_CACHE_CLEAR ( )

Privileges

Superuser

4 - Cluster management functions

This section contains functions that manage spread deployment on large, distributed database clusters.

4.1 - REALIGN_CONTROL_NODES

Causes Vertica to re-evaluate which nodes in the cluster or subcluster are control nodes and which nodes are assigned to them as dependents when large cluster is enabled. Call this function after altering fault groups in an Enterprise Mode database, or changing the number of control nodes in either database mode. After calling this function, query the V_CATALOG.CLUSTER_LAYOUT system table to see the proposed new layout for nodes in the cluster. You must also take additional steps before the new control node assignments take effect. See Changing the number of control nodes and realigning for details.

Note

In Vertica versions prior to 10.0.1, control node assignments weren't restricted to be within the same Eon Mode subcluster. If you attempt to realign control nodes in a subcluster whose control nodes have dependents in other subclusters, this function returns an error. In this case, you must realign the control nodes in those other subclusters first. Realigning the other subclusters fixes the cross-subcluster dependencies, allowing you to realign the control nodes in the original subcluster you attempted to realign.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

In Enterprise Mode:

REALIGN_CONTROL_NODES()

In Eon Mode:

REALIGN_CONTROL_NODES('subcluster_name')

Parameters

subcluster_name: The name of the subcluster where you want to realign control nodes. Only the nodes in this subcluster are affected. Other subclusters are unaffected. Only allowed when the database is running in Eon Mode.

Privileges

Examples

In an Enterprise Mode database, choose control nodes from all nodes and assign the remaining nodes to a control node:

=> SELECT REALIGN_CONTROL_NODES();

In an Eon Mode database, re-evaluate the control node assignments in the subcluster named analytics:

=> SELECT REALIGN_CONTROL_NODES('analytics');

4.2 - REBALANCE_CLUSTER

Rebalances the database cluster synchronously as a session foreground task. REBALANCE_CLUSTER returns only after the rebalance operation is complete. If the current session ends, the operation immediately aborts. To rebalance the cluster as a background task, call START_REBALANCE_CLUSTER.

On large cluster arrangements, you typically call REBALANCE_CLUSTER in a flow (see Changing the number of control nodes and realigning). After you change the number and distribution of control nodes (spread hosts), run REBALANCE_CLUSTER to achieve fault tolerance.

For detailed information about rebalancing tasks, see Rebalancing data across nodes.

Tip

By default, before performing a rebalance, Vertica queries system tables to compute the size of all projections involved in the rebalance task. This query can add significant overhead to the rebalance operation. To disable this query, set projection configuration parameter RebalanceQueryStorageContainers to 0.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

REBALANCE_CLUSTER()

Privileges

Superuser

Examples

=> SELECT REBALANCE_CLUSTER();
REBALANCE_CLUSTER
-------------------
 REBALANCED
(1 row)

4.3 - RELOAD_SPREAD

Updates cluster changes to the catalog's Spread configuration file. These changes include:

New or realigned control nodes
New Spread hosts or fault groups
New or dropped cluster nodes

This function is often used in a multi-step process for large and elastic cluster arrangements. Calling it might require you to restart the database. You must then rebalance the cluster to realize fault tolerance. For details, see Defining and Realigning Control Nodes.

Caution

In an Eon Mode database, using this function could result in the database becoming read-only. Nodes may become disconnected after you call this function. If the database no longer has primary shard coverage without these nodes, it goes into read-only mode to maintain data integrity. Once the nodes rejoin the cluster, the database will resume normal operation. See Maintaining Shard Coverage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RELOAD_SPREAD( true )

Parameters

true: Updates cluster changes related to control message responsibilities to the Spread configuration file.

Privileges

Examples

Update the cluster with changes to control messaging:

=> SELECT reload_spread(true);
 reload_spread
---------------
 reloaded
(1 row)

4.4 - SET_CONTROL_SET_SIZE

Sets the number of control nodes that participate in the spread service when large cluster is enabled. If the database is running in Enterprise Mode, this function sets the number of control nodes for the entire database cluster. If the database is running in Eon Mode, this function sets the number of control nodes in the subcluster you specify. See Large cluster for more information.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

In Enterprise Mode:

SET_CONTROL_SET_SIZE( control_nodes )

In Eon Mode:

SET_CONTROL_SET_SIZE('subcluster_name', control_nodes )

Parameters

subcluster_name

The name of the subcluster where you want to set the number of control nodes. Only allowed when the database is running in Eon Mode.

control_nodes

The number of control nodes to assign to the cluster (when in Enterprise Mode) or subcluster (when in Eon Mode). Value can be one of the following:

Positive integer value: Vertica assigns the number of control nodes you specify to the cluster or subcluster. This value can be larger than the current node count. This value cannot be larger than 120 (the maximum number of control nodes for a database). In Eon Mode, the total of this value plus the number of control nodes set for all other subclusters cannot be more than 120.
-1: Makes every node in the cluster or subcluster a control node. This value effectively disables large cluster for the cluster or subcluster.
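
The constraints on control_nodes can be summarized as a small validation routine. The Python sketch below is a hypothetical client-side check, not part of Vertica; the function name and the other_subclusters_total parameter are assumptions, and the sketch does not model how -1 interacts with the 120-node limit.

```python
MAX_CONTROL_NODES = 120  # database-wide maximum stated in the parameter description

def validate_control_nodes(control_nodes, other_subclusters_total=0):
    """Return True if control_nodes is an acceptable SET_CONTROL_SET_SIZE value.

    other_subclusters_total is the number of control nodes already set for all
    other subclusters (Eon Mode); leave it at 0 for Enterprise Mode.
    """
    if control_nodes == -1:  # every node becomes a control node
        return True
    if control_nodes < 1:  # only -1 or a positive integer is allowed
        return False
    return control_nodes + other_subclusters_total <= MAX_CONTROL_NODES

print(validate_control_nodes(5))
print(validate_control_nodes(100, other_subclusters_total=30))
```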

Privileges

Examples

In an Enterprise Mode database, set the number of control nodes for the entire cluster to 5:

=> SELECT set_control_set_size(5);
 SET_CONTROL_SET_SIZE
----------------------
 Control size set
(1 row)

5 - Cluster scaling functions

This section contains functions that control how the cluster organizes data for rebalancing.

5.1 - CANCEL_REBALANCE_CLUSTER

Stops any rebalance task that is currently in progress or is waiting to execute.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CANCEL_REBALANCE_CLUSTER()

Privileges

Superuser

Examples

=> SELECT CANCEL_REBALANCE_CLUSTER();
 CANCEL_REBALANCE_CLUSTER
--------------------------
 CANCELED
(1 row)

5.2 - DISABLE_LOCAL_SEGMENTS

Disables local data segmentation, which breaks projection segments on nodes into containers that can be easily moved to other nodes. See Local data segmentation for details.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DISABLE_LOCAL_SEGMENTS()

Privileges

Superuser

Examples

=> SELECT DISABLE_LOCAL_SEGMENTS();
 DISABLE_LOCAL_SEGMENTS
------------------------
 DISABLED
(1 row)

5.3 - ENABLE_ELASTIC_CLUSTER

Enables elastic cluster scaling, which makes enlarging or reducing the size of your database cluster more efficient by segmenting a node's data into chunks that can be easily moved to other hosts.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ENABLE_ELASTIC_CLUSTER()

Privileges

Superuser

Examples

=> SELECT ENABLE_ELASTIC_CLUSTER();
 ENABLE_ELASTIC_CLUSTER
------------------------
 ENABLED
(1 row)

5.4 - ENABLE_LOCAL_SEGMENTS

Enables local storage segmentation, which breaks projection segments on nodes into containers that can be easily moved to other nodes. See Local data segmentation for more information.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ENABLE_LOCAL_SEGMENTS()

Privileges

Superuser

Examples

=> SELECT ENABLE_LOCAL_SEGMENTS();
 ENABLE_LOCAL_SEGMENTS
-----------------------
 ENABLED
(1 row)

5.5 - SET_SCALING_FACTOR

Sets the scaling factor that determines the number of storage containers used when rebalancing the database and when local data segmentation is enabled. See Cluster Scaling for details.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_SCALING_FACTOR( factor )

Parameters

factor: An integer value between 1 and 32. Vertica uses this value to calculate the number of storage containers each projection is broken into when rebalancing or when local data segmentation is enabled.

Privileges

Superuser

Best practices

The scaling factor determines the number of storage containers that Vertica uses to store each projection across the database during rebalancing when local segmentation is enabled. When setting the scaling factor, follow these guidelines:

The number of storage containers should be greater than or equal to the number of partitions multiplied by the number of local segments:

num-storage-containers >= ( num-partitions * num-local-segments )
Set the scaling factor high enough so rebalance can transfer local segments to satisfy the skew threshold, but small enough that the number of storage containers does not produce too many ROS containers and cause ROS pushback. The maximum number of ROS containers (by default 1024) is set by configuration parameter ContainersPerProjectionLimit.
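
As a rough way to check the second guideline, you can count storage containers per projection on each node and compare the largest counts with the ContainersPerProjectionLimit value. The following is a sketch only: it assumes the V_MONITOR.STORAGE_CONTAINERS system table with its NODE_NAME, SCHEMA_NAME, and PROJECTION_NAME columns, and the LIMIT of 10 is arbitrary.

```sql
-- Projections with the most storage containers on any one node.
-- Compare container_count with ContainersPerProjectionLimit (default 1024).
SELECT node_name,
       schema_name,
       projection_name,
       COUNT(*) AS container_count
FROM v_monitor.storage_containers
GROUP BY node_name, schema_name, projection_name
ORDER BY container_count DESC
LIMIT 10;
```

If the largest counts are already close to the limit, a higher scaling factor is likely to make ROS pushback more probable.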

Examples

=> SELECT SET_SCALING_FACTOR(12);
 SET_SCALING_FACTOR
--------------------
 SET
(1 row)
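
To confirm the new setting, you can query system table ELASTIC_CLUSTER. This sketch assumes the SCALING_FACTOR, IS_ENABLED, and IS_LOCAL_SEGMENT_ENABLED columns of that table; verify the column names against your Vertica version.

```sql
-- Show the current scaling factor and whether elastic cluster
-- and local segmentation are enabled.
SELECT scaling_factor,
       is_enabled,
       is_local_segment_enabled
FROM elastic_cluster;
```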

5.6 - START_REBALANCE_CLUSTER

Asynchronously rebalances the database cluster as a background task. This function returns immediately after it starts the rebalancing operation. Rebalancing persists until the operation is complete, even if you close the current session or the database shuts down. In the case of shutdown, rebalancing resumes after the cluster restarts. To stop the rebalance operation, call CANCEL_REBALANCE_CLUSTER.

For detailed information about rebalancing tasks, see Rebalancing data across nodes.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

START_REBALANCE_CLUSTER()

Privileges

Superuser

Examples

=> SELECT START_REBALANCE_CLUSTER();
 START_REBALANCE_CLUSTER
-------------------------
 REBALANCING
(1 row)
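
Because the task runs in the background, a typical session starts it, checks on it later, and cancels it only if necessary. The sketch below assumes that system table ELASTIC_CLUSTER exposes an IS_REBALANCE_RUNNING column; treat that column name as an assumption to verify.

```sql
-- Start the background rebalance; the call returns without waiting.
SELECT START_REBALANCE_CLUSTER();

-- Later, check whether the rebalance is still running
-- (column name assumed, see note above).
SELECT is_rebalance_running FROM elastic_cluster;

-- Stop a running or queued rebalance task if needed.
SELECT CANCEL_REBALANCE_CLUSTER();
```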

6 - Communications functions

This section contains communication functions specific to Vertica.

6.1 - NOTIFY

Sends a specified message to a NOTIFIER.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

NOTIFY ( 'message', 'notifier', 'target-topic' )

Parameters

message

The message to send to the endpoint.

notifier

The name of the NOTIFIER.

target-topic

String that specifies one of the following based on the notifier type:

Kafka: The name of an existing destination Kafka topic for the message.

Note
If the topic doesn't already exist, you can configure your Kafka broker to automatically create the specified topic. For more information, see the Kafka documentation.
Syslog: The ProblemDescription subject and channel value.

Privileges

Superuser

Examples

Send a message to confirm that an ETL job is complete:

=> SELECT NOTIFY('ETL Done!', 'my_notifier', 'DB_activity_topic');
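
The same call can target a syslog notifier. The following sketch assumes that syslog notifiers are enabled for the database and that a syslog notifier named v_syslog_notifier already exists (the name is borrowed from the SET_DATA_COLLECTOR_NOTIFY_POLICY example in this reference); the third argument, 'ETL status', is an arbitrary subject string chosen for illustration.

```sql
-- Write the message to syslog through an existing syslog notifier.
-- For a syslog notifier, the third argument supplies the
-- ProblemDescription subject rather than a Kafka topic.
SELECT NOTIFY('ETL Done!', 'v_syslog_notifier', 'ETL status');
```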

7 - Constraint management functions

This section contains constraint management functions specific to Vertica.

See also SQL system table V_CATALOG.TABLE_CONSTRAINTS.

7.1 - ANALYZE_CONSTRAINTS

Analyzes and reports on constraint violations within the specified scope.

You can enable automatic enforcement of primary key, unique key, and check constraints when INSERT, UPDATE, MERGE, or COPY statements execute. Alternatively, you can use ANALYZE_CONSTRAINTS to validate constraints after issuing these statements. Refer to Constraint enforcement for more information.

ANALYZE_CONSTRAINTS locks the table that it analyzes in the same way that SELECT * FROM t1 holds a lock on table t1. See LOCKS for additional information.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ANALYZE_CONSTRAINTS ('[[[database.]schema.]table ]' [, 'column[,...]'] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Identifies the table to analyze. If you omit the schema, Vertica uses the current schema search path. If set to an empty string, Vertica analyzes all tables in the current schema.
column: The column in table to analyze. You can specify multiple comma-delimited columns. Vertica narrows the scope of the analysis to the specified columns. If you omit this argument, Vertica analyzes all columns in table.

Privileges

Schema: USAGE
Table: SELECT

Detecting constraint violations during a load process

Vertica checks for constraint violations when queries are run, not when data is loaded. To detect constraint violations as part of the load process, use a COPY statement with the NO COMMIT option. By loading data without committing it, you can run a post-load check of your data using the ANALYZE_CONSTRAINTS function. If the function finds constraint violations, you can roll back the load because you have not committed it.
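
The load-then-verify workflow described above might look like the following sketch. The table name public.t1, the file path, and the delimiter are hypothetical placeholders.

```sql
-- Load without committing, so the load can still be undone.
COPY public.t1 FROM '/data/t1.dat' DELIMITER '|' NO COMMIT;

-- Check the loaded rows for constraint violations.
SELECT ANALYZE_CONSTRAINTS('public.t1');

-- If the result set lists violations, discard the load:
ROLLBACK;

-- If the result set is empty, keep the load instead:
-- COMMIT;
```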

If ANALYZE_CONSTRAINTS finds violations, such as when you insert a duplicate value into a primary key, you can correct errors using the following functions. Their effects last only until the end of the session:

DISABLE_DUPLICATE_KEY_ERROR
REENABLE_DUPLICATE_KEY_ERROR

Important

If a check constraint SQL expression evaluates to an unknown for a given row because a column within the expression contains a null, the row passes the constraint condition.
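
The null behavior can be illustrated with a small sketch. The table public.orders_demo and constraint chk_qty are hypothetical, and the constraint is declared DISABLED so that the inserts succeed and the violation is left for ANALYZE_CONSTRAINTS to find.

```sql
-- Hypothetical table with a check constraint that is not enforced on load.
CREATE TABLE public.orders_demo (
    order_id INT,
    qty      INT,
    CONSTRAINT chk_qty CHECK (qty > 0) DISABLED
);

INSERT INTO public.orders_demo VALUES (1, 5);     -- satisfies qty > 0
INSERT INTO public.orders_demo VALUES (2, NULL);  -- qty > 0 evaluates to unknown
INSERT INTO public.orders_demo VALUES (3, -1);    -- violates qty > 0

-- Report check constraint violations on the qty column.
SELECT ANALYZE_CONSTRAINTS('public.orders_demo', 'qty');
```

Under the rule stated above, the row with the null qty passes the constraint condition, so only the row with qty = -1 should be reported.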

Return values

ANALYZE_CONSTRAINTS returns results in a structured set (see table below) that lists the schema name, table name, column name, constraint name, constraint type, and the column values that caused the violation.

If the result set is empty, then no constraint violations exist; for example:

=> SELECT ANALYZE_CONSTRAINTS ('public.product_dimension', 'product_key');
 Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
(0 rows)

The following result set shows a primary key violation, along with the value that caused the violation ('10'):

=> SELECT ANALYZE_CONSTRAINTS ('');
 Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
 store       | t1         | c1           | pk_t1           | PRIMARY         | ('10')
(1 row)

The result set columns are described in further detail in the following table:

Column Name	Data Type	Description
`Schema Name`	VARCHAR	The name of the schema.
`Table Name`	VARCHAR	The name of the table, if specified.
`Column Names`	VARCHAR	A list of comma-delimited columns that contain constraints.
`Constraint Name`	VARCHAR	The given name of the primary key, foreign key, unique, check, or not null constraint, if specified.
`Constraint Type`	VARCHAR	Identified by one of the following strings: `PRIMARY KEY`, `FOREIGN KEY`, `UNIQUE`, `CHECK`, `NOT NULL`
`Column Values`	VARCHAR	Values of the constraint columns in the violating row, in the same order as the columns listed in `Column Names`. When interpreted as SQL, the value of this column forms a list of values of the same type as the columns in `Column Names`; for example: `('1')`, `('1', 'z')`

Examples

See Detecting constraint violations.

7.2 - ANALYZE_CORRELATIONS

Deprecated

This function is deprecated and will be removed in a future release.

Analyzes the specified tables for pairs of columns that are strongly correlated. ANALYZE_CORRELATIONS stores the 20 pairs with the strongest correlation. ANALYZE_CORRELATIONS also analyzes statistics.

ANALYZE_CORRELATIONS analyzes only pairwise single-column correlations.

For example, city name and state name columns are strongly correlated because the city name usually, but perhaps not always, identifies the state name. The city of Conshohocken is uniquely associated with Pennsylvania, while the city of Boston exists in Georgia, Indiana, Kentucky, New York, Virginia, and Massachusetts. In this case, city name is strongly correlated with state name.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ANALYZE_CORRELATIONS ('[[[database.]schema.]table ]' [, 'recalculate'] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Identifies the table to analyze. If you omit the schema, Vertica uses the current schema search path. If set to an empty string, Vertica analyzes all tables in the current schema.
recalculate: Boolean that specifies whether to analyze correlated columns that were previously analyzed.

Note
Column correlation analysis typically needs to be done only once.

Default: false

Privileges

One of the following:

Superuser
User with USAGE privilege on the design schema

Examples

In the following example, ANALYZE_CORRELATIONS analyzes column correlations for all tables in the public schema, even if correlations were previously analyzed for them:

=> SELECT ANALYZE_CORRELATIONS ('public.*', 'true');
 ANALYZE_CORRELATIONS
----------------------
                    0
(1 row)

7.3 - DISABLE_DUPLICATE_KEY_ERROR

Disables error messaging when Vertica finds duplicate primary or unique key values at run time (for use with key constraints that are not automatically enabled). Queries execute as though no constraints are defined on the schema. Effects are session scoped.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DISABLE_DUPLICATE_KEY_ERROR()

Privileges

Superuser

Examples

When you call DISABLE_DUPLICATE_KEY_ERROR, Vertica issues warnings that duplicate values will be ignored and that incorrect results are possible. Use DISABLE_DUPLICATE_KEY_ERROR only with key constraints that are not automatically enabled.

=> select DISABLE_DUPLICATE_KEY_ERROR();
WARNING 3152:  Duplicate values in columns marked as UNIQUE will now be ignored for the remainder of your session or until reenable_duplicate_key_error() is called
WARNING 3539:  Incorrect results are possible. Please contact Vertica Support if unsure
 disable_duplicate_key_error
------------------------------
 Duplicate key error disabled
(1 row)

7.4 - LAST_INSERT_ID

Returns the last value of an AUTO_INCREMENT/IDENTITY column. If multiple sessions concurrently load the same table with an AUTO_INCREMENT/IDENTITY column, the function returns the last value generated for that column.

Note

This function works only with AUTO_INCREMENT/IDENTITY columns. It does not work with named sequences.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

LAST_INSERT_ID()

Privileges

Table owner
USAGE privileges on the table schema

Examples

See AUTO_INCREMENT and IDENTITY sequences.
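
As a minimal sketch of the call in context, the following statements create a table with an IDENTITY column, insert a row, and then read back the generated value. The table public.customers_demo and its columns are hypothetical.

```sql
-- Hypothetical table whose id column is generated automatically.
CREATE TABLE public.customers_demo (
    id   IDENTITY(1,1),
    name VARCHAR(40)
);

-- Insert a row without supplying a value for id.
INSERT INTO public.customers_demo (name) VALUES ('Ada');

-- Return the last value generated for the IDENTITY column.
SELECT LAST_INSERT_ID();
```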

7.5 - REENABLE_DUPLICATE_KEY_ERROR

Restores the default behavior of error reporting by reversing the effects of DISABLE_DUPLICATE_KEY_ERROR. Effects are session-scoped.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

REENABLE_DUPLICATE_KEY_ERROR()

Privileges

Superuser

Examples

=> SELECT REENABLE_DUPLICATE_KEY_ERROR();
 REENABLE_DUPLICATE_KEY_ERROR
------------------------------
 Duplicate key error enabled
(1 row)

8 - Data collector functions

The Vertica Data Collector is a utility that extends system table functionality by providing a framework for recording events. It gathers and retains monitoring information about your database cluster and makes that information available in system tables, requiring few configuration parameter tweaks, and having negligible impact on performance.

Collected data is stored on disk in the DataCollector directory under the Vertica /catalog path. You can use the information the Data Collector retains to query the past state of system tables and extract aggregate information, as well as do the following:

See what actions users have taken
Locate performance bottlenecks
Identify potential improvements to Vertica configuration
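
For instance, to see what actions users have taken, you might query a Data Collector table directly. This sketch assumes the table v_internal.dc_login_failures and its time, user_name, and reason columns, which are the names that appear in the syslog payload shown under SET_DATA_COLLECTOR_NOTIFY_POLICY; access to Data Collector tables may be restricted to superusers.

```sql
-- Ten most recent failed login attempts recorded by Data Collector.
SELECT time, user_name, reason
FROM v_internal.dc_login_failures
ORDER BY time DESC
LIMIT 10;
```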

Data Collector works in conjunction with an advisor tool called Workload Analyzer, which intelligently monitors the performance of SQL queries and workloads and recommends tuning actions based on observations of the actual workload history.

By default, Data Collector is on and retains information for all sessions. If performance issues arise, a superuser can disable Data Collector by setting configuration parameter EnableDataCollector to 0.
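
A sketch of how a superuser might toggle the parameter at the database level, following the same ALTER DATABASE form that this reference uses for SyslogEnabled:

```sql
-- Disable Data Collector for the current database.
ALTER DATABASE DEFAULT SET EnableDataCollector = 0;

-- Re-enable it later.
ALTER DATABASE DEFAULT SET EnableDataCollector = 1;
```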

8.1 - CLEAR_DATA_COLLECTOR

Clears all memory and disk records from Data Collector tables and logs, and resets collection statistics in system table DATA_COLLECTOR.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_DATA_COLLECTOR( [ 'component' ] )

Parameters

component

Clears memory and disk records for the specified component. If you provide no argument, the function clears memory and disk records for all components.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Privileges

Superuser

Examples

The following command clears memory and disk records for the ResourceAcquisitions component:

=> SELECT clear_data_collector('ResourceAcquisitions');
 clear_data_collector
----------------------
 CLEAR
(1 row)

The following command clears data collection for all components:

=> SELECT clear_data_collector();
 clear_data_collector
----------------------
 CLEAR
(1 row)

8.2 - DATA_COLLECTOR_HELP

Returns online usage instructions about the Data Collector, the DATA_COLLECTOR system table, and the Data Collector control functions.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DATA_COLLECTOR_HELP()

Privileges

None

Returns

The DATA_COLLECTOR_HELP() function returns the following information:

=> SELECT DATA_COLLECTOR_HELP();

-----------------------------------------------------------------------------
Usage Data Collector
The data collector retains history of important system activities.
   This data can be used as a reference of what actions have been taken
      by users, but it can also be used to locate performance bottlenecks,
      or identify potential improvements to the Vertica configuration.
   This data is queryable via Vertica system tables.
Acccess a list of data collector components, and some statistics, by running:
   SELECT * FROM v_monitor.data_collector;

The amount of data retained by size and time can be controlled with several
functions.
   To just set the size amount:
      set_data_collector_policy(<component>,
                                <memory retention (KB)>,
                                <disk retention (KB)>);

   To set both the size and time amounts (the smaller one will dominate):
      set_data_collector_policy(<component>,
                                <memory retention (KB)>,
                                <disk retention (KB)>,
                                <interval>);

   To set just the time amount:
      set_data_collector_time_policy(<component>,
                                     <interval>);

   To set the time amount for all tables:
      set_data_collector_time_policy(<interval>);

The current retention policy for a component can be queried with:
   get_data_collector_policy(<component>);

Data on disk is kept in the "DataCollector" directory under the Vertica
\catalog path. This directory also contains instructions on how to load
the monitoring data into another Vertica database.

To move the data collector logs and instructions to other storage locations,
create labeled storage locations using add_location and then use:

   set_data_collector_storage_location(<storage_label>);

Additional commands can be used to configure the data collection logs.
The log can be cleared with:
clear_data_collector([<optional component>]);
The log can be synchronized with the disk storage using:
flush_data_collector([<optional component>]);

8.3 - FLUSH_DATA_COLLECTOR

Waits until memory logs are moved to disk and then flushes the Data Collector, synchronizing the log with disk storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

FLUSH_DATA_COLLECTOR( [ 'component' ] )

Parameters

component

Flushes data for the specified component. If you omit this argument, the function flushes data for all components.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Privileges

Superuser

Examples

The following command flushes the Data Collector for the ResourceAcquisitions component:

=> SELECT flush_data_collector('ResourceAcquisitions');
 flush_data_collector
----------------------
 FLUSH
(1 row)

The following command flushes data collection for all components:

=> SELECT flush_data_collector();
 flush_data_collector
----------------------
 FLUSH
(1 row)

8.4 - GET_DATA_COLLECTOR_NOTIFY_POLICY

Lists any notification policies set on a Data Collector component.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_DATA_COLLECTOR_NOTIFY_POLICY('component')

Parameters

component

Name of the Data Collector component to check for notification policies.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Examples

=> SELECT GET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures');
                   GET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------------------------------------------
 Notifiable;  Notifier: vertica_stats; Channel: vertica_notifications
(1 row)

The following example shows the output from the function when there is no notification policy for the component:


=> SELECT GET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures');
 GET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
 Not notifiable;
(1 row)

8.5 - GET_DATA_COLLECTOR_POLICY

Retrieves a brief statement about the retention policy for the specified component.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_DATA_COLLECTOR_POLICY( 'component' )

Parameters

component

Returns the retention policy of the specified component.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Privileges

None

Examples

The following query returns the retention policy for the ResourceAcquisitions component:

=> SELECT get_data_collector_policy('ResourceAcquisitions');
          get_data_collector_policy
----------------------------------------------
 1000KB kept in memory, 10000KB kept on disk.
(1 row)

8.6 - SET_DATA_COLLECTOR_NOTIFY_POLICY

Creates or enables notification policies for a Data Collector component. Notification policies automatically send messages to the specified NOTIFIER when certain events occur.

To view existing notification policies on a Data Collector component, see GET_DATA_COLLECTOR_NOTIFY_POLICY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_DATA_COLLECTOR_NOTIFY_POLICY('component','notifier', 'topic', enabled)

Parameters

component

Name of the component whose change will be reported via the notifier.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

notifier

Name of the notifier that will send the message.

topic

One of the following:

Kafka: The name of the Kafka topic that will receive the notification message.

Note
If the topic doesn't already exist, you can configure your Kafka broker to automatically create the specified topic. For more information, see the Kafka documentation.
Syslog: The subject of the field ProblemDescription.

enabled

Boolean value that specifies whether this policy is enabled. Set to TRUE to enable reporting component changes. Set to FALSE to disable the notifier.

Examples

Kafka notifier

To be notified of failed login attempts, you can create a notifier that sends a notification when the DC component LoginFailures updates. The TLSMODE 'verify-ca' verifies that the server's certificate is signed by a trusted CA.

=> CREATE NOTIFIER vertica_stats ACTION 'kafka://kafka01.example.com:9092' MAXMEMORYSIZE '10M' TLSMODE 'verify-ca';
CREATE NOTIFIER
=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','vertica_stats', 'vertica_notifications', true);
SET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
 SET
(1 row)

The following example shows how to disable the policy created in the previous example:

=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','vertica_stats', 'vertica_notifications', false);
 SET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
 SET
(1 row)


=> SELECT GET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures');
 GET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
 Not notifiable;
(1 row)

Syslog notifier

The following example creates a notifier that writes a message to syslog when the Data Collector (DC) component LoginFailures updates:

Enable syslog notifiers for the current database:

=> ALTER DATABASE DEFAULT SET SyslogEnabled = 1;

Create and enable a syslog notifier v_syslog_notifier:

=> CREATE NOTIFIER v_syslog_notifier ACTION 'syslog'
    ENABLE
    MAXMEMORYSIZE '10M'
    IDENTIFIED BY 'f8b0278a-3282-4e1a-9c86-e0f3f042a971'
    PARAMETERS 'eventSeverity = 5';

Configure the syslog notifier v_syslog_notifier for updates to the LoginFailures DC component with SET_DATA_COLLECTOR_NOTIFY_POLICY:

=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','v_syslog_notifier', 'Login failed!', true);

This notifier writes the following message to syslog (default location: /var/log/messages) when a user fails to authenticate as the user Bob:

Apr 25 16:04:58
vertica_host_01
vertica:
    Event Posted:
        Event Code:21
        Event Id:0
        Event Severity: Notice [5]
        PostedTimestamp: 2022-04-25 16:04:58.083063
        ExpirationTimestamp: 2022-04-25 16:04:58.083063
        EventCodeDescription: Notifier
        ProblemDescription: (Login failed!)
    {
       "_db":"VMart",
       "_schema":"v_internal",
       "_table":"dc_login_failures",
       "_uuid":"f8b0278a-3282-4e1a-9c86-e0f3f042a971",
       "authentication_method":"Reject",
       "client_authentication_name":"default: Reject",
       "client_hostname":"::1",
       "client_label":"",
       "client_os_user_name":"dbadmin",
       "client_pid":523418,
       "client_version":"",
       "database_name":"dbadmin",
       "effective_protocol":"3.8",
       "node_name":"v_vmart_node0001",
       "reason":"REJECT",
       "requested_protocol":"3.8",
       "ssl_client_fingerprint":"",
       "ssl_client_subject":"",
       "time":"2022-04-25 16:04:58.082568-05",
       "user_name":"Bob"
    }#012
    DatabaseName: VMart
    Hostname: vertica_host_01

8.7 - SET_DATA_COLLECTOR_POLICY

Updates the following retention policy properties for the specified component:

MEMORY_BUFFER_SIZE_KB
DISK_SIZE_KB
INTERVAL_TIME

Before you change a retention policy, you can view its current settings by querying system table DATA_COLLECTOR or by calling meta-function GET_DATA_COLLECTOR_POLICY.
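
Either check might look like the following sketch. It assumes that the DATA_COLLECTOR columns carry the same names as the policy properties listed above, plus INTERVAL_SET; DISTINCT collapses the per-node rows.

```sql
-- View the current retention policy properties for one component.
SELECT DISTINCT component,
       memory_buffer_size_kb,
       disk_size_kb,
       interval_set,
       interval_time
FROM data_collector
WHERE component = 'ResourceAcquisitions';

-- Or get a one-line summary of the same policy.
SELECT GET_DATA_COLLECTOR_POLICY('ResourceAcquisitions');
```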

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_DATA_COLLECTOR_POLICY('component', 'memory-buffer-size', 'disk-size' [,'interval-time']  )

Parameters

component

Specifies the retention policy to update.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

memory-buffer-size

Specifies in kilobytes the maximum amount of data that is buffered in memory before moving it to disk. The retention policy property MEMORY_BUFFER_SIZE_KB is set from this value.

Caution

If you set this parameter to 0, the function returns with a warning that the Data Collector cannot retain any data for this component in memory or on disk.

Consider setting this parameter to a high value in the following cases:

Unusually high levels of data collection. If memory-buffer-size is set too low, the Data Collector might be unable to flush buffered data to disk fast enough to keep up with the activity level, which can lead to loss of in-memory data.
Very large data collector records—for example, records with very long query strings. The Data Collector uses double-buffering, so it cannot retain in memory records that are more than 50 percent larger than memory-buffer-size.

disk-size

Specifies in kilobytes the maximum disk space allocated for this component's Data Collector table. The retention policy property DISK_SIZE_KB is set from this value. If set to 0, the Data Collector retains only as much component data as it can buffer in memory, as specified by memory-buffer-size.

interval-time

INTERVAL data type that specifies how long data of a given component is retained in that component's Data Collector table. The retention policy property INTERVAL_TIME is set from this value. If you set this parameter to a positive value, it also changes the policy property INTERVAL_SET to t (true).

For example, if you specify component TupleMoverEvents and set interval-time to an interval of two days ('2 days'::interval), the Data Collector table dc_tuple_mover_events retains records of Tuple Mover activity over the last 48 hours. Older Tuple Mover data are automatically dropped from this table.

Note

Setting a component's policy's INTERVAL_TIME property has no effect on how much data storage the Data Collector retains on disk for that component. Maximum disk storage capacity is determined by the DISK_SIZE_KB property. Setting the INTERVAL_TIME property only affects how long data is retained by the component's Data Collector table. For details, see Configuring data retention policies.

To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:

INTERVAL_SET: f
INTERVAL_TIME: 0

With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.

Privileges

Superuser

Examples

See Configuring data retention policies.
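
As a brief sketch of the syntax above, the following calls set size limits only, and then size limits together with a time limit. The component names are taken from elsewhere in this section; the sizes and the interval are arbitrary illustrative values.

```sql
-- Buffer up to 1500 KB in memory and keep up to 25000 KB on disk.
SELECT SET_DATA_COLLECTOR_POLICY('ResourceAcquisitions', '1500', '25000');

-- Same size limits, and also retain records for two days only.
SELECT SET_DATA_COLLECTOR_POLICY('TupleMoverEvents', '1500', '25000', '2 days'::interval);

-- Confirm the result.
SELECT GET_DATA_COLLECTOR_POLICY('TupleMoverEvents');
```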

8.8 - SET_DATA_COLLECTOR_TIME_POLICY

Updates the retention policy property INTERVAL_TIME for the specified component. Calling this function has no effect on other properties of the same component. You can use this function to update the INTERVAL_TIME property of all component retention policies.

To set other retention policy properties, call SET_DATA_COLLECTOR_POLICY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_DATA_COLLECTOR_TIME_POLICY( ['component',] 'interval-time' )

Parameters

component

Specifies the retention policy to update. If you omit this argument, Vertica updates the retention policy of all Data Collector components.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

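interval-time

INTERVAL data type that specifies how long data of a given component is retained in that component's Data Collector table. The retention policy property INTERVAL_TIME is set from this value. If you set this parameter to a positive value, it also changes the policy property INTERVAL_SET to t (true).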

Note

To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:

INTERVAL_SET: f
INTERVAL_TIME: 0

With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.

Privileges

Superuser

Examples

See Configuring data retention policies.
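
The three usual forms of the call can be sketched as follows. The component name and intervals are illustrative values, and passing '-1' as the negative integer is an assumption about the accepted literal form.

```sql
-- Retain TupleMoverEvents records for two days.
SELECT SET_DATA_COLLECTOR_TIME_POLICY('TupleMoverEvents', '2 days'::interval);

-- Omit the component to apply a one-day limit to all components.
SELECT SET_DATA_COLLECTOR_TIME_POLICY('1 day'::interval);

-- Disable the time limit for one component with a negative value.
SELECT SET_DATA_COLLECTOR_TIME_POLICY('TupleMoverEvents', '-1');
```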

9 - Database Designer functions

Database Designer functions perform the following operations, generally in this order:

Create a design.
Set design properties.
Populate a design.
Create design and deployment scripts.
Get design data.
Clean up.

Important

You can also use meta-function DESIGNER_SINGLE_RUN, which encapsulates all of these steps with a single call. The meta-function iterates over all queries within a specified timespan, and returns with a design ready for deployment.

For detailed information, see Workflow for running Database Designer programmatically. For information on required privileges, see Privileges for running Database Designer functions.

Caution

Before running Database Designer functions on an existing schema, back up the current design by calling EXPORT_CATALOG.

Create a design

DESIGNER_CREATE_DESIGN directs Database Designer to create a design.

Set design properties

The following functions let you specify design properties:

DESIGNER_SET_DESIGN_TYPE	Specifies whether the design is comprehensive or incremental.
DESIGNER_DESIGN_PROJECTION_ENCODINGS	Analyzes encoding in the specified projections and creates a script that implements encoding recommendations.
DESIGNER_SET_DESIGN_KSAFETY	Sets the K-safety value for a comprehensive design.
DESIGNER_SET_OPTIMIZATION_OBJECTIVE	Specifies whether the design optimizes for query or load performance.
DESIGNER_SET_PROPOSE_UNSEGMENTED_PROJECTIONS	Enables inclusion of unsegmented projections in the design.

Populate a design

The following functions let you add tables and queries to your Database Designer design:

DESIGNER_ADD_DESIGN_TABLES	Adds the specified tables to a design.
DESIGNER_ADD_DESIGN_QUERY	Adds queries to the design and weights them.
DESIGNER_ADD_DESIGN_QUERIES	Adds queries to the design and weights them.
DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS	Adds queries to the design and weights them.

Create design and deployment scripts

The following functions populate the Database Designer workspace and create design and deployment scripts. You can also analyze statistics, deploy the design automatically, and drop the workspace after the deployment:

DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY	Populates the design and creates design and deployment scripts.
DESIGNER_WAIT_FOR_DESIGN	Waits for a currently running design to complete.

Reset a design

DESIGNER_RESET_DESIGN discards all the run-specific information of the previous Database Designer build or deployment of the specified design but retains its configuration.

Get design data

The following functions display information about projections and scripts that the Database Designer created:

DESIGNER_OUTPUT_ALL_DESIGN_PROJECTIONS	Sends to standard output DDL statements that define design projections.
DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT	Sends to standard output a design's deployment script.

Clean up

The following functions cancel any running Database Designer operation or drop a Database Designer design and all its contents:

DESIGNER_CANCEL_POPULATE_DESIGN	Cancels population or deployment operation for the specified design if it is currently running.
DESIGNER_DROP_DESIGN	Removes the schema associated with the specified design and all its contents.
DESIGNER_DROP_ALL_DESIGNS	Removes all Database Designer-related schemas associated with the current user.
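
The operations above can be sketched as one session, using only calls documented in this section. The design name, schemas, and file paths follow the VMart examples used elsewhere in this reference and are assumptions about your environment. This sketch writes the scripts without deploying them:

```sql
-- 1. Create a design.
SELECT DESIGNER_CREATE_DESIGN('VMART_DESIGN');

-- 2. Set design properties.
SELECT DESIGNER_SET_DESIGN_TYPE('VMART_DESIGN', 'COMPREHENSIVE');
SELECT DESIGNER_SET_OPTIMIZATION_OBJECTIVE('VMART_DESIGN', 'QUERY');
SELECT DESIGNER_SET_DESIGN_KSAFETY('VMART_DESIGN', 1);

-- 3. Populate the design: tables first, then queries.
SELECT DESIGNER_ADD_DESIGN_TABLES('VMART_DESIGN', 'public.*, online_sales.*, store.*', 'true');
SELECT DESIGNER_ADD_DESIGN_QUERIES('VMART_DESIGN', '/tmp/examples/vmart_queries.sql', 'true');

-- 4. Create design and deployment scripts.
--    Arguments 4-7: analyze-statistics, deploy, drop-design-workspace, continue-after-error.
SELECT DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY(
   'VMART_DESIGN',
   '/tmp/examples/vmart_design_files/design_projections.sql',
   '/tmp/examples/vmart_design_files/design_deploy.sql',
   'false', 'false', 'false', 'false');

-- 5. Get design data.
SELECT DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT('VMART_DESIGN');

-- 6. Clean up.
SELECT DESIGNER_DROP_DESIGN('VMART_DESIGN');
```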

9.1 - DESIGNER_ADD_DESIGN_QUERIES

Reads and evaluates queries from an input file, and adds the queries that it accepts to the specified design. All accepted queries are assigned a weight of 1.

The following requirements apply:

All queried tables must previously be added to the design with DESIGNER_ADD_DESIGN_TABLES.
If the design type is incremental, the Database Designer reads only the first 100 queries in the input file, and ignores all queries beyond that number.

All accepted queries are added to the system table DESIGN_QUERIES.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_ADD_DESIGN_QUERIES ( 'design-name', 'queries-file' [, return-results] )

Parameters

design-name

Name of the target design.

queries-file

Absolute path and name of the file that contains the queries to evaluate, on the local file system of the node where the session is connected, or another file system or object store that Vertica supports.

return-results

Boolean, optionally specifies whether to return results of the add operation to standard output. If set to true, Database Designer returns the following results:

Number of accepted queries
Number of queries referencing non-design tables
Number of unsupported queries
Number of illegal queries

Privileges

Non-superuser: design creator with all privileges required to execute the queries in queries-file.

Errors

Database Designer returns an error in the following cases:

The query contains illegal syntax.
The query references:
- External or system tables only
- Local temporary or other non-design tables
DELETE or UPDATE query has one or more subqueries.
INSERT query does not include a SELECT clause.
Database Designer cannot optimize the query.

Examples

The following example adds queries from vmart_queries.sql to the VMART_DESIGN design. This file contains nine queries. The statement includes a third argument of true, so Database Designer returns results of the add operation:

=> SELECT DESIGNER_ADD_DESIGN_QUERIES ('VMART_DESIGN', '/tmp/examples/vmart_queries.sql', 'true');
...
 DESIGNER_ADD_DESIGN_QUERIES
----------------------------------------------------
 Number of accepted queries                      =9
 Number of queries referencing non-design tables =0
 Number of unsupported queries                   =0
 Number of illegal queries                       =0
(1 row)
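
For orientation, an input file might look like the following sketch. These contents are hypothetical and are not the actual vmart_queries.sql. The table and column names are taken from the VMart examples in this reference, and ending each statement with a semicolon is an assumption that this page does not confirm:

```sql
-- Hypothetical contents of a queries file passed as queries-file.
SELECT customer_name, customer_type
FROM public.customer_dimension
ORDER BY customer_name ASC;

SELECT customer_state, COUNT(*)
FROM public.customer_dimension
GROUP BY customer_state;
```

To confirm which queries were accepted, query the system table DESIGN_QUERIES after the function returns.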

9.2 - DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS

Executes the specified query and evaluates results in the following columns:

QUERY_TEXT (required): Text of potential design queries.
QUERY_WEIGHT (optional): The weight assigned to each query that indicates its importance relative to other queries, a real number >0 and ≤ 1. Database Designer uses this setting when creating the design to prioritize the query. If DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS returns any results that omit this value, Database Designer sets their weight to 1.

After evaluating the queries in QUERY_TEXT, DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS adds all accepted queries to the design. An unlimited number of queries can be added to the design.

Before you add queries to a design, you must add the queried tables with DESIGNER_ADD_DESIGN_TABLES.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS ( 'design-name', 'query' )

Parameters

design-name: Name of the target design.
query: A valid SQL query whose results contain columns named QUERY_TEXT and, optionally, QUERY_WEIGHT.

Privileges

Non-superuser: design creator with all privileges required to execute the specified query, and all queries returned by this function

Errors

Database Designer returns an error in the following cases:

The query contains illegal syntax.
The query references:
- External or system tables only
- Local temporary or other non-design tables
DELETE or UPDATE query has one or more subqueries.
INSERT query does not include a SELECT clause.
Database Designer cannot optimize the query.

Examples

The following example queries the system table QUERY_REQUESTS for all long-running queries (REQUEST_DURATION_MS greater than 1,000,000, that is, more than 1,000 seconds) and adds them to the VMART_DESIGN design. The query returns no information on query weights, so all queries are assigned a weight of 1:

=> SELECT DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS ('VMART_DESIGN',
   'SELECT request as query_text FROM query_requests where request_duration_ms > 1000000 AND request_type =
   ''QUERY'';');
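
A variation that also supplies QUERY_WEIGHT might look like the following sketch. The constant weight of 0.5 is an arbitrary illustration; in practice you would likely derive the weight from your own criteria:

```sql
-- Return both QUERY_TEXT and QUERY_WEIGHT so that Database Designer
-- does not fall back to the default weight of 1.
SELECT DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS ('VMART_DESIGN',
   'SELECT request AS query_text, 0.5 AS query_weight
    FROM query_requests
    WHERE request_type = ''QUERY''');
```

Single quotes inside the query argument are doubled, as in the example above.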

9.3 - DESIGNER_ADD_DESIGN_QUERY

Reads and parses the specified query, and if accepted, adds it to the design. Before you add queries to a design, you must add the queried tables with DESIGNER_ADD_DESIGN_TABLES.

All accepted queries are added to the system table DESIGN_QUERIES.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_ADD_DESIGN_QUERY ( 'design-name', 'design-query' [, query-weight] )

Parameters

design-name: Name of the target design.
design-query: Executable SQL query.
query-weight: Optionally assigns a weight to the query that indicates its importance relative to other queries, a real number >0 and ≤ 1. Database Designer uses this setting to prioritize queries in the design.
If you omit this parameter, Database Designer assigns a weight of 1.

Privileges

Non-superuser: design creator with all privileges required to execute the specified query

Errors

Database Designer returns an error in the following cases:

The query contains illegal syntax.
The query references:
- External or system tables only
- Local temporary or other non-design tables
DELETE or UPDATE query has one or more subqueries.
INSERT query does not include a SELECT clause.
Database Designer cannot optimize the query.

Examples

The following example adds the specified query to the VMART_DESIGN design and assigns that query a weight of 0.5:

=> SELECT DESIGNER_ADD_DESIGN_QUERY (
   'VMART_DESIGN',
   'SELECT customer_name, customer_type FROM customer_dimension ORDER BY customer_name ASC;', 0.5
   );
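
When the design query itself contains string literals, double the single quotes inside the design-query argument. The following sketch also omits query-weight, so the query gets the default weight of 1. The filter value 'MA' is a hypothetical illustration:

```sql
SELECT DESIGNER_ADD_DESIGN_QUERY (
   'VMART_DESIGN',
   'SELECT customer_name FROM customer_dimension WHERE customer_state = ''MA'';'
   );
```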

9.4 - DESIGNER_ADD_DESIGN_TABLES

Adds the specified tables to a design. You must run DESIGNER_ADD_DESIGN_TABLES before adding design queries to the design. If no tables are added to the design, Vertica does not accept design queries.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_ADD_DESIGN_TABLES ( 'design-name', '[ table-spec[,...] ]' [, 'analyze-statistics'] )

Parameters

design-name

Name of the Database Designer design.

table-spec[,...]

One or more comma-delimited arguments that specify which tables to add to the design, where each table-spec argument can specify tables as follows:

[schema.]table
Add table to the design.
schema.*
Add all tables in schema.

If set to an empty string, Vertica adds all tables in the database to which the user has access.

analyze-statistics

Boolean that optionally specifies whether to run ANALYZE_STATISTICS after adding the specified tables to the design, by default set to false.

Accurate statistics help Database Designer optimize compression and query performance. Updating statistics takes time and resources.

Privileges

Non-superuser: design creator with USAGE privilege on the design table schema and owner of the design table

Examples

The following example adds to design VMART_DESIGN all tables from schemas online_sales and store, and analyzes statistics for those tables:

=> SELECT DESIGNER_ADD_DESIGN_TABLES('VMART_DESIGN', 'online_sales.*, store.*','true');
 DESIGNER_ADD_DESIGN_TABLES
----------------------------
                          7
(1 row)
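
The other table-spec forms can be sketched as follows. The two calls are alternatives rather than a sequence, and the table public.customer_dimension is assumed from the VMart examples in this reference:

```sql
-- Add one table by name plus every table in schema store.
-- analyze-statistics is omitted, so it defaults to false.
SELECT DESIGNER_ADD_DESIGN_TABLES('VMART_DESIGN', 'public.customer_dimension, store.*');

-- An empty string adds all tables to which the user has access,
-- here with statistics analysis enabled.
SELECT DESIGNER_ADD_DESIGN_TABLES('VMART_DESIGN', '', 'true');
```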

9.5 - DESIGNER_CANCEL_POPULATE_DESIGN

Cancels population or deployment operation for the specified design if it is currently running. When you cancel a deployment, the Database Designer cancels the projection refresh operation. It does not roll back projections that it already deployed and refreshed.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_CANCEL_POPULATE_DESIGN ( 'design-name' )

Parameters

design-name: Name of the design operation to cancel.

Privileges

Non-superuser: design creator

Examples

The following example cancels a currently running design for VMART_DESIGN and then drops the design:

=> SELECT DESIGNER_CANCEL_POPULATE_DESIGN ('VMART_DESIGN');
=> SELECT DESIGNER_DROP_DESIGN ('VMART_DESIGN', 'true');

9.6 - DESIGNER_CREATE_DESIGN

Creates a design with the specified name.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_CREATE_DESIGN ( 'design-name' )

Parameters

design-name: Name of the design to create, can contain only alphanumeric and underscore (_) characters.
Two users cannot have designs with the same name at the same time.

Privileges

Superuser
DBDUSER with WRITE privileges on storage location of design-name.

Database Designer system views

If any of the following V_MONITOR tables do not already exist from previous designs, DESIGNER_CREATE_DESIGN creates them:

Examples

The following example creates the design VMART_DESIGN:

=> SELECT DESIGNER_CREATE_DESIGN('VMART_DESIGN');
 DESIGNER_CREATE_DESIGN
------------------------
                      0
(1 row)

9.7 - DESIGNER_DESIGN_PROJECTION_ENCODINGS

Analyzes encoding in the specified projections, creates a script to implement encoding recommendations, and optionally deploys the recommendations.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_DESIGN_PROJECTION_ENCODINGS ( '[ proj-spec[,... ] ]', '[destination]' [, 'deploy'] [, 'reanalyze-encodings'] )

Parameters

proj-spec[,...]

One or more comma-delimited projections to add to the design. Each projection can be specified in one of the following ways:

[[schema.]table.]projection
Specifies to analyze projection.
schema.*
Specifies to analyze all projections in the named schema.
[schema.]table
Specifies to analyze all projections of the named table.

If set to an empty string, Vertica analyzes all projections in the database to which the user has access.

For example, the following statement specifies to analyze all projections in schema private, and send the results to the file encodings.sql:

=> SELECT DESIGNER_DESIGN_PROJECTION_ENCODINGS ('mydb.private.*','encodings.sql');

destination

Specifies where to send output, one of the following:

Empty string ('') writes the script to standard output.
Pathname of a SQL output file. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.

deploy

Boolean that specifies whether to deploy encoding changes.

Default: false

reanalyze-encodings

Boolean that specifies whether DESIGNER_DESIGN_PROJECTION_ENCODINGS analyzes encodings in a projection where all columns are already encoded:

false: Analyzes no columns and generates no recommendations if all columns are encoded.
true: Ignores existing encodings and generates recommendations.

Default: false

Privileges

Superuser, or DBDUSER with the following privileges:

OWNER of all projections to analyze
USAGE privilege on the schema for the specified projections

Examples

The following example requests that Database Designer analyze encodings of the table online_sales.call_center_dimension:

The second parameter destination is set to an empty string, so the script is sent to standard output (shown truncated below).
The last two parameters deploy and reanalyze-encodings are omitted, so Database Designer does not execute the script or reanalyze existing encodings:

=> SELECT DESIGNER_DESIGN_PROJECTION_ENCODINGS ('online_sales.call_center_dimension','');

              DESIGNER_DESIGN_PROJECTION_ENCODINGS
----------------------------------------------------------------

CREATE PROJECTION call_center_dimension_DBD_1_seg_EncodingDesign /*+createtype(D)*/
(
 call_center_key ENCODING COMMONDELTA_COMP,
 cc_closed_date,
 cc_open_date,
 cc_name ENCODING ZSTD_HIGH_COMP,
 cc_class ENCODING ZSTD_HIGH_COMP,
 cc_employees,
 cc_hours ENCODING ZSTD_HIGH_COMP,
 cc_manager ENCODING ZSTD_HIGH_COMP,
 cc_address ENCODING ZSTD_HIGH_COMP,
 cc_city ENCODING ZSTD_COMP,
 cc_state ENCODING ZSTD_FAST_COMP,
 cc_region ENCODING ZSTD_HIGH_COMP
)
AS
 SELECT call_center_dimension.call_center_key,
        call_center_dimension.cc_closed_date,
        call_center_dimension.cc_open_date,
        call_center_dimension.cc_name,
        call_center_dimension.cc_class,
        call_center_dimension.cc_employees,
        call_center_dimension.cc_hours,
        call_center_dimension.cc_manager,
        call_center_dimension.cc_address,
        call_center_dimension.cc_city,
        call_center_dimension.cc_state,
        call_center_dimension.cc_region
 FROM online_sales.call_center_dimension
 ORDER BY call_center_dimension.call_center_key
SEGMENTED BY hash(call_center_dimension.call_center_key) ALL NODES KSAFE 1;

select refresh('online_sales.call_center_dimension');

select make_ahm_now();

DROP PROJECTION online_sales.call_center_dimension CASCADE;

ALTER PROJECTION online_sales.call_center_dimension_DBD_1_seg_EncodingDesign RENAME TO call_center_dimension;
(1 row)
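
To write the script to a file instead of standard output, and to reanalyze projections whose columns are already encoded, a call might look like the following sketch. The output path is hypothetical:

```sql
SELECT DESIGNER_DESIGN_PROJECTION_ENCODINGS (
   'online_sales.*',                -- all projections in schema online_sales
   '/tmp/examples/encodings.sql',   -- hypothetical output file; overwritten if it exists
   'false',                         -- deploy: only generate the script
   'true'                           -- reanalyze-encodings: ignore existing encodings
   );
```

Leaving deploy set to false lets you review the generated script before running it yourself.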

9.8 - DESIGNER_DROP_ALL_DESIGNS

Removes all Database Designer-related schemas associated with the current user. Use this function to remove database objects after one or more Database Designer sessions complete execution.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_DROP_ALL_DESIGNS()

Parameters

None.

Privileges

Non-superuser: design creator

Examples

The following example removes all schemas and their contents associated with the current user. DESIGNER_DROP_ALL_DESIGNS returns the number of designs dropped:

=> SELECT DESIGNER_DROP_ALL_DESIGNS();
 DESIGNER_DROP_ALL_DESIGNS
---------------------------
                         2
(1 row)

9.9 - DESIGNER_DROP_DESIGN

Removes the schema associated with the specified design and all its contents. Use DESIGNER_DROP_DESIGN after a Database Designer design or deployment completes successfully. You must also use it to drop a design before creating another one under the same name.

To drop all designs that you created, use DESIGNER_DROP_ALL_DESIGNS.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_DROP_DESIGN ( 'design-name' [, force-drop ] )

Parameters

design-name: Name of the design to drop.
force-drop: Boolean that overrides any dependencies that otherwise prevent Vertica from executing this function—for example, the design is in use or is currently being deployed. If you omit this parameter, Vertica sets it to false.

Privileges

Non-superuser: design creator

Examples

The following example deletes the Database Designer design VMART_DESIGN and all its contents:

=> SELECT DESIGNER_DROP_DESIGN ('VMART_DESIGN');

9.10 - DESIGNER_OUTPUT_ALL_DESIGN_PROJECTIONS

Displays the DDL statements that define the design projections to standard output.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_OUTPUT_ALL_DESIGN_PROJECTIONS ( 'design-name' )

Parameters

design-name: Name of the target design.

Privileges

Superuser or DBDUSER

Examples

The following example returns the design projection DDL statements for vmart_design:

=> SELECT DESIGNER_OUTPUT_ALL_DESIGN_PROJECTIONS('vmart_design');
CREATE PROJECTION customer_dimension_DBD_1_rep_VMART_DESIGN /*+createtype(D)*/
(
 customer_key ENCODING DELTAVAL,
 customer_type ENCODING AUTO,
 customer_name ENCODING AUTO,
 customer_gender ENCODING REL,
 title ENCODING AUTO,
 household_id ENCODING DELTAVAL,
 customer_address ENCODING AUTO,
 customer_city ENCODING AUTO,
 customer_state ENCODING AUTO,
 customer_region ENCODING AUTO,
 marital_status ENCODING AUTO,
 customer_age ENCODING DELTAVAL,
 number_of_children ENCODING BLOCKDICT_COMP,
 annual_income ENCODING DELTARANGE_COMP,
 occupation ENCODING AUTO,
 largest_bill_amount ENCODING DELTAVAL,
 store_membership_card ENCODING BLOCKDICT_COMP,
 customer_since ENCODING DELTAVAL,
 deal_stage ENCODING AUTO,
 deal_size ENCODING DELTARANGE_COMP,
 last_deal_update ENCODING DELTARANGE_COMP
)
AS
 SELECT customer_key,
        customer_type,
        customer_name,
        customer_gender,
        title,
        household_id,
        customer_address,
        customer_city,
        customer_state,
        customer_region,
        marital_status,
        customer_age,
        number_of_children,
        annual_income,
        occupation,
        largest_bill_amount,
        store_membership_card,
        customer_since,
        deal_stage,
        deal_size,
        last_deal_update
 FROM public.customer_dimension
 ORDER BY customer_gender,
          annual_income
UNSEGMENTED ALL NODES;
CREATE PROJECTION product_dimension_DBD_2_rep_VMART_DESIGN /*+createtype(D)*/
(
...

9.11 - DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT

Displays the deployment script for the specified design to standard output. If the design is already deployed, Vertica ignores this function.

To output only the CREATE PROJECTION commands in a design script, use DESIGNER_OUTPUT_ALL_DESIGN_PROJECTIONS.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT ( 'design-name' )

Parameters

design-name: Name of the target design.

Privileges

Non-superuser: design creator

Examples

The following example displays the deployment script for VMART_DESIGN:

=> SELECT DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT('VMART_DESIGN');
CREATE PROJECTION customer_dimension_DBD_1_rep_VMART_DESIGN /*+createtype(D)*/
...
CREATE PROJECTION product_dimension_DBD_2_rep_VMART_DESIGN /*+createtype(D)*/
...
select refresh('public.customer_dimension,
                public.product_dimension,
                public.promotion_dimension,
                public.date_dimension');
select make_ahm_now();
DROP PROJECTION public.customer_dimension_super CASCADE;
DROP PROJECTION public.product_dimension_super CASCADE;
...

9.12 - DESIGNER_RESET_DESIGN

Discards all run-specific information of the previous Database Designer build or deployment of the specified design but keeps its configuration. You can then change the design as needed before running it again, for example by changing parameters or adding tables or queries.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_RESET_DESIGN ( 'design-name' )

Parameters

design-name: Name of the design to reset.

Privileges

Non-superuser: design creator

Examples

The following example resets the Database Designer design VMART_DESIGN:

=> SELECT DESIGNER_RESET_DESIGN ('VMART_DESIGN');

9.13 - DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY

Populates the design and creates the design and deployment scripts. DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY can also analyze statistics, deploy the design, and drop the workspace after the deployment.

This function writes its output files with permissions 666 (rw-rw-rw-), which lets any Linux user on the node read or write them. Keep the files in a secure directory.

Caution

DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY does not create a backup copy of the current design before deploying the new design. Before running this function, back up the existing schema design with EXPORT_CATALOG.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax


DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY (
    'design-name',
    'output-design-file',
    'output-deployment-file'
    [ , 'analyze-statistics']
    [ , 'deploy']
    [ , 'drop-design-workspace']
    [ , 'continue-after-error']
    )

Parameters

design-name: Name of the design to populate and deploy.
output-design-file: Absolute path and name of the file to contain DDL statements that create design projections, on the local file system of the node where the session is connected, or another file system or object store that Vertica supports.
output-deployment-file: Absolute path and name of the file to contain the deployment script, on the local file system of the node where the session is connected, or another file system or object store that Vertica supports.
analyze-statistics: Specifies whether to collect or refresh statistics for the tables before populating the design. If set to true, Vertica invokes ANALYZE_STATISTICS. Accurate statistics help Database Designer optimize compression and query performance. However, updating statistics requires time and resources.
Default: false
deploy: Specifies whether to deploy the Database Designer design using the deployment script created by this function.
Default: true
drop-design-workspace: Specifies whether to drop the design workspace after the design is deployed.
Default: true
continue-after-error: Specifies whether DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY continues to run after an error occurs. By default, an error causes this function to terminate.
Default: false

Privileges

Non-superuser: design creator with WRITE privileges on storage locations of design and deployment scripts

Requirements

Before calling this function, you must:

Create a design, a logical schema with tables.
Associate tables with the design.
Load queries to the design.
Set design properties (K-safety level, mode, and policy).

Examples

The following example creates projections for and deploys the VMART_DESIGN design, and analyzes statistics about the design tables.

=> SELECT DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY (
   'VMART_DESIGN',
   '/tmp/examples/vmart_design_files/design_projections.sql',
   '/tmp/examples/vmart_design_files/design_deploy.sql',
   'true',
   'true',
   'false',
   'false'
   );
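
To generate the scripts for review without deploying them, you might set deploy and drop-design-workspace to false, as in the following sketch. It reuses the file paths from the example above. Keeping the workspace means the design remains available to DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT afterward:

```sql
SELECT DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY (
   'VMART_DESIGN',
   '/tmp/examples/vmart_design_files/design_projections.sql',
   '/tmp/examples/vmart_design_files/design_deploy.sql',
   'false',   -- analyze-statistics
   'false',   -- deploy: write the scripts only
   'false',   -- drop-design-workspace: keep the design for inspection
   'false'    -- continue-after-error
   );

-- Review the generated deployment script.
SELECT DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT('VMART_DESIGN');
```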

9.14 - DESIGNER_SET_DESIGN_KSAFETY

Sets K-safety for a comprehensive design and stores the K-safety value in the DESIGNS table. Database Designer ignores this function for incremental designs.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_SET_DESIGN_KSAFETY ( 'design-name' [, k-level ] )

Parameters

design-name

Name of the design for which you want to set the K-safety value, type VARCHAR.

k-level

An integer between 0 and 2 that specifies the level of K-safety for the target design. This value must be compatible with the number of nodes in the database cluster:

k-level = 0: ≥ 1 nodes
k-level = 1: ≥ 3 nodes
k-level = 2: ≥ 5 nodes

If you omit this parameter, Vertica sets K-safety for this design to 0 or 1, according to the number of nodes: 1 if the cluster contains ≥ 3 nodes, otherwise 0.

If you are a DBADMIN user and k-level differs from system K-safety, Vertica changes system K-safety as follows:

If k-level is less than system K-safety, Vertica changes system K-safety to the lower level after the design is deployed.
If k-level is greater than system K-safety and is valid for the database cluster, Vertica creates the required number of buddy projections for the tables in this design. If the design applies to all database tables, or all tables in the database have the required number of buddy projections, Database Designer changes system K-safety to k-level.

If the design excludes some database tables and the number of their buddy projections is less than k-level, Database Designer leaves system K-safety unchanged. Instead, it returns a warning and indicates which tables need new buddy projections in order to adjust system K-safety.

If you are a DBDUSER, Vertica ignores this parameter.
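
Because k-level must be compatible with the cluster size, it can help to check the node count and the current system K-safety first. The following sketch assumes the system table SYSTEM exposes the columns NODE_COUNT, DESIGNED_FAULT_TOLERANCE, and CURRENT_FAULT_TOLERANCE:

```sql
-- Check cluster size and system K-safety before choosing k-level.
SELECT node_count, designed_fault_tolerance, current_fault_tolerance
FROM system;

-- With at least 3 nodes, a k-level of 1 is valid.
SELECT DESIGNER_SET_DESIGN_KSAFETY('VMART_DESIGN', 1);
```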

Privileges

Non-superuser: design creator

Examples

The following example sets K-safety for the VMART_DESIGN design to 1:

=> SELECT DESIGNER_SET_DESIGN_KSAFETY('VMART_DESIGN', 1);

9.15 - DESIGNER_SET_DESIGN_TYPE

Specifies whether Database Designer creates a comprehensive or incremental design. DESIGNER_SET_DESIGN_TYPE stores the design mode in the DESIGNS table.

Important

If you do not explicitly set a design mode with this function, Database Designer creates a comprehensive design.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_SET_DESIGN_TYPE ( 'design-name', 'mode' )

Parameters

design-name

Name of the target design.

mode

Name of the mode that Database Designer should use when designing the database, one of the following:

COMPREHENSIVE: Creates an initial or replacement design for all tables in the specified schemas. You typically create a comprehensive design for a new database.
INCREMENTAL: Modifies an existing design with additional projections that are optimized for new or modified queries.

Note
Incremental designs always inherit the K-safety value of the database.

For more information, see Design types.

Privileges

Non-superuser: design creator

Examples

The following examples show the two design mode options for the VMART_DESIGN design:

=> SELECT DESIGNER_SET_DESIGN_TYPE(
    'VMART_DESIGN',
    'COMPREHENSIVE');
 DESIGNER_SET_DESIGN_TYPE
--------------------------
                        0
(1 row)
=> SELECT DESIGNER_SET_DESIGN_TYPE(
    'VMART_DESIGN',
    'INCREMENTAL');
 DESIGNER_SET_DESIGN_TYPE
--------------------------
                        0
(1 row)

9.16 - DESIGNER_SET_OPTIMIZATION_OBJECTIVE

Valid only for comprehensive database designs, specifies the optimization objective Database Designer uses. Database Designer ignores this function for incremental designs.

DESIGNER_SET_OPTIMIZATION_OBJECTIVE stores the optimization objective in the DESIGNS table.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_SET_OPTIMIZATION_OBJECTIVE ( 'design-name', 'policy' )

Parameters

design-name

Name of the target design.

policy

Specifies the design's optimization policy, one of the following:

QUERY: Optimize for query performance. This can result in a larger database storage footprint because additional projections might be created.
LOAD: Optimize for load performance so database size is minimized. This can result in slower query performance.
BALANCED: Balance the design between query performance and database size.

Privileges

Non-superuser: design creator

Examples

The following example sets the optimization objective for the VMART_DESIGN design to QUERY:


=> SELECT DESIGNER_SET_OPTIMIZATION_OBJECTIVE(  'VMART_DESIGN', 'QUERY');
 DESIGNER_SET_OPTIMIZATION_OBJECTIVE
------------------------------------
                                  0
(1 row)

9.17 - DESIGNER_SET_PROPOSE_UNSEGMENTED_PROJECTIONS

Specifies whether a design can include unsegmented projections. Vertica ignores this function on a one-node cluster, where all projections must be unsegmented.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_SET_PROPOSE_UNSEGMENTED_PROJECTIONS ( 'design-name', unsegmented )

Parameters

design-name: Name of the target design.
unsegmented: Boolean that specifies whether Database Designer can propose unsegmented projections for tables in this design. When you create a design, the propose_unsegmented_projections value in system table DESIGNS for this design is set to true. If DESIGNER_SET_PROPOSE_UNSEGMENTED_PROJECTIONS sets this value to false, Database Designer proposes only segmented projections.

Privileges

Non-superuser: design creator

Examples

The following example specifies that Database Designer can propose only segmented projections for tables in the design VMART_DESIGN:

=> SELECT DESIGNER_SET_PROPOSE_UNSEGMENTED_PROJECTIONS('VMART_DESIGN', false);

9.18 - DESIGNER_SINGLE_RUN

Evaluates all queries that completed execution within the specified timespan, and returns with a design that is ready for deployment. This design includes projections that are recommended for optimizing the evaluated queries. Unless you redirect output, DESIGNER_SINGLE_RUN returns the design to stdout.

Tip

Before running DESIGNER_SINGLE_RUN, collect statistics on the queried data by calling ANALYZE_STATISTICS and ANALYZE_STATISTICS_PARTITION.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_SINGLE_RUN ('interval')

Parameters

interval: Specifies an interval of time that precedes the meta-function call. Database Designer evaluates all queries that ran to completion over the specified interval.

Privileges

Superuser or DBDUSER

Examples

-----------------------------------------------------------------------
-- SSBM dataset test
-----------------------------------------------------------------------
-- create ssbm schema
\! $TARGET/bin/vsql -f 'sql/SSBM/SSBM_schema.sql' > /dev/null 2>&1
\! $TARGET/bin/vsql -f 'sql/SSBM/SSBM_constraints.sql' > /dev/null 2>&1
\! $TARGET/bin/vsql -f 'sql/SSBM/SSBM_funcdeps.sql' > /dev/null 2>&1

-- run these queries
\! $TARGET/bin/vsql -f 'sql/SSBM/SSBM_queries.sql' > /dev/null 2>&1
-- Run single API
select designer_single_run('1 minute');

...
 designer_single_run
---------------------
 CREATE PROJECTION public.part_DBD_1_rep_SingleDesign /*+createtype(D)*/
(
 p_partkey ENCODING AUTO,
 p_name ENCODING AUTO,
 p_mfgr ENCODING AUTO,
 p_category ENCODING AUTO,
 p_brand1 ENCODING AUTO,
 p_color ENCODING AUTO,
 p_type ENCODING AUTO,
 p_size ENCODING AUTO,
 p_container ENCODING AUTO
)
AS
 SELECT p_partkey,
        p_name,
        p_mfgr,
        p_category,
        p_brand1,
        p_color,
        p_type,
        p_size,
        p_container
 FROM public.part
 ORDER BY p_partkey
UNSEGMENTED ALL NODES;

CREATE PROJECTION public.supplier_DBD_2_rep_SingleDesign /*+createtype(D)*/
(
 s_suppkey ENCODING AUTO,
 s_name ENCODING AUTO,
 s_address ENCODING AUTO,
 s_city ENCODING AUTO,
 s_nation ENCODING AUTO,
 s_region ENCODING AUTO,
 s_phone ENCODING AUTO
)
AS
 SELECT s_suppkey,
        s_name,
        s_address,
        s_city,
        s_nation,
        s_region,
        s_phone
 FROM public.supplier
 ORDER BY s_suppkey
UNSEGMENTED ALL NODES;

CREATE PROJECTION public.customer_DBD_3_rep_SingleDesign /*+createtype(D)*/
(
 c_custkey ENCODING AUTO,
 c_name ENCODING AUTO,
 c_address ENCODING AUTO,
 c_city ENCODING AUTO,
 c_nation ENCODING AUTO,
 c_region ENCODING AUTO,
 c_phone ENCODING AUTO,
 c_mktsegment ENCODING AUTO
)
AS
 SELECT c_custkey,
        c_name,
        c_address,
        c_city,
        c_nation,
        c_region,
        c_phone,
        c_mktsegment
 FROM public.customer
 ORDER BY c_custkey
UNSEGMENTED ALL NODES;

CREATE PROJECTION public.dwdate_DBD_4_rep_SingleDesign /*+createtype(D)*/
(
 d_datekey ENCODING AUTO,
 d_date ENCODING AUTO,
 d_dayofweek ENCODING AUTO,
 d_month ENCODING AUTO,
 d_year ENCODING AUTO,
 d_yearmonthnum ENCODING AUTO,
 d_yearmonth ENCODING AUTO,
 d_daynuminweek ENCODING AUTO,
 d_daynuminmonth ENCODING AUTO,
 d_daynuminyear ENCODING AUTO,
 d_monthnuminyear ENCODING AUTO,
 d_weeknuminyear ENCODING AUTO,
 d_sellingseason ENCODING AUTO,
 d_lastdayinweekfl ENCODING AUTO,
 d_lastdayinmonthfl ENCODING AUTO,
 d_holidayfl ENCODING AUTO,
 d_weekdayfl ENCODING AUTO
)
AS
 SELECT d_datekey,
        d_date,
        d_dayofweek,
        d_month,
        d_year,
        d_yearmonthnum,
        d_yearmonth,
        d_daynuminweek,
        d_daynuminmonth,
        d_daynuminyear,
        d_monthnuminyear,
        d_weeknuminyear,
        d_sellingseason,
        d_lastdayinweekfl,
        d_lastdayinmonthfl,
        d_holidayfl,
        d_weekdayfl
 FROM public.dwdate
 ORDER BY d_datekey
UNSEGMENTED ALL NODES;

CREATE PROJECTION public.lineorder_DBD_5_rep_SingleDesign /*+createtype(D)*/
(
 lo_orderkey ENCODING AUTO,
 lo_linenumber ENCODING AUTO,
 lo_custkey ENCODING AUTO,
 lo_partkey ENCODING AUTO,
 lo_suppkey ENCODING AUTO,
 lo_orderdate ENCODING AUTO,
 lo_orderpriority ENCODING AUTO,
 lo_shippriority ENCODING AUTO,
 lo_quantity ENCODING AUTO,
 lo_extendedprice ENCODING AUTO,
 lo_ordertotalprice ENCODING AUTO,
 lo_discount ENCODING AUTO,
 lo_revenue ENCODING AUTO,
 lo_supplycost ENCODING AUTO,
 lo_tax ENCODING AUTO,
 lo_commitdate ENCODING AUTO,
 lo_shipmode ENCODING AUTO
)
AS
 SELECT lo_orderkey,
        lo_linenumber,
        lo_custkey,
        lo_partkey,
        lo_suppkey,
        lo_orderdate,
        lo_orderpriority,
        lo_shippriority,
        lo_quantity,
        lo_extendedprice,
        lo_ordertotalprice,
        lo_discount,
        lo_revenue,
        lo_supplycost,
        lo_tax,
        lo_commitdate,
        lo_shipmode
 FROM public.lineorder
 ORDER BY lo_suppkey
UNSEGMENTED ALL NODES;

(1 row)

9.19 - DESIGNER_WAIT_FOR_DESIGN

Waits for completion of operations that are populating and deploying the design. Ctrl+C cancels this operation and returns control to the user.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DESIGNER_WAIT_FOR_DESIGN ( 'design-name' )

Parameters

design-name: Name of the running design.

Privileges

Superuser, or DBDUSER with USAGE privilege on the design schema

Examples

The following example waits for the currently running design VMART_DESIGN to complete:

=> SELECT DESIGNER_WAIT_FOR_DESIGN ('VMART_DESIGN');

10 - Database management functions

This section contains the database management functions specific to Vertica.

10.1 - CLEAR_RESOURCE_REJECTIONS

Clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables. Normally, these tables are cleared only during a node restart. This function lets you clear them at any time. For example, you might clear the system tables after you resolve a disk space issue that was causing disk resource rejections.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Immutable

Syntax

CLEAR_RESOURCE_REJECTIONS();

Privileges

Superuser

Examples

The following command clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables:

=> SELECT clear_resource_rejections();
 clear_resource_rejections
---------------------------
 OK
(1 row)

10.2 - COMPACT_STORAGE

Bundles existing data (.fdb) and index (.pidx) files into the .gt file format. The .gt format is enabled by default for data files created in version 7.2 or later. If you upgrade a database from an earlier version, use COMPACT_STORAGE to bundle storage files into the .gt format. Your database can continue to operate with a mix of file storage formats.

If the settings you specify for COMPACT_STORAGE vary from the limit specified in configuration parameter MaxBundleableROSSizeKB, Vertica does not change the size of the automatically created bundles.

Note

Run this function during periods of low demand.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SELECT COMPACT_STORAGE ('[[[database.]schema.]object-name]', min-ros-filesize-kb, 'small-or-all-files', 'simulate');

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

object-name

Specifies the table or projection to bundle. If set to an empty string, COMPACT_STORAGE evaluates the data of all projections in the database for bundling.

min-ros-filesize-kb

Integer ≥ 1, specifies in kilobytes the minimum size of an independent ROS file. COMPACT_STORAGE bundles storage container ROS files below this size into a single file.

small-or-all-files

One of the following:

small: Bundles only files smaller than the limit specified in min-ros-filesize-kb.
all: Bundles files smaller than the limit specified in min-ros-filesize-kb and bundles the .fdb and .pidx files for larger storage containers.

simulate

Specifies whether to simulate the storage settings and produce a report describing the impact of those settings.

true: Produces a report on the impact of the specified bundle settings without actually bundling storage files.
false: Performs the bundling as specified.

Privileges

Storage and performance impact

Bundling reduces the number of files in your file system by at least fifty percent and improves the performance of file-intensive operations. Improved operations include backups, restores, and mergeout.

Vertica creates small files for the following reasons:

Tables contain hundreds of columns.
Partition ranges are small (partition by minute).
Local segmentation is enabled and your factor is set to a high value.

Examples

The following example describes the impact of bundling the table EMPLOYEES:

=> SELECT COMPACT_STORAGE('employees', 1024,'small','true');
Task: compact_storage

On node v_vmart_node0001:
Projection Name :public.employees_b0 | selected_storage_containers :0 |
selected_files_to_compact :0 | files_after_compact : 0 | modified_storage_KB :0

On node v_vmart_node0002:
Projection Name :public.employees_b0 | selected_storage_containers :1 |
selected_files_to_compact :6 | files_after_compact : 1 | modified_storage_KB :0

On node v_vmart_node0003:
Projection Name :public.employees_b0 | selected_storage_containers :2 |
selected_files_to_compact :12 | files_after_compact : 2 | modified_storage_KB :0

On node v_vmart_node0001:
Projection Name :public.employees_b1 | selected_storage_containers :2 |
selected_files_to_compact :12 | files_after_compact : 2 | modified_storage_KB :0

On node v_vmart_node0002:
Projection Name :public.employees_b1 | selected_storage_containers :0 |
selected_files_to_compact :0 | files_after_compact : 0 | modified_storage_KB :0

On node v_vmart_node0003:
Projection Name :public.employees_b1 | selected_storage_containers :1 |
selected_files_to_compact :6 | files_after_compact : 1 | modified_storage_KB :0

Success

(1 row)
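
The simulation report above can be tallied with a short script. The following Python sketch is not part of Vertica; it assumes the report text has been captured into a string (the name REPORT and the function summarize are hypothetical) and that the field labels match the sample output shown above.

```python
import re

# Report text copied from the simulated COMPACT_STORAGE run above.
REPORT = """\
On node v_vmart_node0001:
Projection Name :public.employees_b0 | selected_storage_containers :0 |
selected_files_to_compact :0 | files_after_compact : 0 | modified_storage_KB :0

On node v_vmart_node0002:
Projection Name :public.employees_b0 | selected_storage_containers :1 |
selected_files_to_compact :6 | files_after_compact : 1 | modified_storage_KB :0

On node v_vmart_node0003:
Projection Name :public.employees_b0 | selected_storage_containers :2 |
selected_files_to_compact :12 | files_after_compact : 2 | modified_storage_KB :0

On node v_vmart_node0001:
Projection Name :public.employees_b1 | selected_storage_containers :2 |
selected_files_to_compact :12 | files_after_compact : 2 | modified_storage_KB :0

On node v_vmart_node0002:
Projection Name :public.employees_b1 | selected_storage_containers :0 |
selected_files_to_compact :0 | files_after_compact : 0 | modified_storage_KB :0

On node v_vmart_node0003:
Projection Name :public.employees_b1 | selected_storage_containers :1 |
selected_files_to_compact :6 | files_after_compact : 1 | modified_storage_KB :0
"""


def summarize(report):
    """Return (files selected for bundling, files remaining after bundling)."""
    selected = re.findall(r"selected_files_to_compact\s*:\s*(\d+)", report)
    remaining = re.findall(r"files_after_compact\s*:\s*(\d+)", report)
    return sum(int(n) for n in selected), sum(int(n) for n in remaining)


before, after = summarize(REPORT)
print(f"{before} files would be bundled into {after}")
```

For the sample report, the script counts 36 selected files that would be reduced to 6 across all nodes and projections.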

10.3 - CURRENT_SCHEMA

Returns the name of the current schema.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CURRENT_SCHEMA()

Note

You can call this function without parentheses.

Privileges

None

Examples

The following command returns the name of the current schema:

=> SELECT CURRENT_SCHEMA();
 current_schema
----------------
 public
(1 row)

The following command returns the same results without the parentheses:

=> SELECT CURRENT_SCHEMA;
 current_schema
----------------
 public
(1 row)

The following command shows the current schema, listed after the current user, in the search path:

=> SHOW SEARCH_PATH;
    name     |                      setting
-------------+---------------------------------------------------
 search_path | "$user", public, v_catalog, v_monitor, v_internal
(1 row)

10.4 - DUMP_LOCKTABLE

Returns information about deadlocked clients and the resources they are waiting for.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DUMP_LOCKTABLE()

Privileges

None

Notes

Use DUMP_LOCKTABLE if Vertica becomes unresponsive:

Open an additional vsql connection.
Execute the query:
```
=> SELECT DUMP_LOCKTABLE();
```
The output is written to vsql. See Monitoring the Log Files.

You can also see who is connected using the following command:

=> SELECT * FROM SESSIONS;

Close all sessions using the following command:

=> SELECT CLOSE_ALL_SESSIONS();

Close a single session using the following command:

=> SELECT CLOSE_SESSION('session_id');

You get the session_id value from the V_MONITOR.SESSIONS system table.

10.5 - DUMP_PARTITION_KEYS

Dumps the partition keys of all projections in the system.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DUMP_PARTITION_KEYS( )

Note

The ROS objects of partitioned tables without partition keys are ignored by the tuple mover and are not merged during automatic tuple mover operations.

Privileges

User must have SELECT privileges on the table or USAGE privileges on the schema.

Examples

=> SELECT DUMP_PARTITION_KEYS( );
Partition keys on node v_vmart_node0001
  Projection 'states_b0'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: NH
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: MA
  Projection 'states_b1'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: VT
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: ME
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: CT
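
To review the dump per projection, the output can be grouped with a small script. This Python sketch is illustrative only: the function name partition_keys_by_projection is hypothetical, it relies on the line layout of the sample output above, it merges results from all nodes, and it assumes that multiple keys in one container would be separated by whitespace, which the sample does not confirm.

```python
import re
from collections import defaultdict

SAMPLE = """\
Partition keys on node v_vmart_node0001
  Projection 'states_b0'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: NH
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: MA
  Projection 'states_b1'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: VT
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: ME
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: CT
"""


def partition_keys_by_projection(dump):
    """Map each projection name to the partition keys listed under it."""
    keys = defaultdict(list)
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        projection = re.match(r"Projection '(.+)'$", line)
        if projection:
            current = projection.group(1)
            continue
        # "No of partition keys: ..." does not match because match() anchors
        # at the start of the line.
        listed = re.match(r"Partition keys:\s*(.*)$", line)
        if listed and current is not None:
            # Assumption: keys on one line are whitespace-separated.
            keys[current].extend(listed.group(1).split())
    return dict(keys)


for name, found in partition_keys_by_projection(SAMPLE).items():
    print(name, found)
```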

10.6 - GET_CONFIG_PARAMETER

Gets the value of a configuration parameter at the specified level. If no value is set at that level, the function returns an empty row.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_CONFIG_PARAMETER( 'parameter-name' [, 'level' | NULL] )

Parameters

parameter-name

Name of the configuration parameter value to get.

level

Level at which to get parameter-name's setting, one of the following string values:

user: Current user
session: Current session
node-name: Name of database node

If level is omitted or set to NULL, GET_CONFIG_PARAMETER returns the database setting.

Privileges

None

Examples

Get the AnalyzeRowCountInterval parameter at the database level:

=> SELECT GET_CONFIG_PARAMETER ('AnalyzeRowCountInterval');
 GET_CONFIG_PARAMETER
----------------------
 3600

Get the MaxSessionUDParameterSize parameter at the session level:

=> SELECT GET_CONFIG_PARAMETER ('MaxSessionUDParameterSize','session');
 GET_CONFIG_PARAMETER
----------------------
 2000
(1 row)

Get the UseDepotForReads parameter at the user level:

=> SELECT GET_CONFIG_PARAMETER ('UseDepotForReads', 'user');
 GET_CONFIG_PARAMETER
----------------------
 1
(1 row)

10.7 - KERBEROS_CONFIG_CHECK

Tests the Kerberos configuration of a Vertica cluster. The function succeeds if it can kinit with both the keytab file and the current user's credential, and reports errors otherwise.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

KERBEROS_CONFIG_CHECK( )

Parameters

This function has no parameters.

Privileges

This function does not require privileges.

Examples

The following example shows the results when the Kerberos configuration is valid.

=> SELECT KERBEROS_CONFIG_CHECK();
    kerberos_config_check
-----------------------------------------------------------------------------
 ok: krb5 exists at [/etc/krb5.conf]
 ok: Vertica Keytab file is set to [/etc/vertica.keytab]
 ok: Vertica Keytab file exists at [/etc/vertica.keytab]
[INFO] KerberosCredentialCache [/tmp/vertica_D4/vertica450676899262134963.cc]
 Kerberos configuration parameters set in the database
        KerberosServiceName : [vertica]
        KerberosHostname : [data.hadoop.com]
        KerberosRealm : [EXAMPLE.COM]
        KerberosKeytabFile : [/etc/vertica.keytab]
 Vertica Principal: [vertica/data.hadoop.com@EXAMPLE.COM]
 [OK] Vertica can kinit using keytab file
 [OK] User [bob] has valid client authentication for kerberos principal [bob@EXAMPLE.COM]]

(1 row)

10.8 - MEMORY_TRIM

Calls glibc function malloc_trim() to reclaim free memory from malloc and return it to the operating system. Details on the trim operation are written to system table MEMORY_EVENTS.

Unless you turn off memory polling, Vertica automatically detects when glibc accumulates an excessive amount of free memory in its allocation arena. When this occurs, Vertica consolidates much of this memory and returns it to the operating system. Call this function if you disable memory polling and wish to reduce glibc-allocated memory manually.

For more information, see Memory trimming.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

MEMORY_TRIM()

Privileges

Superuser

Examples

=> SELECT memory_trim();
                           memory_trim
-----------------------------------------------------------------
 Pre-RSS: [378822656] Post-RSS: [372129792] Benefit: [0.0176675]
(1 row)
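
The Benefit value in the sample output is consistent with the fraction of resident memory released, that is, (Pre-RSS - Post-RSS) / Pre-RSS. That reading is inferred from the numbers above rather than stated by Vertica, so treat the following Python sketch (with the hypothetical helper trim_benefit) as a way to check the arithmetic, not as a definition.

```python
import re


def trim_benefit(result):
    """Compute (pre - post) / pre from a memory_trim result string."""
    match = re.search(r"Pre-RSS: \[(\d+)\] Post-RSS: \[(\d+)\]", result)
    if match is None:
        raise ValueError("unrecognized memory_trim output")
    pre, post = int(match.group(1)), int(match.group(2))
    return (pre - post) / pre


sample = "Pre-RSS: [378822656] Post-RSS: [372129792] Benefit: [0.0176675]"
print(round(trim_benefit(sample), 7))
```

For the sample row, roughly 6.4 MB of 361 MB was returned to the operating system, a benefit of about 1.8 percent.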

10.9 - PURGE

Permanently removes delete vectors from ROS storage containers so disk space can be reused. PURGE removes all historical data up to and including the Ancient History Mark epoch.

PURGE does not delete temporary tables.

Caution

PURGE can temporarily use significant disk space.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SELECT PURGE()

Privileges

Table owner
USAGE privilege on schema

Examples

After you delete data from a Vertica table, that data is marked for deletion. To see the data that is marked for deletion, query system table DELETE_VECTORS.

Run PURGE to remove the delete vectors from ROS containers.

=> SELECT * FROM test1;
 number
--------
      3
     12
     33
     87
     43
     99
(6 rows)
=> DELETE FROM test1 WHERE number > 50;
 OUTPUT
--------
      2
(1 row)
=> SELECT * FROM test1;
 number
--------
     43
      3
     12
     33
(4 rows)
=> SELECT node_name, projection_name, deleted_row_count FROM DELETE_VECTORS;
    node_name     | projection_name | deleted_row_count
------------------+-----------------+-------------------
 v_vmart_node0002 | test1_b1        |                 1
 v_vmart_node0001 | test1_b1        |                 1
 v_vmart_node0001 | test1_b0        |                 1
 v_vmart_node0003 | test1_b0        |                 1
(4 rows)
=> SELECT PURGE();
...
(Table: public.test1) (Projection: public.test1_b0)
(Table: public.test1) (Projection: public.test1_b1)
...
(4 rows)

After the ancient history mark (AHM) advances:

=> SELECT * FROM DELETE_VECTORS;
 (No rows)

10.10 - RUN_INDEX_TOOL

Runs the Index tool on a Vertica database to perform one of these tasks:

Run a per-block cyclic redundancy check (CRC) on data storage to verify data integrity.
Check that the sort order in ROS containers is correct.

The function writes summary information about its operation to standard output; detailed information on results is logged in vertica.log on the current node.

You can also run the Index tool on a database that is down, from the Linux command line. For details, see CRC and sort order check.

Caution

Use this function only under guidance from Vertica Support.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RUN_INDEX_TOOL ( 'taskType', global, '[projFilter]' [, numThreads ] );

Parameters

taskType

Specifies the operation to run, one of the following:

checkcrc: Run a cyclic redundancy check (CRC) on each block of existing data storage to check the data integrity of ROS data blocks.
checksort: Evaluate each ROS row to determine whether it is sorted correctly. If ROS data is not sorted correctly in the projection's order, query results that rely on sorted data will be incorrect.

global

Boolean, specifies whether to run the specified task on all nodes (true), or the current one (false).

projFilter

Specifies the scope of the operation:

Empty string (''): Run the check on all projections.
A string that specifies one or more projections as follows:
- projection-name: Run the check on this projection
- projection-prefix*: Run the check on all projections that begin with the string projection-prefix.

numThreads

An unsigned (positive) or signed (negative) integer that specifies the number of threads used to run this operation:

n: Number of threads, ≥ 1
-n: Negative integer, denotes a fraction of all CPU cores as follows:
```
num-cores / n
```
Thus, -1 specifies all cores, -2, half the cores, -3, a third of all cores, and so on.

Default: 1
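
The numThreads rule above can be expressed as a small function. This Python sketch only models the arithmetic described here; the function name resolve_num_threads is hypothetical, and the use of integer division with a floor of one thread is an assumption, because the text does not say how fractional results are rounded.

```python
def resolve_num_threads(num_threads, num_cores):
    """Translate a numThreads argument into an actual thread count.

    Positive values are used as given. A negative value -n requests
    num_cores / n threads (assumed here: integer division, at least 1).
    """
    if num_threads == 0:
        raise ValueError("numThreads must be a nonzero integer")
    if num_threads > 0:
        return num_threads
    return max(1, num_cores // abs(num_threads))


# On a hypothetical 16-core node:
for arg in (4, -1, -2, -3):
    print(arg, resolve_num_threads(arg, 16))
```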

Privileges

Superuser

Optimizing performance

You can optimize meta-function performance by setting two parameters:

projFilter: Narrows the scope of the operation to one or more projections.
numThreads: Specifies the number of threads used to execute the function.

10.11 - SECURITY_CONFIG_CHECK

Returns the status of various security-related parameters. Use this function to verify completeness of your TLS configuration.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SECURITY_CONFIG_CHECK( 'db-component' )

Parameters

db-component: The component to check. Currently, NETWORK is the only supported component.
NETWORK: Returns the status and parameters for spread encryption, internode TLS, and client-server TLS.

Examples

In this example, SECURITY_CONFIG_CHECK shows that spread encryption and data channel TLS are disabled because EncryptSpreadComm is disabled and the data_channel TLS CONFIGURATION is not configured.

Similarly, client-server TLS is disabled because the TLS CONFIGURATION "server" has a server certificate, but its TLSMODE is disabled. Setting TLSMODE to 'Enable' enables server mode client-server TLS. See TLS protocol for details.

=> SELECT SECURITY_CONFIG_CHECK('NETWORK');
                                            SECURITY_CONFIG_CHECK
----------------------------------------------------------------------------------------------------------------------
Spread security details:
* EncryptSpreadComm = []
Spread encryption is disabled
It is NOT safe to set/change other security config parameters while spread is not encrypted!
Please set EncryptSpreadComm to enable spread encryption first

Data Channel security details:
 TLS Configuration 'data_channel' TLSMODE is DISABLE
TLS on the data channel is disabled
Please set EncryptSpreadComm and configure TLS Configuration 'data_channel' to enable TLS on the data channel

Client-Server network security details:
* TLS Configuration 'server' TLSMODE is DISABLE
* TLS Configuration 'server' has a certificate set
Client-Server TLS is disabled
To enable Client-Server TLS set a certificate on TLS Configuration 'server' and/or set the tlsmode to 'ENABLE' or higher

(1 row)

10.12 - SET_CONFIG_PARAMETER

Sets or clears a configuration parameter at the specified level.

Important

You can only use this function to set configuration parameters with string or integer values. To set configuration parameters that accept other data types, use the appropriate ALTER statement.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_CONFIG_PARAMETER( 'param-name', { param-value | NULL}, ['level'| NULL])

Arguments

param-name

Name of the configuration parameter to set. See Configuration parameters for details on supported parameters and valid settings.

param-value

Value to set for param-name, either a string or integer. If a string, enclose in single quotes; if an integer, single quotes are optional.

To clear param-name at the specified level, set to NULL.

level

Level at which to set param-name, one of the following string values:

user: Current user.
session: Current session, overrides the database setting.
node-name: Name of database node, overrides session and database settings.

If level is omitted or set to NULL, param-name is set at the database level.
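
The override order described for level can be modeled as a simple lookup. The following Python sketch is only an illustration of that order (node over session over database); the function effective_value and the settings dictionary are hypothetical and not part of Vertica. The user level is left out because this section does not state where it falls in the order.

```python
def effective_value(settings, node_name=None):
    """Return the value that applies, given values set at several levels.

    settings maps a level ('database', 'session', or a node name) to the
    value set at that level. Order follows the text above: a node-level
    value overrides session and database; a session value overrides database.
    """
    for level in (node_name, "session", "database"):
        if level is not None and level in settings:
            return settings[level]
    return None  # parameter not set at any modeled level


settings = {"database": 3600, "session": 2000}
print(effective_value(settings))

settings["v_vmart_node0001"] = 1000
print(effective_value(settings, node_name="v_vmart_node0001"))
```

Clearing a level, as SET_CONFIG_PARAMETER does when param-value is NULL, corresponds to deleting that key so that the next level in the order applies.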

Note

Some parameters require restart for the value to take effect.

Privileges

Examples

Set the AnalyzeRowCountInterval parameter to 3600 at the database level:

=> SELECT SET_CONFIG_PARAMETER('AnalyzeRowCountInterval',3600);
    SET_CONFIG_PARAMETER
----------------------------
 Parameter set successfully
(1 row)

Note

You can achieve the same result with ALTER DATABASE:

ALTER DATABASE DEFAULT SET PARAMETER AnalyzeRowCountInterval = 3600;

Set the MaxSessionUDParameterSize parameter to 2000 at the session level:

=> SELECT SET_CONFIG_PARAMETER('MaxSessionUDParameterSize',2000,'SESSION');
    SET_CONFIG_PARAMETER
----------------------------
 Parameter set successfully
(1 row)

10.13 - SET_SPREAD_OPTION

Changes spread daemon settings. This function is mainly used to set the timeout before spread assumes a node has gone down.

Important

Changing spread settings with SET_SPREAD_OPTION has a minor impact on your cluster, which pauses while the new settings are propagated to all nodes.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_SPREAD_OPTION( option-name, option-value )

Parameters

option-name: String containing the spread daemon setting to change.
Currently, this function supports only one option: TokenTimeout. This setting controls how long spread waits for a node to respond to a message before assuming it is lost. See Adjusting Spread Daemon timeouts for virtual environments for more information.
option-value: The new setting for option-name.

Examples

=> SELECT SET_SPREAD_OPTION( 'TokenTimeout', '35000');
NOTICE 9003:  Spread has been notified about the change
                   SET_SPREAD_OPTION
--------------------------------------------------------
 Spread option 'TokenTimeout' has been set to '35000'.

(1 row)

=> SELECT * FROM V_MONITOR.SPREAD_STATE;
    node_name     | token_timeout
------------------+---------------
 v_vmart_node0001 |         35000
 v_vmart_node0002 |         35000
 v_vmart_node0003 |         35000
(3 rows)

10.14 - SHUTDOWN

Shuts down a Vertica database. By default, the shutdown fails if any users are connected. You can check the status of the shutdown operation in the vertica.log file.

Tip

Before calling SHUTDOWN, you can close all current user connections and prevent further connection attempts as follows:

Temporarily set configuration parameter MaxClientSessions to 0.
Call CLOSE_ALL_SESSIONS to close all non-dbadmin connections.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SHUTDOWN ( [ 'false' | 'true' ] )

Parameters

false: Default, returns a message if users are connected and aborts the shutdown.
true: Forces the database to shut down, disallowing further connections.

Privileges

Superuser

Examples

The following command attempts to shut down the database. Because users are connected, the command fails:

=> SELECT SHUTDOWN('false');
NOTICE:  Cannot shut down while users are connected
          SHUTDOWN
-----------------------------
 Shutdown: aborting shutdown
(1 row)
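
The preparation steps from the tip above can be sketched as the following sequence. This is a minimal sketch, not a definitive procedure: it assumes that MaxClientSessions can be set at the database level with the ALTER DATABASE ... SET PARAMETER form shown elsewhere in this reference, and that you restore the parameter's previous value after the database restarts.

```sql
-- Step 1 (assumption: database-level setting): block new client connections.
ALTER DATABASE DEFAULT SET PARAMETER MaxClientSessions = 0;

-- Step 2: close the remaining non-dbadmin sessions.
SELECT CLOSE_ALL_SESSIONS();

-- Step 3: request a normal (non-forced) shutdown.
SELECT SHUTDOWN('false');
```

Because the parameter change is meant to be temporary, reset MaxClientSessions to the value your deployment normally uses once the database is back up.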

11 - Directed queries functions

The following meta-functions let you batch export query plans as directed queries from one Vertica database, and import those directed queries to another database.

11.1 - EXPORT_DIRECTED_QUERIES

Generates SQL for creating directed queries from a set of input queries.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

EXPORT_DIRECTED_QUERIES('input-file', '[output-file]')

Parameters

input-file: A SQL file that contains one or more input queries. See Input Format below for details on format requirements.
output-file: Specifies where to write the generated SQL for creating directed queries. If output-file already exists, EXPORT_DIRECTED_QUERIES returns with an error. If you supply an empty string, Vertica writes the SQL to standard output. See Output Format below for details.

Privileges

Input format

The input file that you supply to EXPORT_DIRECTED_QUERIES contains one or more input queries. For each input query, you can optionally specify two fields that are used in the generated directed query:

DirQueryName provides the directed query's unique identifier, a string that conforms to conventions described in Identifiers.
DirQueryComment specifies a quote-delimited string, up to 128 characters.

You format each input query as follows:

--DirQueryName=query-name
--DirQueryComment='comment'
input-query
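
As an illustration of this format, an input file entry might look like the following sketch. The query name and comment are hypothetical. The query itself is a simplified form of the input query that appears in the sample error report later in this section.

```sql
--DirQueryName=findBostonEmployees
--DirQueryComment='Employees based in Boston, ordered by job title'
SELECT employee_first_name, employee_last_name, job_title
  FROM public.employee_dimension
 WHERE employee_city = 'Boston'
 ORDER BY job_title;
```

You would then pass the path of this file as input-file, for example EXPORT_DIRECTED_QUERIES('/home/dbadmin/inputQueries', '/home/dbadmin/outputQueries'), where the input path is an assumed location.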

Output format

EXPORT_DIRECTED_QUERIES generates SQL for creating directed queries, and writes the SQL to the specified file or to standard output. In both cases, output conforms to the following format:

/* Query: directed-query-name */
/* Comment: directed-query-comment */
SAVE QUERY input-query;
CREATE DIRECTED QUERY CUSTOM 'directed-query-name'
COMMENT 'directed-query-comment'
OPTVER 'vertica-release-num'
PSDATE 'timestamp'
annotated-query

If a given input query omits DirQueryName and DirQueryComment fields, EXPORT_DIRECTED_QUERIES automatically generates the following output:

/* Query: Autoname:timestamp.n */, where n is a zero-based integer index that ensures uniqueness among auto-generated names with the same timestamp.
/* Comment: Optimizer-generated directed query */

Error handling

If any errors or warnings occur during EXPORT_DIRECTED_QUERIES execution, it returns with a message like this one:

1 queries successfully exported.
1 warning message was generated.
Queries exported to /home/dbadmin/outputQueries.
See error report, /home/dbadmin/outputQueries.err for details.

EXPORT_DIRECTED_QUERIES writes all errors and warnings to a file that it creates on the same path as the output file, and uses the output file's base name.

For example:

---------------------------------------------------------------------------------------------------
WARNING: Name field not supplied. Using auto-generated name: 'Autoname:2016-04-25 15:03:32.115317.0'
Input Query: SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name, employee_dimension.job_title FROM public.employee_dimension WHERE (employee_dimension.employee_city = 'Boston'::varchar(6)) ORDER BY employee_dimension.job_title;
END WARNING

Examples

See Exporting directed queries.

11.2 - IMPORT_DIRECTED_QUERIES

Imports to the database catalog directed queries from a SQL file that was generated by EXPORT_DIRECTED_QUERIES.

Imports to the database catalog directed queries from a SQL file that was generated by EXPORT_DIRECTED_QUERIES. If no directed queries are specified, Vertica lists all directed queries in the SQL file.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

IMPORT_DIRECTED_QUERIES( 'export-file'[, 'directed-query-name'[,...] ] )

Parameters

export-file: A SQL file generated by EXPORT_DIRECTED_QUERIES. When you run this file, Vertica creates the specified directed queries in the current database catalog.
directed-query-name: The name of a directed query that is defined in export-file. You can specify multiple comma-delimited directed query names.
If you omit this parameter, Vertica lists the names of all directed queries in export-file.

Privileges

Examples

See Importing directed queries.
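
The two calling styles described under Parameters can be sketched as follows. The export file path and the directed query name are assumptions used only for illustration. No output is shown because it depends on the contents of the export file.

```sql
-- Omit directed-query-name: Vertica lists the names of all directed queries in the file.
SELECT IMPORT_DIRECTED_QUERIES('/home/dbadmin/outputQueries');

-- Name one directed query (hypothetical name) to import it into the current database catalog.
SELECT IMPORT_DIRECTED_QUERIES('/home/dbadmin/outputQueries', 'findBostonEmployees');
```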

12 - Eon Mode functions

The following functions are meant to be used in Eon Mode.

12.1 - ALTER_LOCATION_SIZE

Resizes the depot on one node, all nodes in a subcluster, or all nodes in the database.

Eon Mode only

Resizes the depot on one node, all nodes in a subcluster, or all nodes in the database.

Important

Reducing the size of the depot is liable to increase contention over depot usage and require frequent evictions. This behavior can increase the number of queries and load operations that are routed to communal storage for processing, which can incur slower performance and increased access charges.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Immutable

Syntax

ALTER_LOCATION_SIZE( 'location', '[target]', 'size')

Parameters

location

Specifies the location to resize, one of the following:

depot: Resizes the node's current depot.
The depot's absolute path in the Linux filesystem. If you change the depot size on multiple nodes and specify a path, the path must be identical on all affected nodes. By default, this is not the case, because the node's name is typically part of this path. For example, the default depot path for node 1 in the verticadb database is /vertica/data/verticadb/v_verticadb_node0001_depot.

target

The node or nodes on which to change the depot, one of the following:

Node name: Resize the specified node.
Subcluster name: Resize depots of all nodes in the specified subcluster.
Empty string: Resize all depots in the database.

size

Valid only if the storage location usage type is set to DEPOT. Specifies the maximum amount of disk space that the depot can allocate from the storage location's file system.

You can specify size in two ways:

integer%: Percentage of storage location disk size.
integer{K|M|G|T}: Amount of storage location disk size in kilobytes, megabytes, gigabytes, or terabytes.

Important

The depot size cannot exceed 80 percent of the file system disk space where the depot is stored. If you specify a value that is too large, Vertica issues a warning and automatically changes the value to 80 percent of the file system size.

Privileges

Examples

Increase depot size on all nodes to 80 percent of file system:

=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
    node_name     | location_label  |      location_path      |  max_size   | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
 v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
 v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 36059377664 | 70%
 v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
(3 rows)

=> SELECT alter_location_size('depot', '','80%');
 alter_location_size
---------------------
 depotSize changed.
(1 row)

=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
    node_name     | location_label  |      location_path      |  max_size   | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
 v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
 v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 41210717184 | 80%
 v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
(3 rows)

Change the depot size to 75% of the filesystem size for all nodes in the analytics subcluster:

=> SELECT subcluster_name, subclusters.node_name, storage_locations.max_size, storage_locations.disk_percent FROM subclusters INNER JOIN storage_locations ON subclusters.node_name = storage_locations.node_name WHERE storage_locations.location_usage='DEPOT';
  subcluster_name   |      node_name       |   max_size  | disk_percent
--------------------+----------------------+-------------+--------------
 default_subcluster | v_verticadb_node0001 | 25264737485 | 60%
 default_subcluster | v_verticadb_node0002 | 25264737485 | 60%
 default_subcluster | v_verticadb_node0003 | 25264737485 | 60%
 analytics          | v_verticadb_node0004 | 25264737485 | 60%
 analytics          | v_verticadb_node0005 | 25264737485 | 60%
 analytics          | v_verticadb_node0006 | 25264737485 | 60%
 analytics          | v_verticadb_node0007 | 25264737485 | 60%
 analytics          | v_verticadb_node0008 | 25264737485 | 60%
 analytics          | v_verticadb_node0009 | 25264737485 | 60%
(9 rows)

=> SELECT ALTER_LOCATION_SIZE('depot','analytics','75%');
 ALTER_LOCATION_SIZE
---------------------
 depotSize changed.
(1 row)

=> SELECT subcluster_name, subclusters.node_name, storage_locations.max_size, storage_locations.disk_percent FROM subclusters INNER JOIN storage_locations ON subclusters.node_name = storage_locations.node_name WHERE storage_locations.location_usage='DEPOT';
  subcluster_name   |      node_name       |   max_size  | disk_percent
--------------------+----------------------+-------------+--------------
 default_subcluster | v_verticadb_node0001 | 25264737485 | 60%
 default_subcluster | v_verticadb_node0002 | 25264737485 | 60%
 default_subcluster | v_verticadb_node0003 | 25264737485 | 60%
 analytics          | v_verticadb_node0004 | 31580921856 | 75%
 analytics          | v_verticadb_node0005 | 31580921856 | 75%
 analytics          | v_verticadb_node0006 | 31580921856 | 75%
 analytics          | v_verticadb_node0007 | 31580921856 | 75%
 analytics          | v_verticadb_node0008 | 31580921856 | 75%
 analytics          | v_verticadb_node0009 | 31580921856 | 75%
(9 rows)
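
The examples above use the percentage form of size. The following sketch shows the other documented form, integer{K|M|G|T}, applied to a single node. The node name and the 30G value are assumptions. The requested size is still subject to the 80 percent limit described under Parameters.

```sql
-- Resize only one node's depot to an absolute size (hypothetical node and value).
SELECT ALTER_LOCATION_SIZE('depot', 'v_verticadb_node0004', '30G');

-- Check the result for that node.
SELECT node_name, max_size, disk_percent
  FROM storage_locations
 WHERE location_usage = 'DEPOT'
   AND node_name = 'v_verticadb_node0004';
```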

12.2 - BACKGROUND_DEPOT_WARMING

Vertica version 10.0.0 removes support for foreground depot warming.

Eon Mode only

Deprecated

Vertica version 10.0.0 removes support for foreground depot warming. When enabled, depot warming always happens in the background. Because foreground depot warming no longer exists, this function serves no purpose and has been deprecated. Calling it has no effect.

Forces a node that is warming its depot to start processing queries while continuing to warm its depot in the background. Depot warming only occurs when a node is joining the database and is activating its subscriptions. This function only has an effect if:

The database is running in Eon Mode.
The node is currently warming its depot.
The node is warming its depot from communal storage. This is the case when the UseCommunalStorageForBatchDepotWarming configuration parameter is set to the default value of 1. See Eon Mode parameters for more information about this parameter.

After calling this function, the node warms its depot in the background while taking part in queries.

This function has no effect on a node that is not warming its depot.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

BACKGROUND_DEPOT_WARMING('node-name' [, 'subscription-name'])

Arguments

node-name: The name of the node whose depot you want to warm in the background.
subscription-name: The name of a shard that the node subscribes to that you want the node to warm in the background. You can find the names of the shards a node subscribes to in the SHARD_NAME column of the NODE_SUBSCRIPTIONS system table.

Note
When you supply the name of a specific shard subscription to warm in the background, the node may not immediately begin processing queries. It continues to warm any other shard subscriptions in the foreground if they are not yet warm. The node does not begin taking part in queries until it finishes warming the other subscriptions.

Return value

A message indicating that the node's warming will continue in the background.

Privileges

The user must be a superuser.

Examples

The following example demonstrates having node 6 of the verticadb database warm its depot in the background:


=> SELECT BACKGROUND_DEPOT_WARMING('v_verticadb_node0006');
                          BACKGROUND_DEPOT_WARMING
----------------------------------------------------------------------------
 Depot warming running in background. Check monitoring tables for progress.
(1 row)

12.3 - CANCEL_DEPOT_WARMING

Cancels depot warming on a node.

Eon Mode only

Cancels depot warming on a node. Depot warming only occurs when a node is joining the database and is activating its subscriptions. You can choose to cancel all warming on the node, or cancel the warming of a specific shard's subscription. The node finishes whatever data transfers it is currently carrying out to warm its depot and removes pending warming-related transfers from its queue. It keeps any data it has already loaded into its depot. If you cancel warming for a specific subscription, the node stops warming its depot only if all of its other subscriptions are already warm. Otherwise, it continues to warm those other subscriptions.

This function only has an effect if:

The database is running in Eon Mode.
The node is currently warming its depot.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CANCEL_DEPOT_WARMING('node-name' [, 'subscription-name'])

Arguments

'node-name': The name of the node whose depot warming you want canceled.
'subscription-name': The name of a shard that the node subscribes to that you want the node to stop warming. You can find the names of the shards a node subscribes to in the SHARD_NAME column of the NODE_SUBSCRIPTIONS system table.

Return value

Returns a message indicating warming has been canceled.

Privileges

The user must be a superuser.

Usage considerations

Canceling depot warming can negatively impact the performance of your queries. A node with a cold depot may have to retrieve much of its data from communal storage, which is slower than accessing the depot.

Examples

The following demonstrates canceling the depot warming taking place on node 7:


=> SELECT CANCEL_DEPOT_WARMING('v_verticadb_node0007');
   CANCEL_DEPOT_WARMING
--------------------------
 Depot warming cancelled.
(1 row)
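
To cancel warming for a single shard subscription rather than for the whole node, you can supply the optional second argument. The following is a sketch only: the shard name segment0001 is an assumption, and the first query looks up the real names as described under Arguments.

```sql
-- Find the shards that node 7 subscribes to.
SELECT shard_name
  FROM node_subscriptions
 WHERE node_name = 'v_verticadb_node0007';

-- Cancel warming for one subscription only (assumed shard name).
SELECT CANCEL_DEPOT_WARMING('v_verticadb_node0007', 'segment0001');
```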

12.4 - CLEAN_COMMUNAL_STORAGE

Marks for deletion invalid data in communal storage, often data that leaked due to an event where Vertica cleanup mechanisms failed.

Eon Mode only

Marks for deletion invalid data in communal storage, often data that leaked due to an event where Vertica cleanup mechanisms failed. Events that require calling this function include:

Node failure
Interrupted migration of an Enterprise database to Eon
Restoring objects from backup

Tip

It is generally good practice to call CLEAN_COMMUNAL_STORAGE soon after completing an Enterprise-to-Eon migration, and reviving the migrated Eon database.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAN_COMMUNAL_STORAGE ( ['actually-delete'] )

Parameters

actually-delete

BOOLEAN, specifies whether to queue data files for deletion:

true (default): Add files to the reaper queue and return immediately. The queued files are removed automatically by the reaper service, or can be removed manually by calling FLUSH_REAPER_QUEUE.
false: Report information about extra files but do not queue them for deletion.

Privileges

Superuser

Examples

=> SELECT CLEAN_COMMUNAL_STORAGE('true');
CLEAN_COMMUNAL_STORAGE
------------------------------------------------------------------
CLEAN COMMUNAL STORAGE
Task was canceled.
Total leaked files: 9265
Total size: 4236501526
Files have been queued for deletion.
Check communal_cleanup_records for more information.
(1 row)
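
To review leaked files before queueing anything for deletion, you can call the function in its report-only mode, as described under Parameters. A minimal sketch follows. The output is omitted because it depends on the state of communal storage.

```sql
-- Report extra files without adding them to the reaper queue.
SELECT CLEAN_COMMUNAL_STORAGE('false');
```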

12.5 - CLEAR_DATA_DEPOT

Deletes the specified depot data.

Eon Mode only

Deletes the specified depot data. You can clear depot data of a single table or all tables, from one subcluster, a single node, or the entire database cluster. Clearing depot data has no effect on communal storage.

Note

Clearing depot data can incur extra processing time for any subsequent queries that require that data and must now fetch it from communal storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_DATA_DEPOT( [ '[table-name]' [, '[target-depots]'] ] )

Arguments

Note

To clear all depot data from the database cluster, call this function with no arguments.

table-name

Name of the table to delete from the target depots. If you omit a table name or supply an empty string, data of all tables is deleted from the target depots.

target-depots

Specifies to clear all data from the specified depots, one of the following:

subcluster-name: Clears depot data from the specified subcluster.
node-name: Clears depot data from the specified node. Depot data on other nodes in the same subcluster is unaffected.

This argument optionally qualifies the argument for table-name. If you omit this argument or supply an empty string, Vertica clears all depot data from the database cluster.

Privileges

Superuser

Examples

Clear all depot data for table t1 from the depot of subcluster subcluster_1:

=> SELECT CLEAR_DATA_DEPOT('t1', 'subcluster_1');
 clear_data_depot
------------------
 Depot cleared
(1 row)

Clear all depot data from subcluster subcluster_1:

=> SELECT CLEAR_DATA_DEPOT('', 'subcluster_1');
 clear_data_depot
------------------
 Depot cleared
(1 row)

Clear all depot data from a single node:

=> select clear_data_depot('','v_vmart_node0001');
 clear_data_depot
------------------
 Depot cleared
(1 row)

Clear all depot data for table t1 from the database cluster:

=> SELECT CLEAR_DATA_DEPOT('t1');
 clear_data_depot
------------------
 Depot cleared
(1 row)

Clear all depot data from the database cluster:

=> SELECT CLEAR_DATA_DEPOT();
 clear_data_depot
------------------
 Depot cleared
(1 row)

12.6 - CLEAR_DEPOT_PIN_POLICY_PARTITION

Clears a depot pinning policy from the specified table or projection partitions.

Eon Mode only

Clears a depot pinning policy from the specified table or projection partitions. After the object is unpinned, it can be evicted from the depot by any unpinned or pinned object.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_DEPOT_PIN_POLICY_PARTITION( '[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, subcluster ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
object-name: The table or projection with a partition pinning policy to clear.
min-range-value, max-range-value: Clears a pinning policy from the specified range of partition keys in the table, where min-range-value must be ≤ max-range-value. If the policy applies to a single partition, min-range-value and max-range-value must be equal.
subcluster: Clears the specified pinning policy from the subcluster depot. If you omit this parameter, the policy is cleared from all database depots.

Privileges

Superuser
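
Examples

The following sketch shows how the range arguments might be used. It assumes a hypothetical table public.store_orders that is partitioned by date and has pinning policies on the listed partition keys.

```sql
-- Clear the pinning policy from a range of partition keys on all database depots.
SELECT CLEAR_DEPOT_PIN_POLICY_PARTITION('public.store_orders', '2021-01-01', '2021-12-31');

-- Clear the policy from a single partition: both range values are equal.
SELECT CLEAR_DEPOT_PIN_POLICY_PARTITION('public.store_orders', '2021-06-01', '2021-06-01');
```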

12.7 - CLEAR_DEPOT_PIN_POLICY_PROJECTION

Clears a depot pinning policy from the specified projection.

Eon Mode only

Clears a depot pinning policy from the specified projection. After the object is unpinned, it can be evicted from the depot by any unpinned or pinned object.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_DEPOT_PIN_POLICY_PROJECTION( '[[database.]schema.]projection' [, 'subcluster' ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: Projection with a pinning policy to clear.
subcluster: Clears the specified pinning policy from the subcluster depot. If you omit this parameter, the policy is cleared from all database depots.

Privileges

Superuser
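
Examples

A minimal sketch, assuming a hypothetical projection named public.store_orders_super that currently has a pinning policy:

```sql
-- Clear the projection's pinning policy from all database depots.
SELECT CLEAR_DEPOT_PIN_POLICY_PROJECTION('public.store_orders_super');
```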

12.8 - CLEAR_DEPOT_PIN_POLICY_TABLE

Clears a depot pinning policy from the specified table.

Eon Mode only

Clears a depot pinning policy from the specified table. After the object is unpinned, it can be evicted from the depot by any unpinned or pinned object.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_DEPOT_PIN_POLICY_TABLE( '[[database.]schema.]table' [, 'subcluster' ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Table with a pinning policy to clear.
subcluster: Clears the specified pinning policy from the subcluster depot. If you omit this parameter, the policy is cleared from all database depots.

Privileges

Superuser
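
Examples

A minimal sketch, assuming a hypothetical table public.store_orders with a pinning policy and a subcluster named analytics:

```sql
-- Clear the table's pinning policy only from the depots of one subcluster.
SELECT CLEAR_DEPOT_PIN_POLICY_TABLE('public.store_orders', 'analytics');

-- Omit the subcluster to clear the policy from all database depots.
SELECT CLEAR_DEPOT_PIN_POLICY_TABLE('public.store_orders');
```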

12.9 - CLEAR_FETCH_QUEUE

Removes all entries, or only the entries for a specific transaction, from the queue of requests to fetch data from communal storage.

Eon Mode only

Removes all entries, or only the entries for a specific transaction, from the queue of requests to fetch data from communal storage. You can view the fetch queue by querying the DEPOT_FETCH_QUEUE system table. This function removes the queued requests synchronously: it returns only after all targeted fetches have been removed from the queue.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_FETCH_QUEUE([transaction_id])

Parameters

transaction_id: The ID of the transaction whose fetches will be cleared from the queue. If this value is not specified, all fetches are removed from the fetch queue.

Examples

This example clears all of the queued fetches for all transactions.

=> SELECT CLEAR_FETCH_QUEUE();
    CLEAR_FETCH_QUEUE
--------------------------
 Cleared the fetch queue.
(1 row)

This example clears the fetch queue for a specific transaction.

=> SELECT node_name,transaction_id FROM depot_fetch_queue;
      node_name       |  transaction_id
----------------------+-------------------
 v_verticadb_node0001 | 45035996273719510
 v_verticadb_node0003 | 45035996273719510
 v_verticadb_node0002 | 45035996273719510
 v_verticadb_node0001 | 45035996273719777
 v_verticadb_node0003 | 45035996273719777
 v_verticadb_node0002 | 45035996273719777
(6 rows)

=> SELECT clear_fetch_queue(45035996273719510);
    clear_fetch_queue
--------------------------
 Cleared the fetch queue.
(1 row)

=> SELECT node_name,transaction_id from depot_fetch_queue;
      node_name       |  transaction_id
----------------------+-------------------
 v_verticadb_node0001 | 45035996273719777
 v_verticadb_node0003 | 45035996273719777
 v_verticadb_node0002 | 45035996273719777
(3 rows)

12.10 - DEMOTE_SUBCLUSTER_TO_SECONDARY

Converts a primary subcluster to a secondary subcluster.

Eon Mode only

Converts a primary subcluster to a secondary subcluster.

Vertica will not allow you to demote a primary subcluster if any of the following are true:

The subcluster contains a critical node.
The subcluster is the only primary subcluster in the database. You must have at least one primary subcluster.
The initiator node is a member of the subcluster you are trying to demote. You must call DEMOTE_SUBCLUSTER_TO_SECONDARY from another subcluster.

Important

This function call can take a long time to complete because all of the nodes in the subcluster you are demoting take a global catalog lock, write a checkpoint, and then commit. This global catalog lock can cause other database tasks to fail with errors.

Schedule calls to this function to occur when other database activity is low.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DEMOTE_SUBCLUSTER_TO_SECONDARY('subcluster-name')

Parameters

subcluster-name: The name of the primary subcluster to demote to a secondary subcluster.

Privileges

Superuser

Examples

The following example demotes the subcluster analytics_cluster to a secondary subcluster:

=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | t
 load_subcluster   | t
(2 rows)

=> SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('analytics_cluster');
 DEMOTE_SUBCLUSTER_TO_SECONDARY
--------------------------------
 DEMOTE SUBCLUSTER TO SECONDARY
(1 row)

=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | f
 load_subcluster   | t
(2 rows)

Attempting to demote the subcluster that contains the initiator node results in an error:

=> SELECT node_name FROM sessions WHERE user_name = 'dbadmin'
   AND client_type = 'vsql';
      node_name
----------------------
 v_verticadb_node0004
(1 row)

=> SELECT node_name, is_primary FROM subclusters WHERE subcluster_name = 'analytics';
      node_name       | is_primary
----------------------+------------
 v_verticadb_node0004 | t
 v_verticadb_node0005 | t
 v_verticadb_node0006 | t
(3 rows)

=> SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('analytics');
ERROR 9204:  Cannot promote or demote subcluster including the initiator node
HINT:  Run this command on another subcluster

12.11 - FINISH_FETCHING_FILES

Fetches to the depot all files that are queued for download from communal storage.

Eon Mode only

Fetches to the depot all files that are queued for download from communal storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

FINISH_FETCHING_FILES()

Privileges

Superuser

Examples

Get all files queued for download:

=> SELECT FINISH_FETCHING_FILES();
      FINISH_FETCHING_FILES
---------------------------------
 Finished fetching all the files
(1 row)

12.12 - FLUSH_REAPER_QUEUE

Deletes all data marked for deletion in the database.

Eon Mode only

Deletes all data marked for deletion in the database. Use this function to remove all data marked for deletion before the reaper service deletes disk files.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

FLUSH_REAPER_QUEUE( [sync-catalog] )

Parameters

sync-catalog

Specifies to sync metadata in the database catalog on all nodes before the function executes:

true (default): Sync the database catalog.
false: Run without syncing.

Privileges

Superuser

Examples

Remove all files that are marked for deletion:

=> SELECT FLUSH_REAPER_QUEUE();
                 FLUSH_REAPER_QUEUE
-----------------------------------------------------
 Sync'd catalog and deleted all files in the reaper queue.
(1 row)
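
To skip the catalog sync described under Parameters, you can pass false. This sketch assumes that the argument is supplied as an unquoted Boolean, as the syntax line suggests. The output is omitted.

```sql
-- Delete queued files without first syncing the database catalog.
SELECT FLUSH_REAPER_QUEUE(false);
```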

12.13 - MIGRATE_ENTERPRISE_TO_EON

Migrates an Enterprise database to an Eon Mode database.

Enterprise Mode only

Migrates an Enterprise database to an Eon Mode database. MIGRATE_ENTERPRISE_TO_EON runs in the foreground; until it returns—either with success or an error—it blocks all operations in the same session on the source Enterprise database. If successful, MIGRATE_ENTERPRISE_TO_EON returns with a list of nodes in the migrated database.

If migration is interrupted before the meta-function returns—for example, the client disconnects, or a network outage occurs—the migration returns an error. In this case, call MIGRATE_ENTERPRISE_TO_EON again to restart migration. For details, see Handling Interrupted Migration.

You can repeat migration multiple times to the same communal storage location—for example, to capture changes that occurred in the source database during the previous migration. For details, see Repeating Migration.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

MIGRATE_ENTERPRISE_TO_EON ( 'communal-storage-location', 'depot-location' [, is-dry-run] )

communal-storage-location

URI of the communal storage location. For URI syntax examples for each supported scheme, see File systems and object stores.

depot-location

Path of Eon depot location, typically:

/vertica/depot

Important

Management Console requires this convention to enable access to depot data and activity.

is-dry-run

Boolean. If set to true, MIGRATE_ENTERPRISE_TO_EON only checks whether the Enterprise source database complies with all migration prerequisites. If the meta-function discovers any compliance issues, it writes these to the migration error log migrate_enterprise_to_eon_error.log in the database directory.

Default: false

Privileges

Superuser

Examples

Migrate an Enterprise database to Eon Mode on AWS:

=> SELECT MIGRATE_ENTERPRISE_TO_EON ('s3://verticadbbucket', '/vertica/depot');
                      migrate_enterprise_to_eon
---------------------------------------------------------------------
 v_vmart_node0001,v_vmart_node0002,v_vmart_node0003,v_vmart_node0004
(1 row)
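
Before running the migration itself, you can use the is-dry-run argument to check only the prerequisites. The following sketch reuses the bucket and depot path from the example above. It assumes that the Boolean is passed unquoted, as the syntax line suggests.

```sql
-- Dry run: checks migration prerequisites only.
SELECT MIGRATE_ENTERPRISE_TO_EON('s3://verticadbbucket', '/vertica/depot', true);
```

If the dry run finds compliance issues, they are written to migrate_enterprise_to_eon_error.log in the database directory, as described under is-dry-run.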

12.14 - PROMOTE_SUBCLUSTER_TO_PRIMARY

Converts a secondary subcluster to a primary subcluster.

Eon Mode only

Converts a secondary subcluster to a primary subcluster. You cannot use this function to promote the subcluster that contains the initiator node. You must call it while connected to a node in another subcluster.

Important

Schedule calls to this function to occur when other database activity is low.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

PROMOTE_SUBCLUSTER_TO_PRIMARY('subcluster-name')

Parameters

subcluster-name: The name of the secondary subcluster to promote to a primary subcluster.

Privileges

Superuser

Examples

The following example promotes the subcluster named analytics_cluster to a primary subcluster:

=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | f
 load_subcluster   | t
(2 rows)


=> SELECT PROMOTE_SUBCLUSTER_TO_PRIMARY('analytics_cluster');
 PROMOTE_SUBCLUSTER_TO_PRIMARY
-------------------------------
 PROMOTE SUBCLUSTER TO PRIMARY
(1 row)


=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | t
 load_subcluster   | t
(2 rows)

12.15 - REBALANCE_SHARDS

Rebalances shard assignments in a subcluster or across the entire cluster in Eon Mode.

Eon Mode only

Rebalances shard assignments in a subcluster or across the entire cluster in Eon Mode. If the current session ends, the operation immediately aborts. The amount of time required to rebalance shards scales in a roughly linear fashion based on the number of objects in your database.

Run REBALANCE_SHARDS after you modify your cluster using ALTER NODE or when you add nodes to a subcluster.

Note

Vertica rebalances shards in a subcluster automatically when you:

Remove a node from a subcluster.
Add a new subcluster with the admintools command db_add_subcluster with the -s option followed by a list of hosts.

After you rebalance shards, you will no longer be able to restore objects from a backup taken before the rebalancing. (Full backups are always possible.) After you rebalance, make another full backup so you will be able to restore objects from it in the future.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

REBALANCE_SHARDS(['subcluster-name'])

Parameters

subcluster-name: The name of the subcluster where shards will be rebalanced. If you do not supply this parameter, all subclusters in the database rebalance their shards.

Privileges

Superuser

Examples

The following example shows that the nodes in the newly-added analytics subcluster do not yet have shard subscriptions. It then calls REBALANCE_SHARDS to update the nodes' subscriptions:

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;

   subcluster_name    |      node_name       | shard_name  | subscription_state
----------------------+----------------------+-------------+--------------------
 analytics_subcluster | v_verticadb_node0004 |             |
 analytics_subcluster | v_verticadb_node0005 |             |
 analytics_subcluster | v_verticadb_node0006 |             |
 default_subcluster   | v_verticadb_node0001 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0003 | ACTIVE
(12 rows)


=> SELECT REBALANCE_SHARDS('analytics_subcluster');
 REBALANCE_SHARDS
-------------------
 REBALANCED SHARDS
(1 row)

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;

   subcluster_name    |      node_name       | shard_name  | subscription_state
----------------------+----------------------+-------------+--------------------
 analytics_subcluster | v_verticadb_node0004 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0004 | segment0001 | ACTIVE
 analytics_subcluster | v_verticadb_node0004 | segment0003 | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | segment0001 | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | segment0002 | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | segment0002 | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0003 | ACTIVE
(18 rows)
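
Per the parameter description, omitting subcluster-name rebalances shards in every subcluster of the database. A minimal sketch of that call, followed by the same verification query used in the example above:

```sql
-- Rebalance shard assignments in all subclusters (no subcluster-name argument).
=> SELECT REBALANCE_SHARDS();

-- Confirm that every node now has shard subscriptions.
=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;
```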

12.16 - SET_DEPOT_PIN_POLICY_PARTITION

Pins the specified partitions of a table or projection to a subcluster depot, or all database depots, to reduce exposure to depot eviction.

Eon Mode only

Pins the specified partitions of a table or projection to a subcluster depot, or all database depots, to reduce exposure to depot eviction.

Partition groups can be pinned only if all partitions within the group are pinned individually. If you alter or remove table partitioning, Vertica drops all partition pinning policies for that table. The table's pinning policy, if any, is unaffected.

For details on pinning policies and usage guidelines, see Pinning Depot Objects.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_DEPOT_PIN_POLICY_PARTITION ( '[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, 'subcluster' ] [, 'download' ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
object-name: Table or projection to pin. If you specify a projection, it must store the partition keys.

Note
After you pin a table or one of its projections to a subcluster, you cannot subsequently pin any of its partitions to that subcluster. Conversely, you can pin one or more table partitions to a subcluster, and then pin the table or one of its projections to that subcluster.
min-range-value max-range-value: Minimum and maximum value of partition keys in object-name to pin, where min-range-value must be ≤ max-range-value. To specify a single partition, min-range-value and max-range-value must be equal.

Note
If partition pinning policies on the same table specify overlapping key ranges, Vertica collates the partition ranges. For example, if you create two partition policies with key ranges of 1-3 and 2-4, Vertica creates a single pinning policy with a key range of 1-4.
subcluster: Sets this pinning policy on the subcluster depot. To set this policy on the default subcluster, specify default_subcluster. If you omit this parameter, the policy is set on all database depots.
download: Boolean, if set to true, SET_DEPOT_PIN_POLICY_PARTITION immediately queues the specified partitions for download from communal storage.
Default: false

Privileges

Superuser
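
As a sketch of the range-collation behavior described in the note on overlapping key ranges, the following calls set two overlapping partition pinning policies on the default subcluster. The table public.foo and its partition keys are hypothetical, and the final query assumes that the DEPOT_PIN_POLICIES system table is available in your Vertica version:

```sql
-- Pin partition keys 1-3 of the hypothetical table public.foo.
=> SELECT SET_DEPOT_PIN_POLICY_PARTITION('public.foo', '1', '3', 'default_subcluster');

-- Pin an overlapping key range, 2-4, on the same table and subcluster.
=> SELECT SET_DEPOT_PIN_POLICY_PARTITION('public.foo', '2', '4', 'default_subcluster');

-- Inspect the resulting policies (assumed system table).
=> SELECT * FROM depot_pin_policies;
```

According to the note, Vertica should collate the two ranges into a single policy that covers keys 1-4.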

Precedence of pinning policies

In general, partition management functions that involve two partitioned tables give precedence to the target table's pinning policy, as follows:

Function	Application of pinning policy
COPY_PARTITIONS_TO_TABLE	Partition-level pinning is reliable if the source and target tables have pinning policies on the same partition keys. If the two tables have different pinning policies, then the partition pinning policies of the target table apply.
MOVE_PARTITIONS_TO_TABLE	Partition-level pinning policies of the target table apply.
SWAP_PARTITIONS_BETWEEN_TABLES	Partition-level pinning policies of the target table apply.

For example, the following statement copies partitions from table foo to table bar:

=> SELECT COPY_PARTITIONS_TO_TABLE('foo', '1', '5', 'bar');

In this case, the following logic applies:

If the two tables have different partition pinning policies, then the pinning policy of target table bar for partition keys 1-5 applies.
If table bar does not exist, then Vertica creates it from table foo, and copies foo's policy on partition keys 1-5. Subsequently, if you clear the partition pinning policy from either table, it is also cleared from the other.

12.17 - SET_DEPOT_PIN_POLICY_PROJECTION

Pins a projection to a subcluster depot, or all database depots, to reduce its exposure to depot eviction.

Eon Mode only

Pins a projection to a subcluster depot, or all database depots, to reduce its exposure to depot eviction. For details on pinning policies and usage guidelines, see Pinning Depot Objects.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_DEPOT_PIN_POLICY_PROJECTION ( '[[database.]schema.]projection' [, 'subcluster' ] [, download ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: Projection to pin.

Note
After you pin a table to a subcluster, you cannot subsequently pin any of its projections to that subcluster. Conversely, you can pin one or more projections of a table to a subcluster, and then pin the table to that subcluster.
subcluster: Sets this pinning policy on the subcluster depot. To set this policy on the default subcluster, specify default_subcluster. If you omit this parameter, the policy is set on all database depots.
download: Boolean, if set to true, SET_DEPOT_PIN_POLICY_PROJECTION immediately queues the specified projection for download from communal storage.
Default: false

Privileges

Superuser
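
The following sketch shows the argument combinations that the syntax allows. The projection name public.foo_super is hypothetical, and each call is an independent alternative rather than a sequence to run in order:

```sql
-- Pin the projection on all database depots.
=> SELECT SET_DEPOT_PIN_POLICY_PROJECTION('public.foo_super');

-- Pin the projection on the depot of the default subcluster only.
=> SELECT SET_DEPOT_PIN_POLICY_PROJECTION('public.foo_super', 'default_subcluster');

-- Pin on the default subcluster and queue the projection for immediate
-- download from communal storage.
=> SELECT SET_DEPOT_PIN_POLICY_PROJECTION('public.foo_super', 'default_subcluster', true);
```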

12.18 - SET_DEPOT_PIN_POLICY_TABLE

Pins a table to a subcluster depot, or all database depots, to reduce its exposure to depot eviction.

Eon Mode only

Pins a table to a subcluster depot, or all database depots, to reduce its exposure to depot eviction. For details on pinning policies and usage guidelines, see Pinning Depot Objects.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_DEPOT_PIN_POLICY_TABLE ( '[[database.]schema.]table' [, 'subcluster' ] [, download ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Table to pin.

Note
After you pin a table to a subcluster, you cannot subsequently pin any of its partitions or projections to that subcluster. Conversely, you can pin one or more partitions or projections of a table to a subcluster, and then pin the table to that subcluster.
subcluster: Sets this pinning policy on the subcluster depot. To set this policy on the default subcluster, specify default_subcluster. If you omit this parameter, the policy is set on all database depots.
download: Boolean, if set to true, SET_DEPOT_PIN_POLICY_TABLE immediately queues the specified table for download from communal storage.
Default: false

Privileges

Superuser
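
The ordering rule in the note above can be sketched as follows. The table public.foo and its partition keys are hypothetical:

```sql
-- Allowed order: first pin a partition range to the subcluster...
=> SELECT SET_DEPOT_PIN_POLICY_PARTITION('public.foo', '1', '5', 'default_subcluster');

-- ...then pin the whole table to the same subcluster.
=> SELECT SET_DEPOT_PIN_POLICY_TABLE('public.foo', 'default_subcluster');
```

Per the note, the reverse order is not permitted: after the table is pinned to a subcluster, you cannot pin its partitions or projections to that subcluster.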

12.19 - SHUTDOWN_SUBCLUSTER

Shuts down a subcluster.

Eon Mode only

Shuts down a subcluster. This function shuts down the subcluster synchronously, returning when shutdown is complete with the message Subcluster shutdown. If the subcluster is already down, the function returns with no error.

Caution

This function does not test whether the target subcluster is critical (a subcluster whose loss would cause the database to shut down). Using this function to shut down a critical subcluster results in the database shutting down. Always verify that the subcluster you want to shut down is not critical by querying the CRITICAL_SUBCLUSTERS system table before calling this function.

Important

Stopping a subcluster does not warn you if there are active user sessions connected to the subcluster. This behavior is the same as stopping an individual node. Before stopping a subcluster, verify that no users are connected to it.
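
The two checks recommended above can be sketched as follows. The subcluster name analytics is taken from the example later in this section, and the session query assumes that the SESSIONS system table exposes node_name, user_name, and session_id columns:

```sql
-- Any subcluster listed here is critical; shutting it down shuts down the database.
=> SELECT * FROM critical_subclusters;

-- List user sessions connected to nodes in the target subcluster.
-- If you are connected to that subcluster, your own session also appears.
=> SELECT s.node_name, s.user_name, s.session_id
   FROM sessions s JOIN nodes n ON s.node_name = n.node_name
   WHERE n.subcluster_name = 'analytics';
```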

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SHUTDOWN_SUBCLUSTER('subcluster-name')

Arguments

subcluster-name: Name of the subcluster to shut down.

Privileges

Superuser

Examples

The following example demonstrates shutting down the subcluster analytics:

=> SELECT subcluster_name, node_name, node_state FROM nodes order by 1,2;
  subcluster_name   |      node_name       | node_state
--------------------+----------------------+------------
 analytics          | v_verticadb_node0004 | UP
 analytics          | v_verticadb_node0005 | UP
 analytics          | v_verticadb_node0006 | UP
 default_subcluster | v_verticadb_node0001 | UP
 default_subcluster | v_verticadb_node0002 | UP
 default_subcluster | v_verticadb_node0003 | UP
(6 rows)

=> SELECT SHUTDOWN_SUBCLUSTER('analytics');
WARNING 4539:  Received no response from v_verticadb_node0004 in stop subcluster
WARNING 4539:  Received no response from v_verticadb_node0005 in stop subcluster
WARNING 4539:  Received no response from v_verticadb_node0006 in stop subcluster
 SHUTDOWN_SUBCLUSTER
---------------------
 Subcluster shutdown
(1 row)

=> SELECT subcluster_name, node_name, node_state FROM nodes order by 1,2;
  subcluster_name   |      node_name       | node_state
--------------------+----------------------+------------
 analytics          | v_verticadb_node0004 | DOWN
 analytics          | v_verticadb_node0005 | DOWN
 analytics          | v_verticadb_node0006 | DOWN
 default_subcluster | v_verticadb_node0001 | UP
 default_subcluster | v_verticadb_node0002 | UP
 default_subcluster | v_verticadb_node0003 | UP
(6 rows)

Note

The "WARNING 4539" messages after calling SHUTDOWN_SUBCLUSTER occur because the nodes are in the process of shutting down. They are expected.

12.20 - START_REAPING_FILES

Starts the disk file deletion in the background as an asynchronous function.

Eon Mode only

Starts the disk file deletion in the background as an asynchronous function. By default, this meta-function syncs the catalog before beginning deletion. Disk file deletion is handled in the foreground by FLUSH_REAPER_QUEUE.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

START_REAPING_FILES( [sync-catalog] )

Parameters

sync-catalog

Specifies whether to sync metadata in the database catalog on all nodes before the function executes:

true (default): Sync the database catalog.
false: Run without syncing.

Privileges

Superuser

Examples

Start the reaper service:

=> SELECT START_REAPING_FILES();

Start the reaper service and skip the initial catalog sync:

=> SELECT START_REAPING_FILES(false);

12.21 - SYNC_CATALOG

Synchronizes the catalog to communal storage to enable reviving the current catalog version in the case of an imminent crash.

Eon Mode only

Synchronizes the catalog to communal storage to enable reviving the current catalog version in the case of an imminent crash. Vertica synchronizes all pending checkpoint and transaction logs to communal storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SYNC_CATALOG( [ 'node-name' ] )

Parameters

node-name: The node to synchronize. If you omit this argument, Vertica synchronizes the catalog on all nodes.

Privileges

Superuser

Examples

Synchronize the catalog on all nodes:

=> SELECT SYNC_CATALOG();

Synchronize the catalog on one node:

=> SELECT SYNC_CATALOG( 'node001' );

13 - Epoch management functions

This section contains the epoch management functions specific to Vertica.

13.1 - ADVANCE_EPOCH

Manually closes the current epoch and begins a new epoch.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ADVANCE_EPOCH ( [ integer ] )

Parameters

integer: Specifies the number of epochs to advance.

Privileges

Superuser

Notes

This function is primarily maintained for backward compatibility with earlier versions of Vertica.

Examples

The following command increments the epoch number by 1:

=> SELECT ADVANCE_EPOCH(1);
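
To observe the effect, you can compare the current epoch before and after the call. This is a minimal sketch; the epoch numbers that you see depend on your database:

```sql
-- Note the current epoch number.
=> SELECT GET_CURRENT_EPOCH();

-- Close the current epoch and begin a new one.
=> SELECT ADVANCE_EPOCH(1);

-- The reported epoch number should now be higher.
=> SELECT GET_CURRENT_EPOCH();
```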

13.2 - GET_AHM_EPOCH

Returns the number of the epoch in which the Ancient History Mark is located.

Returns the number of the epoch in which the Ancient History Mark is located. Data deleted up to and including the AHM epoch can be purged from physical storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_AHM_EPOCH()

Note

The AHM epoch is 0 (zero) by default (purge is disabled).

Privileges

None

Examples

=> SELECT GET_AHM_EPOCH();
    GET_AHM_EPOCH
----------------------
 Current AHM epoch: 0
(1 row)

13.3 - GET_AHM_TIME

Returns a TIMESTAMP value representing the Ancient History Mark.

Returns a TIMESTAMP value representing the Ancient History Mark. Data deleted up to and including the AHM epoch can be purged from physical storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_AHM_TIME()

Privileges

None

Examples

=> SELECT GET_AHM_TIME();
                  GET_AHM_TIME
-------------------------------------------------
 Current AHM Time: 2010-05-13 12:48:10.532332-04
(1 row)

13.4 - GET_CURRENT_EPOCH

Returns the number of the current epoch.

Returns the number of the current epoch, which is the epoch into which data (COPY, INSERT, UPDATE, and DELETE operations) is currently being written.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_CURRENT_EPOCH()

Privileges

None

Examples

=> SELECT GET_CURRENT_EPOCH();
 GET_CURRENT_EPOCH
-------------------
               683
(1 row)

13.5 - GET_LAST_GOOD_EPOCH

Returns the last good epoch number.

Returns the last good epoch number. If the database has no projections, the function returns an error.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_LAST_GOOD_EPOCH()

Privileges

None

Examples

=> SELECT GET_LAST_GOOD_EPOCH();
 GET_LAST_GOOD_EPOCH
---------------------
                 682
(1 row)

13.6 - MAKE_AHM_NOW

Sets the Ancient History Mark (AHM) to the greatest allowable value.

Sets the Ancient History Mark (AHM) to the greatest allowable value. This lets you purge all deleted data.

Caution

After running this function, you cannot query historical data that precedes the current epoch. Only database administrators should use this function.

MAKE_AHM_NOW performs the following operations:

Advances the epoch.
Sets the AHM to the last good epoch (LGE) — at least to the epoch that is current when you execute MAKE_AHM_NOW.
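
The two operations listed above can be observed with the epoch functions documented in this section. This is a minimal sketch that assumes all nodes are up, so no argument is needed:

```sql
-- AHM epoch before the call.
=> SELECT GET_AHM_EPOCH();

-- Advance the epoch and move the AHM to the greatest allowable value.
=> SELECT MAKE_AHM_NOW();

-- After the call, the AHM epoch should match the last good epoch.
=> SELECT GET_AHM_EPOCH();
=> SELECT GET_LAST_GOOD_EPOCH();
```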

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

MAKE_AHM_NOW ( [ true ] )

Parameters

true

Allows AHM to advance when one of the following conditions is true:

One or more nodes are down.
One projection is being refreshed from another (retentive refresh).

In both cases, you must supply this argument to MAKE_AHM_NOW; otherwise, Vertica returns an error. If you execute MAKE_AHM_NOW(true) during retentive refresh, Vertica rolls back the refresh operation and advances the AHM.

Caution

If the function advances AHM beyond the last good epoch of the down nodes, those nodes must recover all data from scratch.

Privileges

Superuser

Setting AHM when nodes are down

If any node in the cluster is down, you must call MAKE_AHM_NOW with an argument of true; otherwise, the function returns an error.

Note

This requirement applies only to Enterprise mode; in Eon mode, it is ignored.

In the following example, MAKE_AHM_NOW advances the AHM even though a node is down:

=> SELECT MAKE_AHM_NOW(true);
WARNING:  Received no response from v_vmartdb_node0002 in get cluster LGE
WARNING:  Received no response from v_vmartdb_node0002 in get cluster LGE
WARNING:  Received no response from v_vmartdb_node0002 in set AHM
         MAKE_AHM_NOW
------------------------------
 AHM set (New AHM Epoch: 684)
(1 row)

13.7 - SET_AHM_EPOCH

Sets the Ancient History Mark (AHM) to the specified epoch.

Sets the Ancient History Mark (AHM) to the specified epoch. This function allows deleted data up to and including the AHM epoch to be purged from physical storage.

SET_AHM_EPOCH is normally used for testing purposes. For general use, consider SET_AHM_TIME instead, which is easier to use.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_AHM_EPOCH ( epoch [, true ] )

Parameters

epoch

Specifies one of the following:

The number of the epoch in which to set the AHM
Zero (0) (the default) disables PURGE

Important

The number of the specified epoch must be:

Greater than the current AHM epoch
Less than the current epoch

Query the SYSTEM table to view current epoch values relative to the AHM.

true

Allows the AHM to advance when nodes are down.

Caution

If you advance AHM beyond the last good epoch of the down nodes, those nodes must recover all data from scratch.

Privileges

Superuser

Setting AHM when nodes are down

If any node in the cluster is down, you must call SET_AHM_EPOCH with an argument of true; otherwise, the function returns an error.

Note

This requirement applies only to Enterprise mode; in Eon mode, it is ignored.

Examples

The following command sets the AHM to a specified epoch of 12:

=> SELECT SET_AHM_EPOCH(12);

The following command sets the AHM to a specified epoch of 2 and allows the AHM to advance despite a failed node:

=> SELECT SET_AHM_EPOCH(2, true);
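
Before choosing an epoch, you can check the constraints described under Parameters by querying the SYSTEM table. This is a minimal sketch; the value 12 is illustrative only and must fall between the ahm_epoch and current_epoch values that your database reports:

```sql
-- View the current epoch values relative to the AHM.
=> SELECT current_epoch, ahm_epoch, last_good_epoch FROM system;

-- Choose an epoch such that ahm_epoch < epoch < current_epoch, then set the AHM.
=> SELECT SET_AHM_EPOCH(12);
```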

13.8 - SET_AHM_TIME

Sets the Ancient History Mark (AHM) to the epoch corresponding to the specified time on the initiator node.

Sets the Ancient History Mark (AHM) to the epoch corresponding to the specified time on the initiator node. This function allows historical data up to and including the AHM epoch to be purged from physical storage. SET_AHM_TIME returns a TIMESTAMPTZ that represents the end point of the AHM epoch.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_AHM_TIME ( time [, true ] )

Parameters

time: A TIMESTAMP/TIMESTAMPTZ value that is automatically converted to the appropriate epoch number.
true: Allows the AHM to advance when nodes are down.

Caution
If you advance AHM beyond the last good epoch of the down nodes, those nodes must recover all data from scratch.

Privileges

Superuser

Setting AHM when nodes are down

If any node in the cluster is down, you must call SET_AHM_TIME with an argument of true; otherwise, the function returns an error.

Note

This requirement applies only to Enterprise mode; in Eon mode, it is ignored.

Examples

Epochs depend on a configured epoch advancement interval. If an epoch includes a three-minute range of time, the purge operation is accurate only to within minus three minutes of the specified timestamp:

=> SELECT SET_AHM_TIME('2008-02-27 18:13');
           set_ahm_time
------------------------------------
 AHM set to '2008-02-27 18:11:50-05'
(1 row)

Note

The -05 part of the output string is a time zone value, an offset in hours from UTC (Coordinated Universal Time, traditionally known as Greenwich Mean Time, or GMT).

In the previous example, the actual AHM epoch ends at 18:11:50, roughly one minute before the specified timestamp. This is because SET_AHM_TIME selects the epoch that ends at or before the specified timestamp. It does not select the epoch that ends after the specified timestamp because that would purge data deleted as much as three minutes after the AHM.

For example, using only hours and minutes, suppose that epoch 9000 runs from 08:50 to 11:50 and epoch 9001 runs from 11:50 to 15:50. SET_AHM_TIME('11:51') chooses epoch 9000 because it ends roughly one minute before the specified timestamp.
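
The selection rule described above (the epoch that ends at or before the specified timestamp) can be approximated with a query. This is only a sketch: it assumes that the EPOCHS system table is available with epoch_number and epoch_close_time columns, and that the table still lists the epoch in question, because it holds only recently closed epochs. It illustrates the rule and is not a description of how Vertica implements SET_AHM_TIME:

```sql
-- Find the most recent epoch that closed at or before the target timestamp.
=> SELECT epoch_number, epoch_close_time
   FROM epochs
   WHERE epoch_close_time <= '2008-02-27 18:13'
   ORDER BY epoch_close_time DESC
   LIMIT 1;
```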

In the next example, suppose that a node went down at 11:00:00 AM on January 1st 2017. At noon, you want to advance the AHM to 11:15:00, but the node is still down.

Suppose you try to set the AHM using this command:

=> SELECT SET_AHM_TIME('2017-01-01 11:15:00');

Vertica returns an error, because it prevents the AHM from advancing past the down node's last good epoch. You can force the AHM to advance by supplying the optional second parameter:

=> SELECT SET_AHM_TIME('2017-01-01 11:15:00', true);

However, if you force the AHM past the last good epoch, the failed node will have to recover from scratch.

14 - Flex table data functions

The flex table data helper functions supply information you need to directly query data in flex tables.

The flex table data helper functions supply information you need to directly query data in flex tables. After you compute keys and create views from the raw data, you can use field names directly in queries instead of using map functions to extract data.

In addition to these data meta-functions, there are flex functions that are not meta-functions.

Function	Description
COMPUTE_FLEXTABLE_KEYS	Computes map keys from the map data in a flex table and populates a keys table with the results. Use this function before building a view.
BUILD_FLEXTABLE_VIEW	Uses the keys in a table to create a view definition for the source table. Use this function after computing flex table keys.
COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW	Performs both of the preceding functions in one call.
MATERIALIZE_FLEXTABLE_COLUMNS	Materializes a specified number of columns.
RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW	Replaces the flextable_data_keys table and the flextable_data_view view, linking both the keys table and the view to the parent flex table.
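
The order of operations in the table above can be sketched as follows, using the darkdata flex table that appears in the examples later in this section:

```sql
-- Two-step approach: compute the keys, then build the view from them.
=> SELECT COMPUTE_FLEXTABLE_KEYS('darkdata');
=> SELECT BUILD_FLEXTABLE_VIEW('darkdata');

-- Equivalent single call that performs both steps.
=> SELECT COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW('darkdata');
```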

Flex table dependencies

Each flex table has two dependent objects, a keys table and a view. While both objects are dependent on their parent table, you can drop either object independently. Dropping the parent table removes both dependents, without a CASCADE option.

Associating flex tables and views

The helper functions automatically use the dependent table and view if they are internally linked with the parent table. You create both when you create the flex table. You can drop either the keys table or the view and re-create objects of the same name. However, if you do so, the new objects are not internally linked with the parent flex table.

In this case, you can restore the internal links of these objects to the parent table. To do so, drop the keys table and the view before calling the RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW function. Calling this function re-creates the keys table and view.
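
A minimal sketch of that restore sequence, assuming a flex table named darkdata whose keys table and view use the default names darkdata_keys and darkdata_view:

```sql
-- Drop the unlinked keys table and view.
=> DROP TABLE darkdata_keys;
=> DROP VIEW darkdata_view;

-- Re-create both objects and link them to the parent flex table.
=> SELECT RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW('darkdata');
```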

The remaining helper functions perform the tasks described in this section.

14.1 - BUILD_FLEXTABLE_VIEW

Creates, or re-creates, a view for a default or user-defined keys table, ignoring any empty keys.

Note

If the length of a key exceeds 65,000, Vertica truncates the key.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

BUILD_FLEXTABLE_VIEW ('[[database.]schema.]flex-table'
    [ [,'view-name'] [,'user-keys-table'] ])

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
flex-table: The flex table name. By default, this function builds or rebuilds a view for the input table with the current contents of the associated flex_table_keys table.
view-name: A custom view name. Use this option to build a new view for flex-table with the name you specify.
user-keys-table: Name of a keys table from which to create the view. Use this option if you created a custom keys table from the flex table map data, rather than from the default flex_table_keys table. The function builds a view from the keys in user-keys-table, rather than from flex_table_keys.

Examples

The following examples show how to call BUILD_FLEXTABLE_VIEW with 1, 2, or 3 arguments.

To create, or re-create, a default view:

Call the function with an input flex table:

=> SELECT BUILD_FLEXTABLE_VIEW('darkdata');
                  build_flextable_view
-----------------------------------------------------
 The view public.darkdata_view is ready for querying
(1 row)

The function creates a view with the default name (darkdata_view) from the darkdata_keys table.

Query a key name from the new or updated view:

=> SELECT "user.id" FROM darkdata_view;
  user.id
-----------
 340857907
 727774963
 390498773
 288187825
 164464905
 125434448
 601328899
 352494946
(12 rows)

To create, or re-create, a view with a custom name:

Call the function with two arguments, an input flex table, darkdata, and the name of the view to create, dd_view:

=> SELECT BUILD_FLEXTABLE_VIEW('darkdata', 'dd_view');
            build_flextable_view
-----------------------------------------------
 The view public.dd_view is ready for querying
(1 row)

Query a key name (user.lang) from the new or updated view (dd_view):

=> SELECT "user.lang" FROM dd_view;
 user.lang
-----------
 tr
 en
 es
 en
 en
 it
 es
 en
(12 rows)

To create a view from a custom keys table with BUILD_FLEXTABLE_VIEW, the custom table must have the same schema and table definition as the default table (darkdata_keys). Create a custom keys table, using any of these three approaches:

Create a columnar table with all keys from the default keys table for a flex table (darkdata_keys):
```
=> CREATE TABLE new_darkdata_keys AS SELECT * FROM darkdata_keys;
CREATE TABLE
```

Create a columnar table without content (LIMIT 0) from the default keys table for a flex table (darkdata_keys):

=> CREATE TABLE new_darkdata_keys AS SELECT * FROM darkdata_keys LIMIT 0;
CREATE TABLE
=> SELECT * FROM new_darkdata_keys;
 key_name | frequency | data_type_guess
----------+-----------+-----------------
(0 rows)

Create a columnar table without content (LIMIT 0) from the default keys table, and insert two values ('user.lang', 'user.name') into the key_name column:

=> CREATE TABLE dd_keys AS SELECT * FROM darkdata_keys limit 0;
CREATE TABLE
=> INSERT INTO dd_keys (key_name) values ('user.lang');
 OUTPUT
--------
      1
(1 row)
=> INSERT INTO dd_keys (key_name) values ('user.name');
 OUTPUT
--------
      1
(1 row)
=> SELECT * FROM dd_keys;
 key_name  | frequency | data_type_guess
-----------+-----------+-----------------
 user.lang |           |
 user.name |           |
(2 rows)

After creating a custom keys table, call BUILD_FLEXTABLE_VIEW with all arguments (an input flex table, the new view name, the custom keys table):

=> SELECT BUILD_FLEXTABLE_VIEW('darkdata', 'dd_view', 'dd_keys');
            build_flextable_view
-----------------------------------------------
 The view public.dd_view is ready for querying
(1 row)

Query the new view:

=> SELECT * FROM dd_view;

14.2 - COMPUTE_FLEXTABLE_KEYS

Computes the virtual columns (keys and values) from the flex table VMap data.

Computes the virtual columns (keys and values) from the flex table VMap data. Use this function to compute keys without creating an associated table view. To also build a view, use COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW.

Note

If the length of a key exceeds 65,000, Vertica truncates the key.

The function stores its results in the associated flex keys table (flexTableName_keys), which has the following columns:

key_name
frequency
data_type_guess

For more information, see Computing flex table keys.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

COMPUTE_FLEXTABLE_KEYS ('[[database.]schema.]flex-table')

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
flex-table: Name of a flex table.

Using data type guessing

The results in the data_type_guess column depend on the EnableBetterFlexTypeGuessing configuration parameter. By default, the parameter is 1 (ON). This setting results in the function returning all non-string keys in the data_type_guess column as one of the following types (and others listed in Data types):

BOOLEAN
INTEGER
FLOAT
TIMESTAMP
DATE

Setting the configuration parameter to 0 (OFF) results in the function returning only string types ([LONG] VARCHAR or [LONG] VARBINARY) for all values in the data_type_guess column of the keys table.

Assigning flex key data types

Use the sample CSV data in this section to compare the results of using or not using the EnableBetterFlexTypeGuessing configuration parameter. When the parameter is ON, the function determines the non-string data types of keys in your map data more accurately. The default for the parameter is 1 (ON).

Year,Quarter,Region,Species,Grade,Pond Value,Number of Quotes,Available
2015,1,2 - Northwest Oregon & Willamette,Douglas-fir,1P,$615.12 ,12,No
2015,1,2 - Northwest Oregon & Willamette,Douglas-fir,SM,$610.78 ,12,Yes
2015,1,2 - Northwest Oregon & Willamette,Douglas-fir,2S,$596.00 ,20,Yes
2015,1,2 - Northwest Oregon & Willamette,Hemlock,P,$520.00 ,6,Yes
2015,1,2 - Northwest Oregon & Willamette,Hemlock,SM,$510.00 ,6,No
2015,1,2 - Northwest Oregon & Willamette,Hemlock,2S,$490.00 ,14,No

To compare the data type assignment results, complete the following steps:

Save the CSV data file (here, as trees.csv).

Create a flex table (trees) and load trees.csv using the fcsvparser:

=> CREATE FLEX TABLE trees();
=> COPY trees FROM '/home/dbadmin/tempdat/trees.csv' PARSER fcsvparser();

Use COMPUTE_FLEXTABLE_KEYS with the trees flex table.

=> SELECT COMPUTE_FLEXTABLE_KEYS('trees');
            COMPUTE_FLEXTABLE_KEYS
-----------------------------------------------
 Please see public.trees_keys for updated keys
(1 row)

Query the trees_keys table output:

=> SELECT * FROM trees_keys;
     key_name     | frequency | data_type_guess
------------------+-----------+-----------------
 Year             |         6 | Integer
 Quarter          |         6 | Integer
 Region           |         6 | Varchar(66)
 Available        |         6 | Boolean
 Number of Quotes |         6 | Integer
 Grade            |         6 | Varchar(20)
 Species          |         6 | Varchar(22)
 Pond Value       |         6 | Numeric(8,3)
(8 rows)

Set the EnableBetterFlexTypeGuessing parameter to 0 (OFF).
Call COMPUTE_FLEXTABLE_KEYS with the trees flex table again.
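
One way to carry out these two steps is sketched below. It assumes you change the parameter at the database level with ALTER DATABASE; your site may set configuration parameters differently:

=> -- Assumption: the parameter is set at the database level
=> ALTER DATABASE DEFAULT SET PARAMETER EnableBetterFlexTypeGuessing = 0;
=> SELECT COMPUTE_FLEXTABLE_KEYS('trees');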

Query the trees_keys table to compare the data_type_guess values with the previous results. With the configuration parameter set to 0 (OFF), all keys that were previously assigned non-string data types are now VARCHARs of various lengths:


=> SELECT * FROM trees_keys;
     key_name     | frequency | data_type_guess
------------------+-----------+-----------------
 Year             |         6 | varchar(20)
 Quarter          |         6 | varchar(20)
 Region           |         6 | varchar(66)
 Available        |         6 | varchar(20)
 Grade            |         6 | varchar(20)
 Number of Quotes |         6 | varchar(20)
 Pond Value       |         6 | varchar(20)
 Species          |         6 | varchar(22)
(8 rows)

To maintain accurate results for non-string data types, set the EnableBetterFlexTypeGuessing parameter back to 1 (ON).

For more information about the EnableBetterFlexTypeGuessing configuration parameter, see EnableBetterFlexTypeGuessing.

14.3 - COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW

Combines the functionality of BUILD_FLEXTABLE_VIEW and COMPUTE_FLEXTABLE_KEYS to compute virtual columns (keys) from the VMap data of a flex table and construct a view. Creating a view with this function ignores empty keys. If you do not need to perform both operations together, use one of the single-operation functions instead.

Note

If the length of a key exceeds 65,000, Vertica truncates the key.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW ('flex-table')

Arguments

flex-table: Name of a flex table

Examples

This example shows how to call the function for the darkdata flex table.

=> SELECT COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW('darkdata');
               compute_flextable_keys_and_build_view
-----------------------------------------------------------------------
 Please see public.darkdata_keys for updated keys
The view public.darkdata_view is ready for querying
(1 row)

14.4 - MATERIALIZE_FLEXTABLE_COLUMNS

Materializes virtual columns listed as key_names in the flextable_keys table you compute using either COMPUTE_FLEXTABLE_KEYS or COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW.

Note

Each column that you materialize with this function counts against the data storage limit of your license. To check your Vertica license compliance, call the AUDIT() or AUDIT_FLEX() functions.
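
As a minimal sketch, the following call audits a single flex table. The table name twitter_r is borrowed from the example later in this section and is only illustrative:

=> -- Assumption: twitter_r is an existing flex table
=> SELECT AUDIT_FLEX('twitter_r');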

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

MATERIALIZE_FLEXTABLE_COLUMNS ('[[database.]schema.]flex-table' [, n-columns [, keys-table-name] ])

Arguments

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

flex-table

The name of the flex table with columns to materialize. The function:

Skips any columns already materialized
Ignores any empty keys

n-columns

The number of columns to materialize, up to 9800. The function attempts to materialize this number of columns from the keys table, skipping any columns already materialized. It orders the materialized results by frequency, descending. If not specified, the default is a maximum of 50 columns.

keys-table-name

The name of a keys table from which to materialize columns. The function:

Materializes n-columns columns from the keys table
Skips any columns already materialized
Orders the materialized results by frequency, descending
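
As a sketch of a call that supplies all three arguments, the following statement asks the function to materialize up to 10 columns using a named keys table. The names twitter_r and twitter_r_keys are illustrative and match the example that follows:

=> -- Assumption: twitter_r_keys was populated by COMPUTE_FLEXTABLE_KEYS
=> SELECT MATERIALIZE_FLEXTABLE_COLUMNS('twitter_r', 10, 'twitter_r_keys');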

Examples

The following example shows how to call MATERIALIZE_FLEXTABLE_COLUMNS to materialize columns. First, load a sample file of tweets (tweets_10000.json) into the flex table twitter_r. After loading data and computing keys for the sample flex table, call MATERIALIZE_FLEXTABLE_COLUMNS to materialize the first four columns:

=> COPY twitter_r FROM '/home/release/KData/tweets_10000.json' parser fjsonparser();
 Rows Loaded
-------------
       10000
(1 row)

=> SELECT compute_flextable_keys ('twitter_r');
              compute_flextable_keys
---------------------------------------------------
 Please see public.twitter_r_keys for updated keys
(1 row)

=> SELECT MATERIALIZE_FLEXTABLE_COLUMNS('twitter_r', 4);
    MATERIALIZE_FLEXTABLE_COLUMNS
-------------------------------------------------------------------------------
 The following columns were added to the table public.twitter_r:
        contributors
        entities.hashtags
        entities.urls
For more details, run the following query:
SELECT * FROM v_catalog.materialize_flextable_columns_results WHERE table_schema = 'public' and table_name = 'twitter_r';

(1 row)

The last message in the example recommends querying the MATERIALIZE_FLEXTABLE_COLUMNS_RESULTS system table for the results of materializing the columns, as shown:

=> SELECT * FROM v_catalog.materialize_flextable_columns_results WHERE table_schema = 'public' and table_name = 'twitter_r';
table_id           | table_schema | table_name |      creation_time           |     key_name      | status |    message
-------------------+--------------+------------+------------------------------+-------------------+--------+---------------------
 45035996273733172 | public       | twitter_r  | 2013-11-20 17:00:27.945484-05| contributors      | ADDED  | Added successfully
 45035996273733172 | public       | twitter_r  | 2013-11-20 17:00:27.94551-05 | entities.hashtags | ADDED  | Added successfully
 45035996273733172 | public       | twitter_r  | 2013-11-20 17:00:27.945519-05| entities.urls     | ADDED  | Added successfully
 45035996273733172 | public       | twitter_r  | 2013-11-20 17:00:27.945532-05| created_at        | EXISTS | Column of same name already
(4 rows)

14.5 - RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW

Restores the keys table and the view. The function also links the keys table with its associated flex table, in cases where either table is dropped. The function also indicates whether it restored one or both objects.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW ('flex-table')

Arguments

flex-table: Name of a flex table

Examples

This example shows how to invoke this function with an existing flex table, restoring both the keys table and view:

=> SELECT RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW('darkdata');
                     RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW
----------------------------------------------------------------------------------
The keys table public.darkdata_keys was restored successfully.
The view public.darkdata_view was restored successfully.
(1 row)

This example illustrates that the function restored darkdata_view, but that darkdata_keys did not need restoring:

=> SELECT RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW('darkdata');
                    RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW
------------------------------------------------------------------------------------
 The keys table public.darkdata_keys already exists and is linked to darkdata.
 The view public.darkdata_view was restored successfully.
(1 row)

A restored keys table has no content. To populate the flex keys, call the COMPUTE_FLEXTABLE_KEYS function.

=> SELECT * FROM darkdata_keys;
 key_name | frequency | data_type_guess
----------+-----------+-----------------
(0 rows)
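
To populate the restored keys table, you might recompute the keys and query the table again. This sketch reuses the darkdata flex table from the examples above and omits output:

=> SELECT COMPUTE_FLEXTABLE_KEYS('darkdata');
=> SELECT * FROM darkdata_keys;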

15 - Hadoop functions

This section contains functions to manage interactions with Hadoop.

15.1 - CLEAR_HDFS_CACHES

Clears the configuration information copied from HDFS and any cached connections.

This function affects reads using the hdfs scheme in the following ways:

This function flushes information loaded from configuration files copied from Hadoop (such as core-site.xml). These files are found on the path set by the HadoopConfDir configuration parameter.
This function flushes information about which NameNode is active in a High Availability (HA) Hadoop cluster. Therefore, the first request to Hadoop after calling this function is slower than usual.

Vertica maintains a cache of open connections to NameNodes to reduce latency. This function flushes that cache.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_HDFS_CACHES ( )

Privileges

Superuser

Examples

The following example clears the Hadoop configuration information:

=> SELECT CLEAR_HDFS_CACHES();
 CLEAR_HDFS_CACHES
--------------
 Cleared
(1 row)

15.2 - EXTERNAL_CONFIG_CHECK

Tests the Hadoop configuration of a Vertica cluster. This function tests HDFS configuration files, HCatalog Connector configuration, and Kerberos configuration.

This function calls other configuration-check functions. If you call it with an argument, it passes the argument to the functions it calls that also take an argument.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

EXTERNAL_CONFIG_CHECK( ['what_to_test' ] )

Arguments

what_to_test: A string specifying the authorities, nameservices, and/or HCatalog schemas to test. The format is a comma-separated list of "key=value" pairs, where keys are "authority", "nameservice", and "schema". The value is passed to all of the sub-functions; see those reference pages for details on how values are interpreted.

Privileges

This function does not require privileges.

Examples

The following example tests the configuration of only the nameservice named "ns1". Output has been omitted due to length.

=> SELECT EXTERNAL_CONFIG_CHECK('nameservice=ns1');
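
The argument can also hold more than one "key=value" pair. The following sketch tests one nameservice and one HCatalog schema; the names ns1 and hcat1 are placeholders, not values from a real cluster:

=> -- Assumption: ns1 and hcat1 are defined in your configuration
=> SELECT EXTERNAL_CONFIG_CHECK('nameservice=ns1, schema=hcat1');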

15.3 - GET_METADATA

Returns the metadata of a Parquet file. Metadata includes the number and sizes of row groups, column names, and information about chunks and compression. Metadata is returned as JSON.

This function inspects one file. Parquet data usually spans many files in a single directory; choose one. The function does not accept a directory name as an argument.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_METADATA( 'filename' )

Arguments

filename: The name of a Parquet file. Any path that is valid for COPY is valid for this function. This function does not operate on files in other formats.

Privileges

Superuser, or non-superuser with READ privileges on the USER-accessible storage location (see GRANT (storage location)).

Examples

You must call this function with a single file, not a directory or glob:

=> SELECT GET_METADATA('/data/emp-row.parquet');
                GET_METADATA
----------------------------------------------------------------------------------------------------
 schema:
required group field_id=-1 spark_schema {
  optional int32 field_id=-1 employeeID;
  optional group field_id=-1 personal {
    optional binary field_id=-1 name (String);
    optional group field_id=-1 address {
      optional binary field_id=-1 street (String);
      optional binary field_id=-1 city (String);
      optional int32 field_id=-1 zipcode;
    }
    optional int32 field_id=-1 taxID;
  }
  optional binary field_id=-1 department (String);
}

 data page version:
  data page v1

 metadata:
{
  "FileName": "/data/emp-row.parquet",
  "FileFormat": "Parquet",
  "Version": "1.0",
  "CreatedBy": "parquet-mr version 1.10.1 (build a89df8f9932b6ef6633d06069e50c9b7970bebd1)",
  "TotalRows": "4",
  "NumberOfRowGroups": "1",
  "NumberOfRealColumns": "3",
  "NumberOfColumns": "7",
  "Columns": [
     { "Id": "0", "Name": "employeeID", "PhysicalType": "INT32", "ConvertedType": "NONE", "LogicalType": {"Type": "None"} },
     { "Id": "1", "Name": "personal.name", "PhysicalType": "BYTE_ARRAY", "ConvertedType": "UTF8", "LogicalType": {"Type": "String"} },
     { "Id": "2", "Name": "personal.address.street", "PhysicalType": "BYTE_ARRAY", "ConvertedType": "UTF8", "LogicalType": {"Type": "String"} },
     { "Id": "3", "Name": "personal.address.city", "PhysicalType": "BYTE_ARRAY", "ConvertedType": "UTF8", "LogicalType": {"Type": "String"} },
     { "Id": "4", "Name": "personal.address.zipcode", "PhysicalType": "INT32", "ConvertedType": "NONE", "LogicalType": {"Type": "None"} },
     { "Id": "5", "Name": "personal.taxID", "PhysicalType": "INT32", "ConvertedType": "NONE", "LogicalType": {"Type": "None"} },
     { "Id": "6", "Name": "department", "PhysicalType": "BYTE_ARRAY", "ConvertedType": "UTF8", "LogicalType": {"Type": "String"} }
  ],
  "RowGroups": [
     {
       "Id": "0",  "TotalBytes": "642",  "TotalCompressedBytes": "0",  "Rows": "4",
       "ColumnChunks": [
          {"Id": "0", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "51513", "Min": "17103" },
           "Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "67", "CompressedSize": "69" },
          {"Id": "1", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "Sheldon Cooper", "Min": "Howard Wolowitz" },
           "Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "142", "CompressedSize": "145" },
          {"Id": "2", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "52 Broad St", "Min": "100 Main St Apt 4A" },
           "Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "139", "CompressedSize": "123" },
          {"Id": "3", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "Pasadena", "Min": "Pasadena" },
           "Compression": "SNAPPY", "Encodings": "RLE PLAIN_DICTIONARY BIT_PACKED ", "UncompressedSize": "95", "CompressedSize": "99" },
          {"Id": "4", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "91021", "Min": "91001" },
           "Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "68", "CompressedSize": "70" },
          {"Id": "5", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "4", "DistinctValues": "0", "Max": "0", "Min": "0" },
           "Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "28", "CompressedSize": "30" },
          {"Id": "6", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "Physics", "Min": "Astronomy" },
           "Compression": "SNAPPY", "Encodings": "RLE PLAIN_DICTIONARY BIT_PACKED ", "UncompressedSize": "103", "CompressedSize": "107" }
        ]
     }
  ]
}

(1 row)

15.4 - HADOOP_IMPERSONATION_CONFIG_CHECK

Reports the delegation tokens Vertica will use when accessing Kerberized data in HDFS. The HadoopImpersonationConfig configuration parameter specifies one or more authorities, nameservices, and HCatalog schemas and their associated tokens. For each tested value, the function reports what doAs user or delegation token Vertica will use for access. Use this function to confirm that you have defined your delegation tokens as you intended.

You can call this function with an argument to specify the authority, nameservice, or HCatalog schema to test, or without arguments to test all configured values.

This function does not check that you can use these delegation tokens to access HDFS.

See Proxy users and delegation tokens for more about impersonation.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

HADOOP_IMPERSONATION_CONFIG_CHECK( ['what_to_test' ] )

Arguments

what_to_test: A string specifying the authorities, nameservices, and/or HCatalog schemas to test. For example, a value of 'nameservice=ns1' means the function tests only access to the nameservice "ns1" and ignores any other authorities and schemas. A value of 'nameservice=ns1, schema=hcat1' means the function tests one nameservice and one HCatalog schema.
If you do not specify this argument, the function tests all authorities, nameservices, and schemas defined in HadoopImpersonationConfig.

Privileges

This function does not require privileges.

Examples

Consider the following definition of HadoopImpersonationConfig:

[{
        "nameservice": "ns1",
        "token": "RANDOM-TOKEN-STRING"
    },
    {
        "nameservice": "*",
        "doAs": "Paul"
    },
    {
        "schema": "hcat1",
        "doAs": "Fred"
    }
]

The following query tests only the "ns1" name service:

=> SELECT HADOOP_IMPERSONATION_CONFIG_CHECK('nameservice=ns1');

-- hadoop_impersonation_config_check --
Connections to nameservice [ns1] will use a delegation token with hash [b3dd9e71cd695d91]

This function returns a hash of the token for security reasons. You can call HASH_EXTERNAL_TOKEN with the expected value and compare that hash to the one in this function's output.

A query with no argument tests all values:

=> SELECT HADOOP_IMPERSONATION_CONFIG_CHECK();

-- hadoop_impersonation_config_check --
Connections to nameservice [ns1] will use a delegation token with hash [b3dd9e71cd695d91]
JDBC connections for HCatalog schema [hcat1] will doAs [Fred]
[!] hadoop_impersonation_config_check : [PASS]

15.5 - HASH_EXTERNAL_TOKEN

Returns a hash of a string token, for use with HADOOP_IMPERSONATION_CONFIG_CHECK. Call HASH_EXTERNAL_TOKEN with the delegation token you expect Vertica to use and compare it to the hash in the output of HADOOP_IMPERSONATION_CONFIG_CHECK.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

HASH_EXTERNAL_TOKEN( 'token' )

Arguments

token: A string specifying the token to hash. The token is configured in the HadoopImpersonationConfig parameter.

Privileges

This function does not require privileges.

Examples

The following query tests the expected value shown in the example on the HADOOP_IMPERSONATION_CONFIG_CHECK reference page.

=> SELECT HASH_EXTERNAL_TOKEN('RANDOM-TOKEN-STRING');
hash_external_token
---------------------
b3dd9e71cd695d91
(1 row)

15.6 - HCATALOGCONNECTOR_CONFIG_CHECK

Tests the configuration of a Vertica cluster that uses the HCatalog Connector to access Hive data. The function first verifies that the HCatalog Connector is properly installed and reports on the values of several related configuration parameters. It then tests the connection using HiveServer2. This function does not support the WebHCat server.

If you specify an HCatalog schema, and if you have defined a delegation token for that schema, this function uses the delegation token. Otherwise, the function uses the default endpoint without a delegation token.

See Proxy users and delegation tokens for more about delegation tokens.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

HCATALOGCONNECTOR_CONFIG_CHECK( ['what_to_test' ] )

Arguments

what_to_test: A string specifying the HCatalog schemas to test. For example, a value of 'schema=hcat1' means the function tests only the "hcat1" schema and ignores any others that are found.

Privileges

This function does not require privileges.

Examples

The following query tests with the default endpoint and no delegation token.

=> SELECT HCATALOGCONNECTOR_CONFIG_CHECK();

-- hcatalogconnector_config_check --

    HCatalogConnectorUseHiveServer2 : [1]
    EnableHCatImpersonation : [1]
    HCatalogConnectorUseORCReader : [1]
    HCatalogConnectorUseParquetReader : [1]
    HCatalogConnectorUseTxtReader : [0]
  [INFO] Vertica is not configured to use its internal parsers for delimited files.
  [INFO] This is off by default, but will be changed in a future release.
    HCatalogConnectorUseLibHDFSPP : [1]

  [OK] HCatalog connector library is properly installed.
  [INFO] Creating JDBC connection as session user.
  [OK] Successful JDBC connection to HiveServer2 as user [USER].

  [!] hcatalogconnector_config_check : [PASS]

To test with the configured delegation token, pass the schema as an argument:

=> SELECT HCATALOGCONNECTOR_CONFIG_CHECK('schema=hcat1');

15.7 - HDFS_CLUSTER_CONFIG_CHECK

Tests the configuration of a Vertica cluster that uses HDFS. The function scans the Hadoop configuration files found in HadoopConfDir and performs configuration checks on each cluster it finds. If you have more than one cluster configured, you can specify which one to test instead of testing all of them.

For each Hadoop cluster, it reports properties including:

Nameservice name and associated NameNodes
High-availability status
RPC encryption status
Kerberos authentication status
HTTP(S) status

It then tests connections using the http(s), hdfs, and webhdfs URL schemes. It tests the latter two as both the Vertica user and the session user.

See Configuring HDFS access for information about configuration files and HadoopConfDir.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

HDFS_CLUSTER_CONFIG_CHECK( ['what_to_test' ] )

Arguments

what_to_test: A string specifying the authorities or nameservices to test. For example, a value of 'nameservice=ns1' means the function tests only the "ns1" cluster. If you specify both an authority and a nameservice, the authority must be a NameNode in the specified nameservice for the check to pass.
If you do not specify this argument, the function tests all cluster configurations found in HadoopConfDir.

Privileges

This function does not require privileges.

Examples

The following example tests all clusters.

=> SELECT HDFS_CLUSTER_CONFIG_CHECK();

-- hdfs_cluster_config_check --

    Hadoop Conf Path : [/conf/hadoop_conf]
  [OK] HadoopConfDir verified on all nodes
    Connection Timeout (seconds) : [60]
    Token Refresh Frequency (seconds) : [0]
    HadoopFSBlockSizeBytes (MiB) : [64]

  [OK] Found [1] hadoop cluster configurations

------------- Cluster 1 -------------
    Is DefaultFS : [true]
    Nameservice : [vmns]
    Namenodes : [node1.example.com:8020, node2.example.com:8020]
    High Availability : [true]
    RPC Encryption : [false]
    Kerberos Authentication : [true]
    HTTPS Only : [false]
  [INFO] Checking connections to [hdfs:///]
    vertica : [OK]
    dbuser : [OK]

  [INFO] Checking connections to [http://node1.example.com:50070]
  [INFO] Node is in standby
  [INFO] Checking connections to [http://node2.example.com:50070]
  [OK] Can make authenticated external curl connection
  [INFO] Checking webhdfs
    vertica : [OK]
    USER : [OK]

  [!] hdfs_cluster_config_check : [PASS]
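
To limit the check to a single cluster, pass a what_to_test argument. In this sketch the nameservice name vmns is taken from the output above; substitute your own. Output is omitted:

=> -- Assumption: vmns is a nameservice defined in HadoopConfDir
=> SELECT HDFS_CLUSTER_CONFIG_CHECK('nameservice=vmns');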

15.8 - KERBEROS_HDFS_CONFIG_CHECK

Deprecated

This function is deprecated and will be removed in a future release. Instead, use EXTERNAL_CONFIG_CHECK.

Tests the Kerberos configuration of a Vertica cluster that uses HDFS. The function succeeds if it can use both the Vertica keytab file and the session user to access HDFS, and reports errors otherwise. This function is a more specific version of KERBEROS_CONFIG_CHECK.

If the current session is not Kerberized, this function will not be able to use secured HDFS connections and will fail.

You can call this function with arguments to specify an HDFS configuration to test, or without arguments. If you call it with no arguments, this function reads the HDFS configuration files and fails if it does not find them. See Configuring HDFS access. If it finds configuration files, it tests all configured nameservices.

The function performs the following tests, in order:

Are Kerberos services available?
Does a keytab file exist and are the Kerberos and HDFS configuration parameters set in the database?
Can Vertica read and invoke kinit with the keys to authenticate to HDFS and obtain the database Kerberos ticket?
Can Vertica perform hdfs and webhdfs operations using both the database Kerberos ticket and user-forwardable tickets for the current session?
Can Vertica connect to HiveServer2? (This function does not support WebHCat.)

If any test fails, the function returns a descriptive error message.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

KERBEROS_HDFS_CONFIG_CHECK( ['hdfsHost:hdfsPort',
  'webhdfsHost:webhdfsPort', 'webhcatHost' ] )

Arguments

hdfsHost, hdfsPort: The hostname or IP address and port of the HDFS NameNode. Vertica uses this server to access data that is specified with hdfs URLs. If the value is ' ', the function skips this part of the check.
webhdfsHost, webhdfsPort: The hostname or IP address and port of the WebHDFS server. Vertica uses this server to access data that is specified with webhdfs URLs. If the value is ' ', the function skips this part of the check.
webhcatHost: Pass any value in this position. WebHCat is deprecated and this value is ignored but must be present.

Privileges

This function does not require privileges.

15.9 - SYNC_WITH_HCATALOG_SCHEMA

Copies the structure of a Hive database schema available through the HCatalog Connector to a Vertica schema. If the HCatalog schema and the target Vertica schema have matching table names, SYNC_WITH_HCATALOG_SCHEMA overwrites the Vertica tables.

This function can synchronize the HCatalog schema directly. In this case, call it with the same schema name for the vertica_schema and hcatalog_schema parameters. The function can also synchronize a different schema to the HCatalog schema.

If you change the settings of HCatalog Connector configuration parameters, you must call this function again.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SYNC_WITH_HCATALOG_SCHEMA( vertica_schema, hcatalog_schema [, drop_non_existent] )

Parameters

vertica_schema: The target Vertica schema to store the copied HCatalog schema's metadata. This can be the same schema as hcatalog_schema, or it can be a separate one created with CREATE SCHEMA.

Caution
Do not use the Vertica schema to store other data.
hcatalog_schema: The HCatalog schema to copy, created with CREATE HCATALOG SCHEMA.
drop_non_existent: If true, drop any tables in vertica_schema that do not correspond to a table in hcatalog_schema.

Privileges

Non-superuser: CREATE privileges on vertica_schema.

Users also require access to Hive data, one of the following:

USAGE permissions on hcatalog_schema, if Hive does not use an authorization service to manage access.
Permission through an authorization service (Sentry or Ranger), and access to the underlying files in HDFS. (Sentry can provide that access through ACL synchronization.)
dbadmin user privileges, with or without an authorization service.

Data type matching

Hive STRING and BINARY data types are matched, in Vertica, to the VARCHAR(65000) and VARBINARY(65000) types. Adjust the data types with ALTER TABLE as needed after creating the schema. The maximum size of a VARCHAR or VARBINARY in Vertica is 65000, but you can use LONG VARCHAR and LONG VARBINARY to specify larger values.

Hive and Vertica define string length in different ways. In Hive the length is the number of characters; in Vertica it is the number of bytes. Thus, a character encoding that uses more than one byte, such as Unicode, can cause mismatches between the two. To avoid data truncation, set values in Vertica based on bytes, not characters.
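
To see the difference between the two measures, you can compare the character count and byte count of a sample string. This sketch uses the LENGTH and OCTET_LENGTH string functions; the literal is arbitrary and assumes a UTF-8 client encoding:

=> -- LENGTH counts characters; OCTET_LENGTH counts bytes
=> SELECT LENGTH('café') AS characters, OCTET_LENGTH('café') AS bytes;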

If data size exceeds the column size, Vertica logs an event at read time in the QUERY_EVENTS system table.

Examples

The following example uses SYNC_WITH_HCATALOG_SCHEMA to synchronize an HCatalog schema named hcat:

=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' HCATALOG_SCHEMA='default'
   HCATALOG_USER='hcatuser';
CREATE SCHEMA
=> SELECT sync_with_hcatalog_schema('hcat', 'hcat');
sync_with_hcatalog_schema
----------------------------------------
Schema hcat synchronized with hcat
tables in hcat = 56
tables altered in hcat = 0
tables created in hcat = 56
stale tables in hcat = 0
table changes erred in hcat = 0
(1 row)

=> -- Use vsql's \d command to describe a table in the synced schema

=> \d hcat.messages
List of Fields by Tables
  Schema   |   Table  | Column  |      Type      | Size  | Default | Not Null | Primary Key | Foreign Key
-----------+----------+---------+----------------+-------+---------+----------+-------------+-------------
hcat       | messages | id      | int            |     8 |         | f        | f           |
hcat       | messages | userid  | varchar(65000) | 65000 |         | f        | f           |
hcat       | messages | "time"  | varchar(65000) | 65000 |         | f        | f           |
hcat       | messages | message | varchar(65000) | 65000 |         | f        | f           |
(4 rows)

The following example uses SYNC_WITH_HCATALOG_SCHEMA followed by ALTER TABLE to adjust column data types:

=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' HCATALOG_SCHEMA='default'
-> HCATALOG_USER='hcatuser';
CREATE SCHEMA
=> SELECT sync_with_hcatalog_schema('hcat', 'hcat');
...
=> ALTER TABLE hcat.t ALTER COLUMN a1 SET DATA TYPE long varchar(1000000);
=> ALTER TABLE hcat.t ALTER COLUMN a2 SET DATA TYPE long varbinary(1000000);

The following example uses SYNC_WITH_HCATALOG_SCHEMA with a local (non-HCatalog) schema:

=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' HCATALOG_SCHEMA='default'
-> HCATALOG_USER='hcatuser';
CREATE SCHEMA
=> CREATE SCHEMA hcat_local;
CREATE SCHEMA
=> SELECT sync_with_hcatalog_schema('hcat_local', 'hcat');

15.10 - SYNC_WITH_HCATALOG_SCHEMA_TABLE

Copies the structure of a single table in a Hive database schema available through the HCatalog Connector to a Vertica table.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SYNC_WITH_HCATALOG_SCHEMA_TABLE( vertica_schema, hcatalog_schema, table_name )

Parameters

vertica_schema: The existing Vertica schema to store the copied HCatalog schema's metadata. This can be the same schema as hcatalog_schema, or it can be a separate one created with CREATE SCHEMA.
hcatalog_schema: The HCatalog schema to copy, created with CREATE HCATALOG SCHEMA.
table_name: The table in hcatalog_schema to copy. If table_name already exists in vertica_schema, the function overwrites it.

Privileges

Non-superuser: CREATE privileges on vertica_schema.

Users also require access to Hive data, one of the following:

USAGE permissions on hcatalog_schema, if Hive does not use an authorization service to manage access.
Permission through an authorization service (Sentry or Ranger), and access to the underlying files in HDFS. (Sentry can provide that access through ACL synchronization.)
dbadmin user privileges, with or without an authorization service.

Data type matching

If data size exceeds the column size, Vertica logs an event at read time in the QUERY_EVENTS system table.

Examples

The following example uses SYNC_WITH_HCATALOG_SCHEMA_TABLE to synchronize the "nation" table:

=> CREATE SCHEMA hcat_local;
CREATE SCHEMA

=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' HCATALOG_SCHEMA='hcat'
   HCATALOG_USER='hcatuser';
CREATE SCHEMA

=> SELECT sync_with_hcatalog_schema_table('hcat_local', 'hcat', 'nation');
sync_with_hcatalog_schema_table
-----------------------------------------------------------------------------
    Schema hcat_local synchronized with hcat for table nation
    table nation is created in schema hcat_local
    (1 row)

The following example shows the behavior if the "nation" table already exists in the local schema:

=> SELECT sync_with_hcatalog_schema_table('hcat_local','hcat','nation');
sync_with_hcatalog_schema_table
-----------------------------------------------------------------------------
    Schema hcat_local synchronized with hcat for table nation
    table nation is altered in schema hcat_local
    (1 row)

15.11 - VERIFY_HADOOP_CONF_DIR

Verifies that the Hadoop configuration that is used to access HDFS is valid on all Vertica nodes. The configuration is valid if:

all required configuration files are found on the path defined by the HadoopConfDir configuration parameter
all properties needed by Vertica are set in those files

This function does not attempt to validate the settings of those properties; it only verifies that they have values.

The Hadoop configuration can be valid on some nodes and invalid on others. The function reports a validation failure if the configuration is invalid on any node; the rest of the output reports the details for each node.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

VERIFY_HADOOP_CONF_DIR( )

Parameters

This function has no parameters.

Privileges

This function does not require privileges.

Examples

The following example shows the results when the Hadoop configuration is valid.

=> SELECT VERIFY_HADOOP_CONF_DIR();
    verify_hadoop_conf_dir
-------------------------------------------------------------------
Validation Success
v_vmart_node0001: HadoopConfDir [PG_TESTOUT/config] is valid
v_vmart_node0002: HadoopConfDir [PG_TESTOUT/config] is valid
v_vmart_node0003: HadoopConfDir [PG_TESTOUT/config] is valid
v_vmart_node0004: HadoopConfDir [PG_TESTOUT/config] is valid
    (1 row)

In the following example, the Hadoop configuration is valid on one node, but on other nodes a needed value is missing.

=> SELECT VERIFY_HADOOP_CONF_DIR();
    verify_hadoop_conf_dir
-------------------------------------------------------------------
Validation Failure
v_vmart_node0001: HadoopConfDir [PG_TESTOUT/test_configs/config] is valid
v_vmart_node0002: No fs.defaultFS parameter found in config files in [PG_TESTOUT/config]
v_vmart_node0003: No fs.defaultFS parameter found in config files in [PG_TESTOUT/config]
v_vmart_node0004: No fs.defaultFS parameter found in config files in [PG_TESTOUT/config]
    (1 row)
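
The report returned by VERIFY_HADOOP_CONF_DIR can also be checked by a script. The following Python sketch is illustrative only. It assumes the output layout shown in the examples above (one "node_name: message" line per node) and that a valid node reports a message ending in "is valid". The name failing_nodes is hypothetical and is not part of Vertica.

```python
# Sample report copied from the failure example above.
SAMPLE_FAILURE = """\
    verify_hadoop_conf_dir
-------------------------------------------------------------------
Validation Failure
v_vmart_node0001: HadoopConfDir [PG_TESTOUT/test_configs/config] is valid
v_vmart_node0002: No fs.defaultFS parameter found in config files in [PG_TESTOUT/config]
v_vmart_node0003: No fs.defaultFS parameter found in config files in [PG_TESTOUT/config]
v_vmart_node0004: No fs.defaultFS parameter found in config files in [PG_TESTOUT/config]
    (1 row)
"""


def failing_nodes(report):
    """Map node name -> message for each node whose configuration is not valid.

    Assumption: node lines have the form "<node_name>: <message>", and a
    valid node's message ends with "is valid", as in the examples above.
    """
    failures = {}
    for line in report.splitlines():
        line = line.strip()
        if ": " not in line:
            continue  # header, separator, summary, or row-count line
        node, message = line.split(": ", 1)
        if not message.endswith("is valid"):
            failures[node] = message
    return failures


for node, message in failing_nodes(SAMPLE_FAILURE).items():
    print(node, "->", message)
```

An empty result means that every node line reported a valid configuration.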

16 - LDAP link functions

This section contains the functions associated with the Vertica LDAP Link service.

16.1 - LDAP_LINK_DRYRUN_CONNECT

Takes a set of LDAP Link connection parameters as arguments and begins a dry run connection between the LDAP server and Vertica.

By providing an empty string for the LDAPLinkBindPswd argument, you can also perform an anonymous bind if your LDAP server allows unauthenticated binds.

The LDAP Link dry run functions and LDAP_LINK_SYNC_START must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:

=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
    node_name     | dbclerk
------------------+---------
 v_vmart_node0001 | t
(1 row)

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

LDAP_LINK_DRYRUN_CONNECT (
    'LDAPLinkURL',
    'LDAPLinkBindDN',
    'LDAPLinkBindPswd'
)

Privileges

Superuser

Examples

This tests the connection to an LDAP server at ldap://example.dc.com with the DN CN=amir,OU=QA,DC=dc,DC=com.

=> SELECT LDAP_LINK_DRYRUN_CONNECT('ldap://example.dc.com','CN=amir,OU=QA,DC=dc,DC=com','password');

                ldap_link_dryrun_connect
---------------------------------------------------------------------------------
Dry Run Connect Completed. Query v_monitor.ldap_link_dryrun_events for results.

To check the results of the bind, query the system table LDAP_LINK_DRYRUN_EVENTS.

=> SELECT event_timestamp, event_type, entry_name, link_scope, search_base from LDAP_LINK_DRYRUN_EVENTS;
        event_timestamp       |       event_type      |      entry_name      | link_scope | search_base
------------------------------+-----------------------+----------------------+------------+-------------
2019-12-09 15:41:43.589398-05 | BIND_STARTED          | -------------------- | ---------- | -----------
2019-12-09 15:41:43.590504-05 | BIND_FINISHED         | -------------------- | ---------- | -----------

16.2 - LDAP_LINK_DRYRUN_SEARCH

Takes a set of LDAP Link connection and search parameters as arguments and begins a dry run search for users and groups that would get imported from the LDAP server.

By providing an empty string for the LDAPLinkBindPswd argument, you can also perform an anonymous search if your LDAP server's Access Control List (ACL) is configured to allow unauthenticated searches. The settings for allowing anonymous binds are different from the ACL settings for allowing anonymous searches.

The LDAP Link dry run functions and LDAP_LINK_SYNC_START must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:

=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
    node_name     | dbclerk
------------------+---------
 v_vmart_node0001 | t
(1 row)

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

LDAP_LINK_DRYRUN_SEARCH (
    'LDAPLinkURL',
    'LDAPLinkBindDN',
    'LDAPLinkBindPswd',
    'LDAPLinkSearchBase',
    'LDAPLinkScope',
    'LDAPLinkFilterUser',
    'LDAPLinkFilterGroup',
    'LDAPLinkUserName',
    'LDAPLinkGroupName',
    'LDAPLinkGroupMembers',
    [LDAPLinkSearchTimeout],
    ['LDAPLinkJoinAttr']
)

Privileges

Superuser

Examples

This example searches for users and groups in the LDAP server. The LDAPLinkSearchBase argument specifies the dc.com domain, and the LDAPLinkScope argument specifies a sub scope, which replicates the entire subtree under the DN.

To further filter results, the function checks for users and groups with the person and group objectClass attributes. It then searches the group attribute cn, identifying members of that group with the member attribute, and then identifying those individual users with the attribute uid.

=> SELECT LDAP_LINK_DRYRUN_SEARCH('ldap://example.dc.com','CN=amir,OU=QA,DC=dc,DC=com','$vertica$','dc=DC,dc=com','sub',
'(objectClass=person)','(objectClass=group)','uid','cn','member',10,'dn');

                ldap_link_dryrun_search
--------------------------------------------------------------------------------
Dry Run Search Completed. Query v_monitor.ldap_link_dryrun_events for results.

To check the results of the search, query the system table LDAP_LINK_DRYRUN_EVENTS.

=> SELECT event_timestamp, event_type, entry_name, ldapurihash, link_scope, search_base from LDAP_LINK_DRYRUN_EVENTS;
        event_timestamp          |    event_type    |       entry_name       | ldapurihash | link_scope | search_base
---------------------------------+------------------+------------------------+-------------+------------+--------------
2020-01-03 21:03:26.411753+05:30 | BIND_STARTED     | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:26.422188+05:30 | BIND_FINISHED    | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:26.422223+05:30 | SYNC_STARTED     | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:26.422229+05:30 | SEARCH_STARTED   | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.043107+05:30 | LDAP_GROUP_FOUND | Account Operators      |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.04312+05:30  | LDAP_GROUP_FOUND | Administrators         |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.043182+05:30 | LDAP_USER_FOUND  | user1                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.043186+05:30 | LDAP_USER_FOUND  | user2                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.04319+05:30  | SEARCH_FINISHED  | **********             |           0 | sub        | dc=DC,dc=com
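
Because LDAP_LINK_DRYRUN_SEARCH takes up to twelve positional arguments, it is easy to pass them in the wrong order. The following Python sketch builds the statement text from named values. It is a hedged illustration, not a Vertica client API: the helper names sql_literal and dryrun_search_sql are hypothetical, the argument order is taken from the Syntax section above, and the sketch assumes that a join attribute is passed only together with a timeout.

```python
# Required arguments, in the positional order given in the Syntax section.
PARAM_ORDER = [
    "LDAPLinkURL", "LDAPLinkBindDN", "LDAPLinkBindPswd", "LDAPLinkSearchBase",
    "LDAPLinkScope", "LDAPLinkFilterUser", "LDAPLinkFilterGroup",
    "LDAPLinkUserName", "LDAPLinkGroupName", "LDAPLinkGroupMembers",
]


def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def dryrun_search_sql(params, timeout=None, join_attr=None):
    """Return the SELECT statement text for a dry run search (hypothetical helper)."""
    missing = [name for name in PARAM_ORDER if name not in params]
    if missing:
        raise ValueError("missing arguments: " + ", ".join(missing))
    args = [sql_literal(params[name]) for name in PARAM_ORDER]
    if timeout is not None:
        args.append(str(int(timeout)))  # the timeout is an unquoted integer
    if join_attr is not None:
        # Assumption: the join attribute follows the timeout positionally.
        if timeout is None:
            raise ValueError("join_attr is only emitted together with a timeout")
        args.append(sql_literal(join_attr))
    return "SELECT LDAP_LINK_DRYRUN_SEARCH(" + ", ".join(args) + ");"


# Values copied from the example above.
EXAMPLE_SQL = dryrun_search_sql(
    {
        "LDAPLinkURL": "ldap://example.dc.com",
        "LDAPLinkBindDN": "CN=amir,OU=QA,DC=dc,DC=com",
        "LDAPLinkBindPswd": "$vertica$",
        "LDAPLinkSearchBase": "dc=DC,dc=com",
        "LDAPLinkScope": "sub",
        "LDAPLinkFilterUser": "(objectClass=person)",
        "LDAPLinkFilterGroup": "(objectClass=group)",
        "LDAPLinkUserName": "uid",
        "LDAPLinkGroupName": "cn",
        "LDAPLinkGroupMembers": "member",
    },
    timeout=10,
    join_attr="dn",
)
print(EXAMPLE_SQL)
```

The generated text still has to be run from the clerk node by a superuser, as described above.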

16.3 - LDAP_LINK_DRYRUN_SYNC

Takes a set of LDAP Link connection and search parameters as arguments and begins a dry run synchronization between the database and the LDAP server, which maps and synchronizes the LDAP server's users and groups with their equivalents in Vertica.

The LDAP Link dry run functions and LDAP_LINK_SYNC_START must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:

=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
    node_name     | dbclerk
------------------+---------
 v_vmart_node0001 | t
(1 row)

You can view the results of the dry run in the system table LDAP_LINK_DRYRUN_EVENTS.

To cancel an in-progress synchronization, use LDAP_LINK_SYNC_CANCEL.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

LDAP_LINK_DRYRUN_SYNC (
    'LDAPLinkURL',
    'LDAPLinkBindDN',
    'LDAPLinkBindPswd',
    'LDAPLinkSearchBase',
    'LDAPLinkScope',
    'LDAPLinkFilterUser',
    'LDAPLinkFilterGroup',
    'LDAPLinkUserName',
    'LDAPLinkGroupName',
    'LDAPLinkGroupMembers',
    [LDAPLinkSearchTimeout],
    ['LDAPLinkJoinAttr']
)

Privileges

Superuser

Examples

To perform a dry run to map the users and groups returned from LDAP_LINK_DRYRUN_SEARCH, pass the same parameters as arguments to LDAP_LINK_DRYRUN_SYNC.

=> SELECT LDAP_LINK_DRYRUN_SYNC('ldap://example.dc.com','CN=amir,OU=QA,DC=dc,DC=com','$vertica$','dc=DC,dc=com','sub',
'(objectClass=person)','(objectClass=group)','uid','cn','member',10,'dn');

                          LDAP_LINK_DRYRUN_SYNC
------------------------------------------------------------------------------------------
Dry Run Connect and Sync Completed. Query v_monitor.ldap_link_dryrun_events for results.

To check the results of the sync, query the system table LDAP_LINK_DRYRUN_EVENTS.

=> SELECT event_timestamp, event_type, entry_name, ldapurihash, link_scope, search_base from LDAP_LINK_DRYRUN_EVENTS;
        event_timestamp          |     event_type      |       entry_name       | ldapurihash | link_scope | search_base
---------------------------------+---------------------+------------------------+-------------+------------+--------------
2020-01-03 21:08:30.883783+05:30 | BIND_STARTED        | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:30.890574+05:30 | BIND_FINISHED       | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:30.890602+05:30 | SYNC_STARTED        | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:30.890605+05:30 | SEARCH_STARTED      | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939369+05:30 | LDAP_GROUP_FOUND    | Account Operators      |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939395+05:30 | LDAP_GROUP_FOUND    | Administrators         |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939461+05:30 | LDAP_USER_FOUND     | user1                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939463+05:30 | LDAP_USER_FOUND     | user2                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939468+05:30 | SEARCH_FINISHED     | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939718+05:30 | PROCESSING_STARTED  | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939887+05:30 | USER_CREATED        | user1                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939895+05:30 | USER_CREATED        | user2                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939949+05:30 | ROLE_CREATED        | Account Operators      |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939959+05:30 | ROLE_CREATED        | Administrators         |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.940603+05:30 | PROCESSING_FINISHED | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.940613+05:30 | SYNC_FINISHED       | ---------------------- |           0 | sub        | dc=DC,dc=com

16.4 - LDAP_LINK_SYNC_CANCEL

Cancels in-progress LDAP Link synchronizations (including those started by LDAP_LINK_DRYRUN_SYNC) between the LDAP server and Vertica.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ldap_link_sync_cancel()

Privileges

Superuser

Examples

=> SELECT ldap_link_sync_cancel();

16.5 - LDAP_LINK_SYNC_START

Begins the synchronization between the LDAP server and Vertica immediately rather than waiting for the interval set in LDAPLinkInterval.

The LDAP Link dry run functions and LDAP_LINK_SYNC_START must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:

=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
    node_name     | dbclerk
------------------+---------
 v_vmart_node0001 | t
(1 row)

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ldap_link_sync_start()

Privileges

Superuser

Examples

=> SELECT ldap_link_sync_start();

17 - License management functions

This section contains functions that monitor Vertica license status and compliance.

17.1 - AUDIT

Returns the raw data size (in bytes) of a database, schema, or table as it is counted in an audit of the database size. Unless you specify zero error tolerance and 100 percent confidence level, AUDIT returns only approximate results that can vary over multiple iterations.

AUDIT estimates the size of data in Vertica tables using the same data sampling method that Vertica uses to determine whether a database complies with the licensed database size allowance. However, Vertica does not use the results of AUDIT itself to determine whether the size of the database complies with the Vertica license's data allowance. For details, see Auditing database size.

For data stored in external tables based on ORC or Parquet format, AUDIT uses the total size of the data files. This value is never estimated—it is read from the file system storing the ORC or Parquet files (either the Vertica node's local file system, S3, or HDFS).

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

AUDIT('[[[database.]schema.]scope ]'[, 'granularity'] [, error-tolerance[, confidence-level]] )

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

scope

Specifies the extent of the audit:

Empty string ('') audits the entire database.
The name of the schema or table to audit.

granularity

The level at which the audit reports its results, one of the following strings:

database
schema
table

The level of granularity must be equal to or less than the granularity of scope. If you omit this parameter, granularity is set to the same level as scope. Thus, if online_sales is a schema, the following statements are identical:

AUDIT('online_sales', 'schema');
AUDIT('online_sales');

If AUDIT sets granularity to a level lower than the target object, it returns with a message that refers you to system table USER_AUDITS. For details, see Querying V_CATALOG.USER_AUDITS, below.
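
The granularity rules above can be summarized in a few lines of Python. This is a sketch of the documented rules only, not of how Vertica implements them; the function name resolve_granularity is hypothetical.

```python
# Levels ordered from finest to coarsest, as listed for the granularity parameter.
LEVELS = ["table", "schema", "database"]


def resolve_granularity(scope_level, granularity=None):
    """Return (effective_granularity, results_in_user_audits).

    scope_level is the level of the audited object ("database", "schema",
    or "table"). If granularity is omitted, it defaults to the scope's level.
    """
    if granularity is None:
        granularity = scope_level
    if LEVELS.index(granularity) > LEVELS.index(scope_level):
        raise ValueError("granularity must be equal to or less than the scope level")
    # A granularity lower than the scope means the per-object sizes are
    # reported through system table USER_AUDITS instead of the return value.
    in_user_audits = LEVELS.index(granularity) < LEVELS.index(scope_level)
    return granularity, in_user_audits


print(resolve_granularity("schema"))           # granularity defaults to the scope level
print(resolve_granularity("schema", "table"))  # per-table sizes go to USER_AUDITS
```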

error-tolerance

Specifies the percentage margin of error allowed in the audit estimate. Enter the tolerance value as a decimal number, between 0 and 100. The default value is 5, for a 5% margin of error.

This argument has no effect on audits of external tables based on ORC or Parquet files. Audits of these tables always return the actual size of the underlying data files.

Setting this value to 0 results in a full database audit, which is very resource intensive, as AUDIT analyzes the entire database. A full database audit significantly impacts performance, so Vertica does not recommend it for a production database.

Caution

Due to the iterative sampling that the auditing process uses, setting the error tolerance to a small fraction of a percent (for example, 0.00001) can cause AUDIT to run for a longer period than a full database audit. The lower you specify this value, the more resources the audit uses, as it performs more data sampling.

confidence-level

Specifies the statistical confidence level percentage of the estimate. Enter the confidence value as a decimal number, between 0 and 100. The default value is 99, indicating a confidence level of 99%.

This argument has no effect on audits of external tables based on ORC or Parquet files. Audits of these tables always return the actual size of the underlying data files.

The higher the confidence value, the more resources the function uses, as it performs more data sampling. Setting this value to 100 results in a full audit of the database, which is very resource intensive, as the function analyzes all of the database. A full database audit significantly impacts performance, so Vertica does not recommend it for a production database.

Privileges

Superuser, or the following privileges:

SELECT privilege on the target tables
USAGE privilege on the target schemas

Note

If you audit a schema or the database, Vertica only returns the size of all objects that you have privileges to access within the audited object, as described above.

Querying V_CATALOG.USER_AUDITS

If AUDIT sets granularity to a level lower than the target object, it returns with a message that refers you to system table USER_AUDITS. To obtain audit data on objects of the specified granularity, query this table. For example, the following query seeks to audit all tables in the store schema:

=> SELECT AUDIT('store', 'table');
                           AUDIT
-----------------------------------------------------------
 See table sizes in v_catalog.user_audits for schema store
(1 row)

The next query queries USER_AUDITS and obtains the latest audits on those tables:


=> SELECT object_name, AVG(size_bytes)::int size_bytes, MAX(audit_start_timestamp::date) audit_start
      FROM user_audits WHERE object_schema='store'
      GROUP BY rollup(object_name) HAVING GROUPING_ID(object_name) < 1 ORDER BY GROUPING_ID();
    object_name    | size_bytes | audit_start
-------------------+------------+-------------
 store_dimension   |      22067 | 2017-10-26
 store_orders_fact |   27201312 | 2017-10-26
 store_sales_fact  |  301260170 | 2017-10-26
(3 rows)

Examples

See Auditing database size.

17.2 - AUDIT_FLEX

Returns the estimated ROS size of __raw__ columns, equivalent to the export size of the flex data in the audited objects. You can audit all flex data in the database, or narrow the audit scope to a specific flex table, projection, or schema. Vertica stores the audit results in system table USER_AUDITS.

The audit excludes the following:

Flex keys
Other columns in the audited tables
Temporary flex tables

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

AUDIT_FLEX ('[scope]')

Parameters

scope

Specifies the extent of the audit:

Empty string ('') audits all flexible tables in the database.
The name of a schema, projection, or flex table.

Privileges

Superuser, or the following privileges:

SELECT privilege on the target tables
USAGE privilege on the target schemas

Note

If you audit a schema or the database, Vertica only returns the size of all objects that you have privileges to access within the audited object, as described above.

Examples

Audit all flex tables in the current database:

dbs=> select audit_flex('');
 audit_flex
------------
 8567679
(1 row)

Audit the flex tables in schema public:

dbs=> select audit_flex('public');
 audit_flex
------------
 8567679
(1 row)

Audit the flex data in projection bakery_b0:

dbs=> select audit_flex('bakery_b0');
 audit_flex
------------
 8566723
(1 row)

Audit flex table bakery:

dbs=> select audit_flex('bakery');
 audit_flex
------------
 8566723
(1 row)

To report the results of all saved audits, query system table USER_AUDITS. The following shows part of an expanded display of that table, including an audit run on a schema named test and an audit of the entire database, dbs:

dbs=> \x
Expanded display is on.

dbs=> select * from user_audits;
-[ RECORD 1 ]-------------------------+------------------------------
size_bytes                            | 0
user_id                               | 45035996273704962
user_name                             | release
object_id                             | 45035996273736664
object_type                           | SCHEMA
object_schema                         |
object_name                           | test
audit_start_timestamp                 | 2014-02-04 14:52:15.126592-05
audit_end_timestamp                   | 2014-02-04 14:52:15.139475-05
confidence_level_percent              | 99
error_tolerance_percent               | 5
used_sampling                         | f
confidence_interval_lower_bound_bytes | 0
confidence_interval_upper_bound_bytes | 0
sample_count                          | 0
cell_count                            | 0
-[ RECORD 2 ]-------------------------+------------------------------
size_bytes                            | 38051
user_id                               | 45035996273704962
user_name                             | release
object_id                             | 45035996273704974
object_type                           | DATABASE
object_schema                         |
object_name                           | dbs
audit_start_timestamp                 | 2014-02-05 13:44:41.11926-05
audit_end_timestamp                   | 2014-02-05 13:44:41.227035-05
confidence_level_percent              | 99
error_tolerance_percent               | 5
used_sampling                         | f
confidence_interval_lower_bound_bytes | 38051
confidence_interval_upper_bound_bytes | 38051
sample_count                          | 0
cell_count                            | 0
-[ RECORD 3 ]-------------------------+------------------------------
...

17.3 - AUDIT_LICENSE_SIZE

Triggers an immediate audit of the database size to determine if it is in compliance with the raw data storage allowance included in your Vertica licenses.

If you use ORC or Parquet data stored in HDFS, results are only accurate if you run this function as a user who has access to all HDFS data. Either run the query with a principal that has read access to all such data, or use a Hadoop delegation token that grants this access. For more information about using delegation tokens, see Accessing kerberized HDFS data.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

AUDIT_LICENSE_SIZE()

Privileges

Superuser

Examples

=> SELECT audit_license_size();
 audit_license_size
--------------------
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization  : 0%
Audit Time   : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.

License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
(1 row)

17.4 - AUDIT_LICENSE_TERM

Triggers an immediate audit to determine if the Vertica license has expired.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

AUDIT_LICENSE_TERM()

Privileges

Superuser

Examples

=> SELECT audit_license_term();
 audit_license_term
--------------------
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization  : 0%
Audit Time   : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.

License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
(1 row)

17.5 - DISPLAY_LICENSE

Returns the terms of your Vertica license. The information this function displays is:

The start and end dates for which the license is valid (or "Perpetual" if the license has no expiration).
The number of days you are allowed to use Vertica after your license term expires (the grace period).
The amount of data your database can store, if your license includes a data allowance.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DISPLAY_LICENSE()

Privileges

None

Examples

=> SELECT DISPLAY_LICENSE();
                  DISPLAY_LICENSE
---------------------------------------------------
 Vertica Systems, Inc.
2007-08-03
Perpetual
500GB

(1 row)

17.6 - GET_AUDIT_TIME

Reports the time when the automatic audit of database size occurs. Vertica performs this audit if your Vertica license includes a data size allowance. For details of this audit, see Managing licenses in the Administrator's Guide. To change the time the audit runs, use the SET_AUDIT_TIME function.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_AUDIT_TIME()

Privileges

None

Examples

=> SELECT get_audit_time();
get_audit_time
-----------------------------------------------------
 The audit is scheduled to run at 11:59 PM each day.
(1 row)

17.7 - GET_COMPLIANCE_STATUS

Displays whether your database is in compliance with your Vertica license agreement. This information includes the results of Vertica's most recent audit of the database size (if your license has a data allowance as part of its terms), the license term (if your license has an end date), and the number of nodes (if your license has a node limit).

GET_COMPLIANCE_STATUS measures data allowance by TBs (where a TB equals 1024⁴ bytes).
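
As a quick illustration of this unit, the following Python sketch converts a byte count to TB and computes a utilization percentage against a licensed allowance. The function names are hypothetical, and the calculation is only an arithmetic illustration of the 1024⁴ definition; it is not Vertica's audit logic.

```python
BYTES_PER_TB = 1024 ** 4  # 1,099,511,627,776 bytes


def to_tb(size_bytes):
    """Convert a size in bytes to TB, where 1 TB = 1024**4 bytes."""
    return size_bytes / BYTES_PER_TB


def utilization_percent(raw_bytes, license_tb):
    """Percentage of a licensed allowance (in TB) used by raw_bytes of data."""
    return 100.0 * to_tb(raw_bytes) / license_tb


# Example: 5 TB of raw data against a 10 TB allowance.
print(utilization_percent(5 * BYTES_PER_TB, 10.0))
```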

The information displayed by GET_COMPLIANCE_STATUS includes:

The estimated size of the database (see Auditing database size for an explanation of the size estimate).
The raw data size allowed by your Vertica license.
The percentage of your allowance that your database is currently using.
The number of nodes and license limit.
The date and time of the last audit.
Whether your database complies with the data allowance terms of your license agreement.
The end date of your license.
How many days remain until your license expires.

Note

If your license does not have a data allowance, end date, or node limit, some of the values might not appear in the output for GET_COMPLIANCE_STATUS.

If the audit shows your license is not in compliance with your data allowance, you should either delete data to bring the size of the database under the licensed amount, or upgrade your license. If your license term has expired, you should contact Vertica immediately to renew your license. See Managing licenses for further details.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_COMPLIANCE_STATUS()

Privileges

None

Examples

=> SELECT GET_COMPLIANCE_STATUS();
 get_compliance_status
--------------------
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization  : 0%
Audit Time   : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.

License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
(1 row)

The following example shows output for a Vertica for SQL on Apache Hadoop cluster.

=> SELECT GET_COMPLIANCE_STATUS();
 get_compliance_status
--------------------
Node count : 4
License Node limit : 5
No size-compliance concerns for an Unlimited license

No expiration date for a Perpetual license
(1 row)
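
If you need to act on this report in a script, you can split it into fields. The following Python sketch is a hedged example: it assumes the "label : value" layout shown in the two examples above, which might differ in other Vertica versions, and the function name parse_compliance_status is hypothetical.

```python
import re

# Report text copied from the first example above.
SAMPLE_STATUS = """\
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization  : 0%
Audit Time   : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.

License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
"""


def parse_compliance_status(text):
    """Split a compliance report into a dict of label -> value strings.

    Lines without a "label: value" shape (headers, separators, free-text
    notes) are skipped. "Days Remaining" shares a line with
    "License End Date", so it is extracted separately.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            fields[key.strip()] = value.strip()
    match = re.search(r"Days Remaining:\s*([\d.]+)", text)
    if match:
        fields["Days Remaining"] = match.group(1)
        end_date = fields.get("License End Date", "")
        fields["License End Date"] = end_date.split("Days Remaining")[0].strip()
    return fields


status = parse_compliance_status(SAMPLE_STATUS)
print(status["Utilization"], status["Days Remaining"])
```

Because some values are omitted when a license has no data allowance, end date, or node limit, code that reads the resulting dictionary should treat every key as optional.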

17.8 - SET_AUDIT_TIME

Sets the time at which Vertica performs the automatic database size audit, which determines whether the size of the database complies with the raw data allowance in your Vertica license. Use this function if audits are currently scheduled to occur during your database's peak activity time. This is normally not a concern, because the automatic audit has little impact on database performance.

Audits are scheduled by the preceding audit, so changing the audit time does not affect the next scheduled audit. For example, if your next audit is scheduled to take place at 11:59PM and you use SET_AUDIT_TIME to change the audit schedule to 3AM, the previously scheduled 11:59PM audit still runs. As that audit finishes, it schedules the next audit to occur at 3AM.

Vertica always performs the next scheduled audit at its original time, even if you change the audit time with SET_AUDIT_TIME and then trigger an audit by calling AUDIT_LICENSE_SIZE. Only after that scheduled audit completes does Vertica begin auditing at the new time.
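
The scheduling behavior can be modeled with a short Python sketch. This is a simplified model under stated assumptions, not Vertica's scheduler: it assumes one audit per day, treats each audit as finishing instantly, and uses the hypothetical function name next_audits.

```python
from datetime import datetime, time, timedelta


def next_occurrence(after, at):
    """First datetime strictly after `after` whose clock time is `at`."""
    candidate = datetime.combine(after.date(), at)
    if candidate <= after:
        candidate += timedelta(days=1)
    return candidate


def next_audits(now, scheduled, new_time, count=3):
    """Return the next `count` audit datetimes after `now`.

    The first audit still runs at `scheduled`, the time that was already
    queued. Each later audit is scheduled by its predecessor at `new_time`.
    """
    audits = [next_occurrence(now, scheduled)]
    while len(audits) < count:
        audits.append(next_occurrence(audits[-1], new_time))
    return audits


# The audit time is changed from 11:59 PM to 3:00 AM at noon on 2024-01-01.
EXAMPLE = next_audits(datetime(2024, 1, 1, 12, 0), time(23, 59), time(3, 0))
for audit in EXAMPLE:
    print(audit)
```

In this model, the first audit still runs at 11:59 PM on the day of the change, and only the audits that follow it run at 3:00 AM.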

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_AUDIT_TIME(time)

Parameters

time: A string containing the time in 'HH:MM AM/PM' format (for example, '1:00 AM') when the audit should run daily.

Privileges

Superuser

Examples

=> SELECT SET_AUDIT_TIME('3:00 AM');
                            SET_AUDIT_TIME
-----------------------------------------------------------------------
 The scheduled audit time will be set to 3:00 AM after the next audit.
(1 row)

18 - Multiple active result sets functions

This section contains the functions associated with the Vertica library for Multiple Active Result Sets (MARS).

18.1 - CLOSE_ALL_RESULTSETS

Closes all result set sessions within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SELECT CLOSE_ALL_RESULTSETS ('session_id')

Parameters

session_id: A string that specifies the Multiple Active Result Sets session.

Privileges

None; however, without superuser privileges, you can only close your own session's results.

Examples

This example shows how you can view a MARS result set, then close the result set, and then confirm that the result set has been closed.

Query the MARS storage table. One session ID is open and three result sets appear in the output.

=> SELECT * FROM SESSION_MARS_STORE;

    node_name     |            session_id             | user_name | resultset_id | row_count | remaining_row_count | bytes_used
------------------+-----------------------------------+-----------+--------------+-----------+---------------------+------------
 v_vmart_node0001 | server1.company.-83046:1y28gu9    | dbadmin   |            7 |    777460 |              776460 |   89692848
 v_vmart_node0001 | server1.company.-83046:1y28gu9    | dbadmin   |            8 |    324349 |              323349 |   81862010
 v_vmart_node0001 | server1.company.-83046:1y28gu9    | dbadmin   |            9 |    277947 |              276947 |   32978280
(3 rows)

Close all result sets for session server1.company.-83046:1y28gu9:

=> SELECT CLOSE_ALL_RESULTSETS('server1.company.-83046:1y28gu9');
             close_all_resultsets
-------------------------------------------------------------
 Closing all result sets from server1.company.-83046:1y28gu9
(1 row)

Query the MARS storage table again for the current status. You can see that the session and result sets have been closed:

=> SELECT * FROM SESSION_MARS_STORE;

    node_name     |            session_id             | user_name | resultset_id | row_count | remaining_row_count | bytes_used
------------------+-----------------------------------+-----------+--------------+-----------+---------------------+------------
(0 rows)

18.2 - CLOSE_RESULTSET

Closes a specific result set within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SELECT CLOSE_RESULTSET ('session_id', ResultSetID)

Parameters

session_id: A string that specifies the Multiple Active Result Sets session containing the ResultSetID to close.
ResultSetID: An integer that specifies which result set to close.

Privileges

None; however, without superuser privileges, you can only close your own session's results.

Examples

This example first queries the MARS storage table. One session ID is currently open, and one result set appears in the output.

=> SELECT * FROM SESSION_MARS_STORE;
    node_name     |            session_id             | user_name | resultset_id | row_count | remaining_row_count | bytes_used
------------------+-----------------------------------+-----------+--------------+-----------+---------------------+------------
 v_vmart_node0001 | server1.company.-83046:1y28gu9    | dbadmin   |            1 |    318718 |              312718 |   80441904
(1 row)

Close result set 1 in session server1.company.-83046:1y28gu9:

=> SELECT CLOSE_RESULTSET('server1.company.-83046:1y28gu9', 1);
            close_resultset
-------------------------------------------------------------
 Closing result set 1 from server1.company.-83046:1y28gu9
(1 row)

Query the MARS storage table again for current status. You can see that result set 1 is now closed:

=> SELECT * FROM SESSION_MARS_STORE;

    node_name     |            session_id             | user_name | resultset_id | row_count | remaining_row_count | bytes_used
------------------+-----------------------------------+-----------+--------------+-----------+---------------------+------------
(0 rows)

19 - Partition management functions

This section contains partition management functions specific to Vertica.

19.1 - CALENDAR_HIERARCHY_DAY

Groups DATE partition keys into a hierarchy of years, months, and days. The Vertica Tuple Mover regularly evaluates partition keys against the current date, and merges partitions as needed into the appropriate year and month partition groups.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CALENDAR_HIERARCHY_DAY( partition-expression[, active-months[, active-years] ] )

Parameters

partition-expression

The DATE expression on which to group partition keys, which must be identical to the table's PARTITION BY expression.

active-months

An integer ≥ 0 that specifies how many months preceding MONTH(CURRENT_DATE) to store unique partition keys in separate partitions.

If you specify 1, only partition keys of the current month are stored in separate partitions.

If you specify 0, all partition keys of the current month are merged into a partition group for that month.

For details, see Hierarchical partitioning.

Default: 2

active-years

An integer ≥ 0 that specifies how many years preceding YEAR(CURRENT_DATE) to group partition keys by month in separate partition groups.

If you specify 1, only partition keys of the current year are stored in month partition groups.

If you specify 0, all partition keys of the current and previous years are merged into year partition groups.

For details, see Hierarchical partitioning.

Default: 2

Important

The CALENDAR_HIERARCHY_DAY algorithm assumes that most table activity is focused on recent dates. Setting active-years and active-months to a low number ≥ 2 serves to isolate merge activity to date-specific containers, and incurs minimal overhead. Vertica recommends that you use the default setting of 2 for active-years and active-months. For most users, these settings achieve an optimal balance between ROS storage and performance.

Usage

Specify this function in a table partition clause, as its GROUP BY expression:

PARTITION BY partition-expression
  GROUP BY CALENDAR_HIERARCHY_DAY(
     group-expression
      [, active-months[, active-years] ] )

For example:

=> CREATE TABLE public.store_orders
(
    order_no int,
    order_date timestamp NOT NULL,
    shipper varchar(20),
    ship_date date
);
...
=> ALTER TABLE public.store_orders
      PARTITION BY order_date::DATE
      GROUP BY CALENDAR_HIERARCHY_DAY(order_date::DATE, 3, 2) REORGANIZE;

For details on usage, see Hierarchical partitioning.
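
As a rough sketch of the grouping logic described under Parameters, the call in the example above is assumed to behave approximately like the following explicit CASE expression. The expression is written out here for illustration only; check it against Hierarchical partitioning before relying on it.

```sql
-- Assumed explicit equivalent of CALENDAR_HIERARCHY_DAY(order_date::DATE, 3, 2)
=> ALTER TABLE public.store_orders
      PARTITION BY order_date::DATE
      GROUP BY (
        CASE
          -- Keys at least 2 years old (active-years) are grouped by year.
          WHEN DATEDIFF('YEAR', order_date::DATE, NOW()::TIMESTAMPTZ(6)) >= 2
            THEN DATE_TRUNC('YEAR', order_date::DATE)
          -- Keys at least 3 months old (active-months) are grouped by month.
          WHEN DATEDIFF('MONTH', order_date::DATE, NOW()::TIMESTAMPTZ(6)) >= 3
            THEN DATE_TRUNC('MONTH', order_date::DATE)
          -- More recent keys stay in their own per-day partitions.
          ELSE DATE_TRUNC('DAY', order_date::DATE)
        END) REORGANIZE;
```

Read this way, setting active-months to 0 sends every key, including those of the current month, to a month or year group, which matches the parameter description.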

19.2 - COPY_PARTITIONS_TO_TABLE

Copies partitions from one table to another. This lightweight partition copy increases performance by initially sharing the same storage between two tables. After the copy operation is complete, the tables are independent of each other. Users can perform operations on one table without impacting the other. These operations can increase the overall storage required for both tables.

Note

Although they share storage space, Vertica considers the partitions as discrete objects for license capacity purposes. For example, copying a one TB partition would only consume one TB of space. Your Vertica license, however, considers them as separate objects consuming two TB of space.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

COPY_PARTITIONS_TO_TABLE (
    '[[database.]schema.]source-table',
    'min-range-value',
    'max-range-value',
    '[[database.]schema.]target-table'
     [, 'force-split']
)

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

source-table

The source table of the partitions to copy.

min-range-value max-range-value

The minimum and maximum value of partition keys to copy, where min-range-value must be ≤ max-range-value. To copy one partition, min-range-value and max-range-value must be equal.

target-table

The target table of the partitions to copy. If the table does not exist, Vertica creates a table from the source table's definition, by calling CREATE TABLE with the LIKE and INCLUDING PROJECTIONS clauses. The new table inherits ownership from the source table. For details, see Replicating a table.

force-split

Optional Boolean argument that specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:

true: Split ROS containers as needed.
false (default): Return with an error if ROS containers must be split to implement this operation.

Privileges

Non-superuser, one of the following:

Owner of source and target tables
TRUNCATE (if force-split is true) and SELECT on the source table, INSERT on the target table

If the target table does not exist, you must also have CREATE privileges on the target schema to enable table creation.

Table attribute requirements

The following attributes of both tables must be identical:

Column definitions, including NULL/NOT NULL constraints
Segmentation
Partition clause
Number of projections
Projection sort order
Primary and unique key constraints. However, the key constraints do not have to be identically enabled. For more information on constraints, see Constraints.

Note
If the target table has primary or unique key constraints enabled and copying or moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation.
Check constraints. For MOVE_PARTITIONS_TO_TABLE and COPY_PARTITIONS_TO_TABLE, Vertica enforces enabled check constraints on the target table only. For SWAP_PARTITIONS_BETWEEN_TABLES, Vertica enforces enabled check constraints on both tables. If there is a violation of an enabled check constraint, Vertica rolls back the operation.
Number and definitions of text indices.

Additionally, if access policies exist on the source table, the following must be true:

Access policies on both tables must be identical.
One of the following must be true:
- The executing user owns the source table.
- AccessPolicyManagementSuperuserOnly is set to true. See Managing access policies for details.

Table restrictions

The following restrictions apply to the source and target tables:

If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
The following tables cannot be used as sources or targets:
- Temporary tables
- Virtual tables
- System tables
- External tables

Examples

If you call COPY_PARTITIONS_TO_TABLE and the target table does not exist, the function creates the table automatically. In the following example, the target table partn_backup.trades_200801 does not exist. COPY_PARTITIONS_TO_TABLE creates the table and replicates the partition. Vertica also copies all the constraints associated with the source table except foreign key constraints.

=> SELECT COPY_PARTITIONS_TO_TABLE (
          'prod_trades',
          '200801',
          '200801',
          'partn_backup.trades_200801');
COPY_PARTITIONS_TO_TABLE
-------------------------------------------------
 1 distinct partition values copied at epoch 15.
(1 row)
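
The force-split argument can be illustrated with a range copy. This is a minimal sketch: the target table name partn_backup.trades_2008q1 and the key range are hypothetical, and the output is omitted because it depends on the data.

```sql
-- Copy partition keys 200801 through 200803. Passing 'true' as force-split
-- lets Vertica split ROS containers if the range covers only part of one.
=> SELECT COPY_PARTITIONS_TO_TABLE (
          'prod_trades',
          '200801',
          '200803',
          'partn_backup.trades_2008q1',  -- hypothetical target table
          'true');
```

With force-split omitted or false, the same call returns an error if any ROS container would have to be split.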

19.3 - DROP_PARTITIONS

Note

This function supersedes meta-function DROP_PARTITION, which was deprecated in Vertica 9.0.

Drops the specified table partition keys.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DROP_PARTITIONS (
    '[[database.]schema.]table-name',
    'min-range-value',
    'max-range-value'
    [, 'force-split']
)

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

table-name

The target table. The table cannot be used as a dimension table in a pre-join projection and cannot have out-of-date (unrefreshed) projections.

min-range-value max-range-value

The minimum and maximum value of partition keys to drop, where min-range-value must be ≤ max-range-value. To drop one partition key, min-range-value and max-range-value must be equal.

force-split

Optional Boolean argument that specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:

true: Split ROS containers as needed.
false (default): Return with an error if ROS containers must be split to implement this operation.

Note

In rare cases, DROP_PARTITIONS executes at the same time as a mergeout operation on the same ROS container. As a result, the function cannot split the container as specified and returns with an error. When this happens, call DROP_PARTITIONS again.

Privileges

One of the following:

DBADMIN
Table owner
USAGE privileges on the table schema and TRUNCATE privileges on the table

Examples

See Dropping partitions.
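
As a minimal sketch of the syntax above, the following calls assume a table prod_trades that is partitioned on keys such as 200801. The table name and key values are illustrative only, and the output is omitted.

```sql
-- Drop a single partition key: min-range-value equals max-range-value.
=> SELECT DROP_PARTITIONS('prod_trades', '200801', '200801');

-- Drop a range of keys, allowing Vertica to split ROS containers
-- if the range spans only part of a container.
=> SELECT DROP_PARTITIONS('prod_trades', '200802', '200806', 'true');
```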

19.4 - DUMP_PROJECTION_PARTITION_KEYS

Dumps the partition keys of the specified projection.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DUMP_PROJECTION_PARTITION_KEYS( '[[database.]schema.]projection-name')

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection-name: Projection name

Privileges

Non-superuser: TRUNCATE on anchor table

Examples

The following statements create the table and projection online_sales.online_sales_fact and online_sales.online_sales_fact_rep, respectively, and partition table data by the column call_center_key:

=> CREATE TABLE online_sales.online_sales_fact
(
    sale_date_key int NOT NULL,
    ship_date_key int NOT NULL,
    product_key int NOT NULL,
    product_version int NOT NULL,
    customer_key int NOT NULL,
    call_center_key int NOT NULL,
    online_page_key int NOT NULL,
    shipping_key int NOT NULL,
    warehouse_key int NOT NULL,
    promotion_key int NOT NULL,
    pos_transaction_number int NOT NULL,
    sales_quantity int,
    sales_dollar_amount float,
    ship_dollar_amount float,
    net_dollar_amount float,
    cost_dollar_amount float,
    gross_profit_dollar_amount float,
    transaction_type varchar(16)
)
PARTITION BY (online_sales_fact.call_center_key);

=> CREATE PROJECTION online_sales.online_sales_fact_rep AS SELECT * from online_sales.online_sales_fact unsegmented all nodes;

The following DUMP_PROJECTION_PARTITION_KEYS statement dumps the partition keys from the projection online_sales.online_sales_fact_rep:

=> SELECT DUMP_PROJECTION_PARTITION_KEYS('online_sales.online_sales_fact_rep');

Partition keys on node v_vmart_node0001
  Projection 'online_sales_fact_rep'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 200
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 199
   ...
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 2
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 1

 Partition keys on node v_vmart_node0002
  Projection 'online_sales_fact_rep'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 200
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 199
...
(1 row)

19.5 - DUMP_TABLE_PARTITION_KEYS

Dumps the partition keys of all projections for the specified table.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DUMP_TABLE_PARTITION_KEYS ( '[[database.]schema.]table-name' )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: Name of the table

Privileges

Non-superuser: TRUNCATE on table

Examples

The following example creates a simple table called states and partitions the data by state:

=> CREATE TABLE states (year INTEGER NOT NULL,
       state VARCHAR NOT NULL)
       PARTITION BY state;
=> CREATE PROJECTION states_p (state, year) AS
       SELECT * FROM states
       ORDER BY state, year UNSEGMENTED ALL NODES;

Now dump the partition keys of all projections anchored on table states:

=> SELECT DUMP_TABLE_PARTITION_KEYS( 'states' );
                         DUMP_TABLE_PARTITION_KEYS
--------------------------------------------------------------------------------------------
 Partition keys on node v_vmart_node0001
  Projection 'states_p'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: VT
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: PA
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: NY
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: MA

 Partition keys on node v_vmart_node0002
...
(1 row)

19.6 - MOVE_PARTITIONS_TO_TABLE

Moves partitions from one table to another.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

MOVE_PARTITIONS_TO_TABLE (
    '[[database.]schema.]source-table',
    'min-range-value',
    'max-range-value',
    '[[database.]schema.]target-table'
     [, force-split]
)

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

source-table

The source table of the partitions to move.

min-range-value max-range-value

The minimum and maximum value of partition keys to move, where min-range-value must be ≤ max-range-value. To move one partition, min-range-value and max-range-value must be equal.

target-table

The target table of the partitions to move. If the table does not exist, Vertica creates a table from the source table's definition, by calling CREATE TABLE with the LIKE and INCLUDING PROJECTIONS clauses. The new table inherits ownership from the source table. For details, see Replicating a table.

force-split

Optional Boolean argument that specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:

true: Split ROS containers as needed.
false (default): Return with an error if ROS containers must be split to implement this operation.

Privileges

Non-superuser, one of the following:

Owner of source and target tables
SELECT, TRUNCATE on the source table, INSERT on the target table

If the target table does not exist, you must also have CREATE privileges on the target schema to enable table creation.

Table attribute requirements

The following attributes of both tables must be identical:

Column definitions, including NULL/NOT NULL constraints
Segmentation
Partition clause
Number of projections
Projection sort order
Primary and unique key constraints. However, the key constraints do not have to be identically enabled. For more information on constraints, see Constraints.

Note
If the target table has primary or unique key constraints enabled and copying or moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation.
Check constraints. For MOVE_PARTITIONS_TO_TABLE and COPY_PARTITIONS_TO_TABLE, Vertica enforces enabled check constraints on the target table only. For SWAP_PARTITIONS_BETWEEN_TABLES, Vertica enforces enabled check constraints on both tables. If there is a violation of an enabled check constraint, Vertica rolls back the operation.
Number and definitions of text indices.

Additionally, if access policies exist on the source table, the following must be true:

Access policies on both tables must be identical.
One of the following must be true:
- The executing user owns the source table.
- AccessPolicyManagementSuperuserOnly is set to true. See Managing access policies for details.

Table restrictions

The following restrictions apply to the source and target tables:

If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
The following tables cannot be used as sources or targets:
- Temporary tables
- Virtual tables
- System tables
- External tables

Examples

See Archiving partitions.
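
As a minimal sketch of the syntax above, the following call reuses the table names from the COPY_PARTITIONS_TO_TABLE example. It assumes that prod_trades is partitioned on keys such as 200801. The output is omitted.

```sql
-- Move the single partition key 200801 out of prod_trades into an archive
-- table. If partn_backup.trades_200801 does not exist, Vertica creates it
-- from the source table's definition.
=> SELECT MOVE_PARTITIONS_TO_TABLE (
          'prod_trades',
          '200801',
          '200801',
          'partn_backup.trades_200801');
```

Unlike a copy, a move removes the partition from the source table, which is why the non-owner privilege set includes TRUNCATE on the source table.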

19.7 - PARTITION_PROJECTION

Splits ROS containers for a specified projection. PARTITION_PROJECTION also purges data while partitioning ROS containers if deletes were applied before the AHM epoch.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

PARTITION_PROJECTION ( '[[database.]schema.]projection')

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: The projection to partition.

Privileges

Table owner
USAGE privilege on schema

Examples

In this example, PARTITION_PROJECTION forces a split of ROS containers on the states_p projection:

=> SELECT PARTITION_PROJECTION ('states_p');
  PARTITION_PROJECTION
------------------------
 Projection partitioned
(1 row)

19.8 - PARTITION_TABLE

Invokes the Tuple Mover to reorganize ROS storage containers as needed to conform with the current partitioning policy.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

PARTITION_TABLE ( '[[database.]schema.]table-name')

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: The table to partition.

Privileges

Table owner
USAGE privilege on schema

Restrictions

You cannot run PARTITION_TABLE on a table that is an anchor table for a live aggregate projection or a Top-K projection.
To reorganize storage to conform to a new policy, run PARTITION_TABLE after changing the partition GROUP BY expression.
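
The second point can be sketched as follows. This illustration assumes the public.store_orders table from the CALENDAR_HIERARCHY_DAY example and changes its GROUP BY expression without the REORGANIZE keyword, so existing storage is left in its old layout until PARTITION_TABLE runs. The output is omitted.

```sql
-- Change the partition grouping policy, leaving existing storage as-is.
=> ALTER TABLE public.store_orders
      PARTITION BY order_date::DATE
      GROUP BY CALENDAR_HIERARCHY_DAY(order_date::DATE, 2, 2);

-- Ask the Tuple Mover to reorganize ROS containers to match the new policy.
=> SELECT PARTITION_TABLE('public.store_orders');
```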

19.9 - PURGE_PARTITION

Purges a table partition of deleted rows. Similar to PURGE and PURGE_PROJECTION, this function removes deleted data from physical storage so you can reuse the disk space. PURGE_PARTITION removes data only from the AHM epoch and earlier.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

PURGE_PARTITION ( '[[database.]schema.]table', partition-key )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: The partitioned table to purge.
partition-key: The key of the partition to purge.

Privileges

Table owner
USAGE privilege on schema

Examples

The following example lists the count of deleted rows for each partition in a table, then calls PURGE_PARTITION() to purge the deleted rows from the data.

=> SELECT partition_key,table_schema,projection_name,sum(deleted_row_count)
   AS deleted_row_count FROM partitions
   GROUP BY partition_key,table_schema,projection_name
   ORDER BY partition_key;

 partition_key | table_schema | projection_name | deleted_row_count
---------------+--------------+-----------------+-------------------
 0             | public       | t_super         |                 2
 1             | public       | t_super         |                 2
 2             | public       | t_super         |                 2
 3             | public       | t_super         |                 2
 4             | public       | t_super         |                 2
 5             | public       | t_super         |                 2
 6             | public       | t_super         |                 2
 7             | public       | t_super         |                 2
 8             | public       | t_super         |                 2
 9             | public       | t_super         |                 1
(10 rows)
=> SELECT PURGE_PARTITION('t',5); -- Purge partition with key 5.
                            purge_partition
------------------------------------------------------------------------
 Task: merge partitions
(Table: public.t) (Projection: public.t_super)
(1 row)

=> SELECT partition_key,table_schema,projection_name,sum(deleted_row_count)
   AS deleted_row_count FROM partitions
   GROUP BY partition_key,table_schema,projection_name
   ORDER BY partition_key;


 partition_key | table_schema | projection_name | deleted_row_count
---------------+--------------+-----------------+-------------------
 0             | public       | t_super         |                 2
 1             | public       | t_super         |                 2
 2             | public       | t_super         |                 2
 3             | public       | t_super         |                 2
 4             | public       | t_super         |                 2
 5             | public       | t_super         |                 0
 6             | public       | t_super         |                 2
 7             | public       | t_super         |                 2
 8             | public       | t_super         |                 2
 9             | public       | t_super         |                 1
(10 rows)

19.10 - SWAP_PARTITIONS_BETWEEN_TABLES

Swaps partitions between two tables.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SWAP_PARTITIONS_BETWEEN_TABLES (
    '[[database.]schema.]staging-table',
    'min-range-value',
    'max-range-value',
    '[[database.]schema.]target-table'
     [, force-split]
)

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

staging-table

The staging table from which to swap partitions.

min-range-value max-range-value

The minimum and maximum value of partition keys to swap, where min-range-value must be ≤ max-range-value. To swap one partition, min-range-value and max-range-value must be equal.

target-table

The table to which the partitions are to be swapped. The target table cannot be the same as the staging table.

force-split

Optional Boolean argument that specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:

true: Split ROS containers as needed.
false (default): Return with an error if ROS containers must be split to implement this operation.

Privileges

Non-superuser, one of the following:

Owner of source and target tables
Target and source tables: TRUNCATE, INSERT, SELECT

Requirements

The following attributes of both tables must be identical:

Column definitions, including NULL/NOT NULL constraints
Segmentation
Partition clause
Number of projections
Projection sort order
Primary and unique key constraints. However, the key constraints do not have to be identically enabled. For more information on constraints, see Constraints.

Note
If the target table has primary or unique key constraints enabled and copying or moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation.
Check constraints. For MOVE_PARTITIONS_TO_TABLE and COPY_PARTITIONS_TO_TABLE, Vertica enforces enabled check constraints on the target table only. For SWAP_PARTITIONS_BETWEEN_TABLES, Vertica enforces enabled check constraints on both tables. If there is a violation of an enabled check constraint, Vertica rolls back the operation.
Number and definitions of text indices.

Additionally, if access policies exist on the source table, the following must be true:

Access policies on both tables must be identical.
One of the following must be true:
- The executing user owns the target table.
- AccessPolicyManagementSuperuserOnly is set to true.

Restrictions

The following restrictions apply to the source and target tables:

If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
The following tables cannot be used as sources or targets:
- Temporary tables
- Virtual tables
- System tables
- External tables

Examples

See Swapping partitions.
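
As a minimal sketch of the syntax above, the following call assumes two tables with identical definitions, staging_trades and prod_trades, both partitioned on keys such as 200801. The table names and key value are hypothetical, and the output is omitted.

```sql
-- Exchange partition key 200801 between the staging table and the target
-- table: each table ends up with the other's data for that key.
=> SELECT SWAP_PARTITIONS_BETWEEN_TABLES (
          'staging_trades',   -- hypothetical staging table
          '200801',
          '200801',
          'prod_trades');     -- hypothetical target table
```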

20 - Privileges and access functions

This section contains functions for managing user and role privileges, and access policies.

20.1 - ENABLED_ROLE

Checks whether a Vertica user role is enabled, and returns true or false. This function is typically used when you create access policies on database roles.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ENABLED_ROLE ( 'role' )

Parameters

role: The role to evaluate.

Privileges

None

Examples

See:
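
As a minimal sketch of the access-policy use mentioned above, the following column access policy returns different values depending on which role is enabled in the session. The table public.customers_table, the column ssn, and the roles manager and operator are hypothetical names chosen for illustration.

```sql
-- Users with the manager role enabled see the full value, users with the
-- operator role enabled see only the last four characters, and all other
-- users see NULL.
=> CREATE ACCESS POLICY ON public.customers_table
      FOR COLUMN ssn
      CASE
        WHEN ENABLED_ROLE('manager') THEN ssn
        WHEN ENABLED_ROLE('operator') THEN SUBSTR(ssn, 8, 4)
        ELSE NULL
      END
      ENABLE;
```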

20.2 - GET_PRIVILEGES_DESCRIPTION

Returns the effective privileges the current user has on an object, including explicit, implicit, inherited, and role-based privileges.

Because this meta-function only returns effective privileges, GET_PRIVILEGES_DESCRIPTION only returns privileges with fully-satisfied prerequisites. For a list of prerequisites for common operations, see Privileges required for common database operations.

For example, a user must have the following privileges to query a table:

Schema: USAGE
Table: SELECT

If user Brooke has SELECT privileges on table s1.t1 but lacks USAGE privileges on schema s1, Brooke cannot query the table, and GET_PRIVILEGES_DESCRIPTION does not return SELECT as a privilege for the table.
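
The Brooke scenario can be sketched as follows. This is an illustration under the assumptions stated above (Brooke already has SELECT on s1.t1). Output is omitted because it depends on the other privileges in the database.

```sql
-- As Brooke, before USAGE on s1 is granted: SELECT is not expected to
-- appear among the effective privileges.
=> SELECT GET_PRIVILEGES_DESCRIPTION('table', 's1.t1');

-- As a superuser, satisfy the missing prerequisite.
=> GRANT USAGE ON SCHEMA s1 TO Brooke;

-- As Brooke again: SELECT should now be reported as an effective privilege.
=> SELECT GET_PRIVILEGES_DESCRIPTION('table', 's1.t1');
```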

Note

Inherited privileges are not displayed if privilege inheritance is disabled at the database level.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_PRIVILEGES_DESCRIPTION( 'type', '[[database.]schema.]name' );

Parameters

type

Specifies an object type, one of the following:

database
table
schema
view
sequence
model
library
resource pool

[database.]schema

Specifies a database and schema, by default the current database and public, respectively.

name

Name of the target object

Privileges

None

Examples

In the following example, user Glenn has set the REPORTER role and wants to check his effective privileges on schema s1 and table s1.articles.

Table s1.articles inherits privileges from its schema (s1).
The REPORTER role has the following privileges:
- SELECT on schema s1
- INSERT WITH GRANT OPTION on table s1.articles
User Glenn has the following privileges:
- UPDATE and USAGE on schema s1.
- DELETE on table s1.articles.

GET_PRIVILEGES_DESCRIPTION returns the following effective privileges for Glenn on schema s1:

=> SELECT GET_PRIVILEGES_DESCRIPTION('schema', 's1');
   GET_PRIVILEGES_DESCRIPTION
--------------------------------
 SELECT, UPDATE, USAGE
(1 row)

GET_PRIVILEGES_DESCRIPTION returns the following effective privileges for Glenn on table s1.articles:


=> SELECT GET_PRIVILEGES_DESCRIPTION('table', 's1.articles');
   GET_PRIVILEGES_DESCRIPTION
--------------------------------
 INSERT*, SELECT, UPDATE, DELETE
(1 row)

20.3 - HAS_ROLE

Checks whether a Vertica user role is granted to the specified user or role, and returns true or false.

You can also query system tables ROLES, GRANTS, and USERS to obtain information on users and their role assignments. For details, see Viewing user roles.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

HAS_ROLE( [ 'grantee' ,] 'verify-role' );

Parameters

grantee: Valid only for superusers. Specifies the name of a user or role to look up. If this argument is omitted, the function uses the current user name (CURRENT_USER). If you specify a role, Vertica checks whether this role is granted to the role specified in verify-role.

Important
If a non-superuser supplies this argument, Vertica returns an error.
verify-role: Name of the role to verify for grantee.

Privileges

None

Examples

In the following example, a dbadmin user checks whether user MikeL is assigned the administrator role:

=> \c
You are now connected as user "dbadmin".
=> SELECT HAS_ROLE('MikeL', 'administrator');
 HAS_ROLE
----------
 t
(1 row)

User MikeL checks whether he has the regional_manager role:

=> \c - MikeL
You are now connected as user "MikeL".
=> SELECT HAS_ROLE('regional_manager');
 HAS_ROLE
----------
 f
(1 row)

The dbadmin grants the regional_manager role to the administrator role. On checking again, MikeL verifies that he now has the regional_manager role:

dbadmin=> \c
You are now connected as user "dbadmin".
dbadmin=> GRANT regional_manager to administrator;
GRANT ROLE
dbadmin=> \c - MikeL
You are now connected as user "MikeL".
dbadmin=> SELECT HAS_ROLE('regional_manager');
 HAS_ROLE
----------
 t
(1 row)
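
The last example depends on role nesting: MikeL holds administrator, and administrator is granted regional_manager. The following Python sketch models that lookup as a walk over a grant graph. It is only an illustration of the behavior shown above, not Vertica's implementation; the `grants` dictionary and the `has_role` helper are hypothetical names introduced here.

```python
from collections import deque


def has_role(grants, grantee, verify_role):
    """Return True if verify_role reaches grantee directly or via nested roles.

    grants maps a user or role name to the set of roles granted to it
    (a hypothetical in-memory stand-in for the role grants in the catalog).
    """
    seen = set()
    queue = deque([grantee])
    while queue:
        current = queue.popleft()
        for role in grants.get(current, ()):
            if role == verify_role:
                return True
            if role not in seen:
                seen.add(role)
                queue.append(role)
    return False


# MikeL initially holds only the administrator role.
grants = {"MikeL": {"administrator"}}
before = has_role(grants, "MikeL", "regional_manager")

# Model of: GRANT regional_manager TO administrator;
grants["administrator"] = {"regional_manager"}
after = has_role(grants, "MikeL", "regional_manager")

print(before, after)
```

The `seen` set keeps the walk from looping if two roles are ever granted to each other.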

20.4 - RELEASE_SYSTEM_TABLES_ACCESS

Enables non-superuser access to all system tables.

Allows non-superusers to access all non-SUPERUSER_ONLY system tables. After you call this function, Vertica ignores the IS_ACCESSIBLE_DURING_LOCKDOWN setting in table SYSTEM_TABLES. To restrict non-superuser access to system tables, call RESTRICT_SYSTEM_TABLES_ACCESS.

By default, the database behaves as though RELEASE_SYSTEM_TABLES_ACCESS() was called. That is, non-superusers have access to all non-SUPERUSER_ONLY system tables.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RELEASE_SYSTEM_TABLES_ACCESS()

Privileges

Superuser

Examples

By default, non-superuser Alice has access to client_auth and disk_storage. She also has access to replication_status because she was granted the privilege by the dbadmin:

=> SELECT table_name, is_superuser_only, is_accessible_during_lockdown FROM system_tables WHERE table_name='disk_storage' OR table_name='database_backups' OR table_name='replication_status' OR table_name='client_auth';
     table_name     | is_superuser_only | is_accessible_during_lockdown
--------------------+-------------------+-------------------------------
 client_auth        | f                 | t
 disk_storage       | f                 | f
 database_backups   | t                 | f
 replication_status | t                 | t
(4 rows)

The dbadmin calls RESTRICT_SYSTEM_TABLES_ACCESS:

=> SELECT RESTRICT_SYSTEM_TABLES_ACCESS();
                       RESTRICT_SYSTEM_TABLES_ACCESS
----------------------------------------------------------------------------
 Dropped grants to public on non-accessible during lockdown system tables.

(1 row)

Alice loses access to disk_storage, but she retains access to client_auth and replication_status because their IS_ACCESSIBLE_DURING_LOCKDOWN fields are true:

=> SELECT storage_status FROM disk_storage;
ERROR 4367:  Permission denied for relation disk_storage

The dbadmin calls RELEASE_SYSTEM_TABLES_ACCESS(), restoring Alice's access to disk_storage:

=> SELECT RELEASE_SYSTEM_TABLES_ACCESS();
              RELEASE_SYSTEM_TABLES_ACCESS
--------------------------------------------------------
 Granted SELECT privileges on system tables to public.

(1 row)

20.5 - RESTRICT_SYSTEM_TABLES_ACCESS

Checks system table SYSTEM_TABLES to determine which system tables non-superusers can access.

Prevents non-superusers from accessing tables that have the IS_ACCESSIBLE_DURING_LOCKDOWN flag set to false.

To enable non-superuser access to system tables restricted by this function, call RELEASE_SYSTEM_TABLES_ACCESS.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RESTRICT_SYSTEM_TABLES_ACCESS()

Privileges

Superuser

Examples

By default, client_auth and disk_storage tables are accessible to all users, but only the former is accessible after RESTRICT_SYSTEM_TABLES_ACCESS() is called. Non-superusers never have access to database_backups and replication_status unless explicitly granted the privilege by the dbadmin:

=> SELECT table_name, is_superuser_only, is_accessible_during_lockdown FROM system_tables WHERE table_name='disk_storage' OR table_name='database_backups' OR table_name='replication_status' OR table_name='client_auth';
     table_name     | is_superuser_only | is_accessible_during_lockdown
--------------------+-------------------+-------------------------------
 client_auth        | f                 | t
 disk_storage       | f                 | f
 database_backups   | t                 | f
 replication_status | t                 | t
(4 rows)

The dbadmin then calls RESTRICT_SYSTEM_TABLES_ACCESS():

=> SELECT RESTRICT_SYSTEM_TABLES_ACCESS();
                       RESTRICT_SYSTEM_TABLES_ACCESS
----------------------------------------------------------------------------
 Dropped grants to public on non-accessible during lockdown system tables.

(1 row)

Bob loses access to disk_storage, but retains access to client_auth because its IS_ACCESSIBLE_DURING_LOCKDOWN field is true:

=> SELECT storage_status FROM disk_storage;
ERROR 4367:  Permission denied for relation disk_storage

=> SELECT auth_oid FROM client_auth;
     auth_oid
-------------------
 45035996273705106
 45035996273705110
 45035996273705114
(3 rows)
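
The access rules described for RESTRICT_SYSTEM_TABLES_ACCESS and RELEASE_SYSTEM_TABLES_ACCESS can be summarized as a small decision function. The Python sketch below is a model of the documented behavior under the stated flags, not Vertica code; `SystemTable`, `can_read`, and `readable` are hypothetical names, and the flag values are copied from the SYSTEM_TABLES output above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemTable:
    name: str
    is_superuser_only: bool
    is_accessible_during_lockdown: bool


def can_read(table, *, restricted, explicit_grant=False):
    """Model a non-superuser's read access to one system table."""
    # SUPERUSER_ONLY tables need an explicit grant from the dbadmin.
    if table.is_superuser_only and not explicit_grant:
        return False
    # After RESTRICT_SYSTEM_TABLES_ACCESS, the lockdown flag decides.
    if restricted and not table.is_accessible_during_lockdown:
        return False
    return True


TABLES = [
    SystemTable("client_auth", False, True),
    SystemTable("disk_storage", False, False),
    SystemTable("database_backups", True, False),
    SystemTable("replication_status", True, True),
]


def readable(restricted, granted=frozenset()):
    """List the tables a non-superuser can read in the given state."""
    return [
        t.name
        for t in TABLES
        if can_read(t, restricted=restricted, explicit_grant=t.name in granted)
    ]


print(readable(restricted=False))
print(readable(restricted=True))
print(readable(restricted=True, granted={"replication_status"}))
```

The third call corresponds to a user who was explicitly granted access to replication_status: that access survives the restriction because the table's IS_ACCESSIBLE_DURING_LOCKDOWN flag is true.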

21 - Profiling functions

This section contains profiling functions specific to Vertica.

21.1 - CLEAR_PROFILING

Clears from memory data for the specified profiling type.

Note

Vertica stores profiled data in memory, so profiling can be memory intensive depending on how much data you collect.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_PROFILING( 'profiling-type' [, 'scope'] )

Parameters

profiling-type

The type of profiling data to clear:

session: Clear profiling for basic session parameters and lock time out data.
query: Clear profiling for general information about queries that ran, such as the query strings used and the duration of queries.
ee: Clear profiling for information about the execution run of each query.

scope

Specifies at what scope to clear profiling on the specified data, one of the following:

local: Clear profiling data for the current session.
global: Clear profiling data across all database sessions.

Examples

The following statement clears profiled data for queries:

=> SELECT CLEAR_PROFILING('query');

21.2 - DISABLE_PROFILING

Disables collection of profiling data of the specified type for the current session. For detailed information, see Enabling profiling.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DISABLE_PROFILING( 'profiling-type' )

Parameters

profiling-type

The type of profiling data to disable:

session: Disables profiling for basic session parameters and lock time out data.
query: Disables profiling for general information about queries that ran, such as the query strings used and the duration of queries.
ee: Disables profiling for information about the execution run of each query.

Examples

The following statement disables profiling on query execution runs:

=> SELECT DISABLE_PROFILING('ee');
   DISABLE_PROFILING
-----------------------
 EE Profiling Disabled
(1 row)

21.3 - ENABLE_PROFILING

Enables collection of profiling data of the specified type for the current session. For detailed information, see Enabling profiling.

Note

Vertica stores session and query profiling data in memory, so profiling can be memory intensive, depending on how much data you collect.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ENABLE_PROFILING( 'profiling-type' )

Parameters

profiling-type

The type of profiling data to enable:

session: Enable profiling for basic session parameters and lock time out data.
query: Enable profiling for general information about queries that ran, such as the query strings used and the duration of queries.
ee: Enable profiling for information about the execution run of each query.

Examples

The following statement enables profiling on query execution runs:

=> SELECT ENABLE_PROFILING('ee');
   ENABLE_PROFILING
----------------------
 EE Profiling Enabled
(1 row)

21.4 - SHOW_PROFILING_CONFIG

Shows whether profiling is enabled.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SHOW_PROFILING_CONFIG ()

Examples

The following statement shows that profiling is enabled globally for all profiling types (session, execution engine, and query):

=> SELECT SHOW_PROFILING_CONFIG();
SHOW_PROFILING_CONFIG
------------------------------------------
 Session Profiling: Session off, Global on
 EE Profiling:      Session off, Global on
 Query Profiling:   Session off, Global on
(1 row)

22 - Projection management functions

This section contains projection management functions specific to Vertica.

22.1 - CLEAR_PROJECTION_REFRESHES

Clears projection refresh history from system table PROJECTION_REFRESHES.

System table PROJECTION_REFRESHES records information about refresh operations, successful and unsuccessful. PROJECTION_REFRESHES retains projection refresh data until one of the following events occurs:

Another refresh operation starts on a given projection.
CLEAR_PROJECTION_REFRESHES is called and clears data on all projections.
The table's storage quota is exceeded.

CLEAR_PROJECTION_REFRESHES checks PROJECTION_REFRESHES Boolean column IS_EXECUTING to determine whether refresh operations are still running or are complete. The function only removes information for refresh operations that are complete.
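
As a rough illustration of that rule, the Python sketch below filters an in-memory list of refresh records on their is_executing flag. The record layout and the `clear_completed` helper are assumptions made for this example; only the IS_EXECUTING column itself comes from the description above.

```python
def clear_completed(history):
    """Return the records that survive a clear: refreshes still executing."""
    return [record for record in history if record["is_executing"]]


# Hypothetical contents of the refresh history before the call.
history = [
    {"projection_name": "t1_p", "is_executing": False},
    {"projection_name": "t2_p", "is_executing": True},
    {"projection_name": "t3_p", "is_executing": False},
]

remaining = clear_completed(history)
print([record["projection_name"] for record in remaining])
```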

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_PROJECTION_REFRESHES()

Privileges

Superuser

Examples

=> SELECT CLEAR_PROJECTION_REFRESHES();
 CLEAR_PROJECTION_REFRESHES
----------------------------
 CLEAR
(1 row)

22.2 - EVALUATE_DELETE_PERFORMANCE

Evaluates projections for potential DELETE and UPDATE performance issues. If Vertica finds any issues, it issues a warning message. When evaluating multiple projections, EVALUATE_DELETE_PERFORMANCE returns up to ten projections with issues, and the name of a table that lists all issues that it found.

Note

EVALUATE_DELETE_PERFORMANCE returns messages that specifically reference delete performance. Keep in mind, however, that delete and update operations benefit equally from the same optimizations.

For information on resolving delete and update performance issues, see Optimizing DELETE and UPDATE.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

EVALUATE_DELETE_PERFORMANCE ( ['[[database.]schema.]scope'] )

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

scope

Specifies the projections to evaluate, one of the following:

[table.]projection
Evaluates projection. For example:

SELECT EVALUATE_DELETE_PERFORMANCE('store.store_orders_fact.store_orders_fact_b1');

table
Evaluates all projections of table. For example:

SELECT EVALUATE_DELETE_PERFORMANCE('store.store_orders_fact');

If you supply no arguments, EVALUATE_DELETE_PERFORMANCE evaluates all projections that you can access. Depending on the size of your database, this can incur considerable overhead.

Privileges

Non-superuser: SELECT privilege on the anchor table

Examples

EVALUATE_DELETE_PERFORMANCE evaluates all projections of table example for potential DELETE and UPDATE performance issues.

=> create table example (A int, B int,C int);
CREATE TABLE
=> create projection one_sort (A,B,C) as (select A,B,C from example) order by A;
CREATE PROJECTION
=> create projection two_sort (A,B,C) as (select A,B,C from example) order by A,B;
CREATE PROJECTION
=> select evaluate_delete_performance('example');
            evaluate_delete_performance
---------------------------------------------------
 No projection delete performance concerns found.
(1 row)

The previous example shows that the two projections one_sort and two_sort have no inherent structural issues that might cause poor DELETE performance. However, the data contained within the projection can create potential delete issues if the sorted columns do not uniquely identify a row or small number of rows.

In the following example, Perl is used to populate the table with data using a nested series of loops:

The inner loop populates column C.
The middle loop populates column B.
The outer loop populates column A.

As a result, column A contains only three distinct values (0, 1, and 2), column B varies slowly between 0 and 20, and column C changes in every row:

=> \! perl -e 'for ($i=0; $i<3; $i++) { for ($j=0; $j<21; $j++) { for ($k=0; $k<19; $k++) { printf "%d,%d,%d\n", $i,$j,$k;}}}' | /opt/vertica/bin/vsql -c "copy example from stdin delimiter ',' direct;"
Password:
=> select * from example;
 A | B  | C
---+----+----
 0 | 20 | 18
 0 | 20 | 17
 0 | 20 | 16
 0 | 20 | 15
 0 | 20 | 14
 0 | 20 | 13
 0 | 20 | 12
 0 | 20 | 11
 0 | 20 | 10
 0 | 20 |  9
 0 | 20 |  8
 0 | 20 |  7
 0 | 20 |  6
 0 | 20 |  5
 0 | 20 |  4
 0 | 20 |  3
 0 | 20 |  2
 0 | 20 |  1
 0 | 20 |  0
 0 | 19 | 18
 ...
 2 |  1 |  0
 2 |  0 | 18
 2 |  0 | 17
 2 |  0 | 16
 2 |  0 | 15
 2 |  0 | 14
 2 |  0 | 13
 2 |  0 | 12
 2 |  0 | 11
 2 |  0 | 10
 2 |  0 |  9
 2 |  0 |  8
 2 |  0 |  7
 2 |  0 |  6
 2 |  0 |  5
 2 |  0 |  4
 2 |  0 |  3
 2 |  0 |  2
 2 |  0 |  1
 2 |  0 |  0
=> SELECT COUNT (*) FROM example;
 COUNT
-------
  1197
(1 row)
=> SELECT COUNT (DISTINCT A) FROM example;
 COUNT
-------
     3
(1 row)

EVALUATE_DELETE_PERFORMANCE is run against the projections again to determine whether the data within the projections causes any potential DELETE performance issues. Projection one_sort has potential delete issues because it sorts only on column A, which has few distinct values. Each value in the sort column corresponds to many rows in the projection, which can adversely impact DELETE performance. In contrast, projection two_sort is sorted on columns A and B, where each combination of values in the two sort columns identifies just a few rows, so deletes can be performed faster:


=> select evaluate_delete_performance('example');
            evaluate_delete_performance
---------------------------------------------------
 The following projections exhibit delete performance concerns:
        "public"."one_sort_b1"
        "public"."one_sort_b0"
See v_catalog.projection_delete_concerns for more details.

=> \x
Expanded display is on.
dbadmin=> select * from projection_delete_concerns;
-[ RECORD 1 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------
projection_id      | 45035996273878562
projection_schema  | public
projection_name    | one_sort_b1
creation_time      | 2019-06-17 13:59:03.777085-04
last_modified_time | 2019-06-17 14:00:27.702223-04
comment            | The squared number of rows matching each sort key is about 159201 on average.
-[ RECORD 2 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------
projection_id      | 45035996273878548
projection_schema  | public
projection_name    | one_sort_b0
creation_time      | 2019-06-17 13:59:03.777279-04
last_modified_time | 2019-06-17 13:59:03.777279-04
comment            | The squared number of rows matching each sort key is about 159201 on average.
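
The figure 159201 in the comment column can be reproduced from the loaded data. The Python sketch below rebuilds the same 3 x 21 x 19 rows as the Perl loop and averages the squared number of rows per sort key. This is one plausible reading of the comment text, not a statement of how Vertica computes it; `mean_squared_rows_per_key` is a hypothetical helper.

```python
from collections import Counter
from itertools import product

# Same value ranges as the Perl loops: A in 0..2, B in 0..20, C in 0..18.
rows = list(product(range(3), range(21), range(19)))


def mean_squared_rows_per_key(rows, key_columns):
    """Average, over distinct sort keys, of (rows matching the key) squared."""
    counts = Counter(tuple(row[i] for i in key_columns) for row in rows)
    return sum(n * n for n in counts.values()) / len(counts)


one_sort = mean_squared_rows_per_key(rows, [0])     # ORDER BY A
two_sort = mean_squared_rows_per_key(rows, [0, 1])  # ORDER BY A, B

print(len(rows), one_sort, two_sort)
```

Sorting on A alone leaves 399 rows per key (399 squared is 159201), while sorting on A and B leaves 19 rows per key, which is consistent with only one_sort being reported.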

If you supply no argument to EVALUATE_DELETE_PERFORMANCE, it evaluates all projections that you can access:

=> select evaluate_delete_performance();
                          evaluate_delete_performance
---------------------------------------------------------------------------
 The following projections exhibit delete performance concerns:
        "public"."one_sort_b0"
        "public"."one_sort_b1"
See v_catalog.projection_delete_concerns for more details.
(1 row)

22.3 - GET_PROJECTION_SORT_ORDER

Returns the order of columns in a projection's ORDER BY clause.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_PROJECTION_SORT_ORDER( '[[database.]schema.]projection' );

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: The target projection.

Privileges

Non-superuser: SELECT privilege on the anchor table

Examples

=> SELECT get_projection_sort_order ('store_orders_super');
                                 get_projection_sort_order
--------------------------------------------------------------------------------------------
 public.store_orders_super [Sort Cols: "order_no", "order_date", "shipper", "ship_date"]

(1 row)

22.4 - GET_PROJECTION_STATUS

Returns information relevant to the status of a projection:

The current K-safety status of the database
The number of nodes in the database
Whether the projection is segmented
The number and names of buddy projections
Whether the projection is safe
Whether the projection is up to date
Whether statistics have been computed for the projection

Use GET_PROJECTION_STATUS to monitor the progress of a projection data refresh.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_PROJECTION_STATUS ( '[[database.]schema.]projection' );

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: The projection for which to display status.

Examples

=> SELECT GET_PROJECTION_STATUS('public.customer_dimension_site01');
                                     GET_PROJECTION_STATUS
-----------------------------------------------------------------------------------------------
 Current system K is 1.
# of Nodes: 4.
public.customer_dimension_site01 [Segmented: No] [Seg Cols: ] [K: 3] [public.customer_dimension_site04, public.customer_dimension_site03,
public.customer_dimension_site02]
[Safe: Yes] [UptoDate: Yes][Stats: Yes]

22.5 - GET_PROJECTIONS

Returns contextual and projection information about projections of the specified anchor table.

Contextual information

Database K-safety
Number of database nodes
Number of projections for this table

Projection data

For each projection, specifies:

All buddy projections
Whether it is segmented
Whether it is safe
Whether it is up-to-date.

You can also use GET_PROJECTIONS to monitor the progress of a projection data refresh.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_PROJECTIONS ( '[[database.]schema.]table' )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Anchor table of the projections to list.

Privileges

None

Examples

The following example gets information about projections for VMart table store.store_dimension:

=> SELECT GET_PROJECTIONS('store.store_dimension');
-[ RECORD 1 ]---+
GET_PROJECTIONS | Current system K is 1.
# of Nodes: 3.
Table store.store_dimension has 2 projections.

Projection Name: [Segmented] [Seg Cols] [# of Buddies] [Buddy Projections] [Safe] [UptoDate] [Stats]
----------------------------------------------------------------------------------------------------
store.store_dimension_b1 [Segmented: Yes] [Seg Cols: "store.store_dimension.store_key"] [K: 1] [store.store_dimension_b0] [Safe: Yes] [UptoDate: Yes] [Stats: RowCounts]
store.store_dimension_b0 [Segmented: Yes] [Seg Cols: "store.store_dimension.store_key"] [K: 1] [store.store_dimension_b1] [Safe: Yes] [UptoDate: Yes] [Stats: RowCounts]

22.6 - PURGE_PROJECTION

Permanently removes deleted data from physical storage so disk space can be reused. You can purge historical data up to and including the Ancient History Mark epoch.

Caution

PURGE_PROJECTION can use significant disk space while purging the data.

See PURGE for details about purge operations.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

PURGE_PROJECTION ( '[[database.]schema.]projection' )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: The projection to purge.

Privileges

Table owner
USAGE privilege on schema

Examples

The following example purges all historical data in projection tbl_p that precedes the Ancient History Mark epoch.

=> CREATE TABLE tbl (x int, y int);
CREATE TABLE
=> INSERT INTO tbl VALUES(1,2);
 OUTPUT
--------
      1
(1 row)

=> INSERT INTO tbl VALUES(3,4);
 OUTPUT
--------
      1
(1 row)

dbadmin=> COMMIT;
COMMIT
=> CREATE PROJECTION tbl_p AS SELECT x FROM tbl UNSEGMENTED ALL NODES;
WARNING 4468: Projection <public.tbl_p> is not available for query processing.
Execute the select start_refresh() function to copy data into this projection.
The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh
CREATE PROJECTION
=> SELECT START_REFRESH();
             START_REFRESH
----------------------------------------
 Starting refresh background process.
(1 row)

=> DELETE FROM tbl WHERE x=1;
 OUTPUT
--------
      1
(1 row)

=> COMMIT;
COMMIT
=> SELECT MAKE_AHM_NOW();
         MAKE_AHM_NOW
-------------------------------
 AHM set (New AHM Epoch: 9066)
(1 row)

=> SELECT PURGE_PROJECTION ('tbl_p');
 PURGE_PROJECTION
-------------------
 Projection purged
(1 row)

22.7 - REFRESH

Synchronously refreshes one or more table projections in the foreground, and updates system table PROJECTION_REFRESHES. If you run REFRESH with no arguments, it refreshes all projections that contain stale data.

To understand projection refresh in detail, go to Refreshing projections.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

REFRESH ( [ '[[database.]schema.]table-name[,...]' ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: The anchor table of the projections to refresh. If you specify multiple tables, REFRESH attempts to refresh them in parallel. Such calls are part of the Database Designer deployment (and deployment script).

Returns

Note

If REFRESH does not refresh any projections, it returns a header string with no results.

REFRESH returns the following columns:

Projection Name: The projection targeted for refresh.
Anchor Table: The projection's associated anchor table.
Status: The projection's refresh status: queued (queued for refresh), refreshing (refresh is in process), refreshed (refresh successfully completed), or failed (refresh did not successfully complete).
Refresh Method: Method used to refresh the projection.
Error Count: Number of times a refresh failed for the projection.
Duration (sec): How long (in seconds) the projection refresh ran.

Privileges

Superuser
Owner of the specified tables

Refresh methods

Vertica can refresh a projection from one of its buddies, if one is available. In this case, the target projection gets the source buddy's historical data. Otherwise, the projection is refreshed from scratch with data of the latest epoch at the time of the refresh operation. In this case, the projection cannot participate in historical queries on any epoch that precedes the refresh operation.

To determine the method used to refresh a given projection, query REFRESH_METHOD from system table PROJECTION_REFRESHES.
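
Each outcome row that REFRESH returns follows the column layout described under Returns, so a client script can split it into fields. The Python sketch below does this with a regular expression. The row format is inferred from sample output of this function, and `parse_refresh_outcomes` and the field names are hypothetical; treat it as a starting point rather than a supported interface.

```python
import re

# One outcome row: "schema"."projection": [anchor] [status] [method] [errors] [secs]
ROW = re.compile(
    r'^"(?P<schema>[^"]*)"\."(?P<projection>[^"]*)":\s*'
    r'\[(?P<anchor_table>[^\]]*)\]\s*'
    r'\[(?P<status>[^\]]*)\]\s*'
    r'\[(?P<refresh_method>[^\]]*)\]\s*'
    r'\[(?P<error_count>\d+)\]\s*'
    r'\[(?P<duration_sec>\d+(?:\.\d+)?)\]$'
)


def parse_refresh_outcomes(text):
    """Return one dict per outcome row; header and separator lines are skipped."""
    outcomes = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            fields = match.groupdict()
            fields["error_count"] = int(fields["error_count"])
            fields["duration_sec"] = float(fields["duration_sec"])
            outcomes.append(fields)
    return outcomes


SAMPLE = '''Refresh completed with the following outcomes:
Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
"n/a"."n/a": [n/a] [failed: insufficient permissions on table "allow"] [] [1] [0]
"public"."t_p1": [t] [refreshed] [scratch] [0] [0]
'''

outcomes = parse_refresh_outcomes(SAMPLE)
for row in outcomes:
    print(row["projection"], row["status"], row["refresh_method"])
```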

Examples

The following example refreshes the projections in tables t1 and t2:

=> SELECT REFRESH('t1, t2');
                                             REFRESH
----------------------------------------------------------------------------------------
Refresh completed with the following outcomes:

Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
----------------------------------------------------------------------------------------

"public"."t1_p": [t1] [refreshed] [scratch] [0] [0]
"public"."t2_p": [t2] [refreshed] [scratch] [0] [0]

This next example shows that only the projection on table t was refreshed:

=> SELECT REFRESH('allow, public.deny, t');
                                               REFRESH
----------------------------------------------------------------------------------------

Refresh completed with the following outcomes:

Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
----------------------------------------------------------------------------------------
"n/a"."n/a": [n/a] [failed: insufficient permissions on table "allow"] [] [1] [0]
"n/a"."n/a": [n/a] [failed: insufficient permissions on table "public.deny"] [] [1] [0]
"public"."t_p1": [t] [refreshed] [scratch] [0] [0]

22.8 - REFRESH_COLUMNS

Refreshes table columns that are defined with the constraint SET USING or DEFAULT USING. All refresh operations associated with a call to REFRESH_COLUMNS belong to the same transaction. Thus, all tables and columns specified by REFRESH_COLUMNS must be refreshed; otherwise, the entire operation is rolled back.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

REFRESH_COLUMNS ( 'table-list', '[column-list]'
   [, '[refresh-mode ]' [, min-partition-key, max-partition-key] ]
)

Parameters

table-list

A comma-delimited list of the tables to refresh:

[[database.]schema.]table[,...]

Important

If you specify multiple tables, parameter refresh-mode must be set to REBUILD.

column-list

A comma-delimited list of columns to refresh, specified as follows:

[[[database.]schema.]table.]column[,...]
[[database.]schema.]table.*

where asterisk (*) specifies to refresh all SET USING/DEFAULT USING columns in table. For example:

SELECT REFRESH_COLUMNS ('t1, t2', 't1.*, t2.b', 'REBUILD');

If column-list is set to an empty string (''), REFRESH_COLUMNS refreshes all SET USING/DEFAULT USING columns in the specified tables.

The following requirements apply:

All specified columns must have a SET USING or DEFAULT USING constraint.
If REFRESH_COLUMNS specifies multiple tables, all column names must be qualified by their table names. If the target tables span multiple schemas, all column names must be fully qualified by their schema and table names. For example:

SELECT REFRESH_COLUMNS ('t1, t2', 't1.a, t2.b', 'REBUILD');

If you specify a database, it must be the current database.

refresh-mode

Specifies how to refresh SET USING columns:

UPDATE: Marks original rows as deleted and replaces them with new rows. To save these updates, you must issue a COMMIT statement.
REBUILD: Replaces all data in the specified columns. The rebuild operation is auto-committed.

If set to an empty string or omitted, REFRESH_COLUMNS executes in UPDATE mode. If you specify multiple tables, you must explicitly specify REBUILD mode.

In both cases, REFRESH_COLUMNS returns an error if any SET USING column is defined as a primary or unique key in a table that enforces those constraints.

See REBUILD Mode Restrictions for limitations on using the REBUILD option.

min-partition-key max-partition-key

Qualifies REBUILD mode, limiting the rebuild operation to one or more partitions. To specify a range of partitions, max-partition-key must be greater than min-partition-key. To update one partition, the two arguments must be equal.

The following requirements apply:

The function can specify only one table to refresh.
The table must be partitioned on the specified keys.

You can use these arguments to refresh columns with recently loaded data—that is, data in the latest partitions. Using this option regularly can significantly minimize the overhead otherwise incurred by rebuilding entire columns in a large table.

See Partition-based REBUILD below for details.
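
The parameter rules above interact: the default mode, the multi-table requirement, and the partition-key constraints. The Python sketch below collects them into one pre-flight check that a calling script might run before issuing the statement. It is an illustrative summary of the documented rules only; `resolve_refresh_mode` is a hypothetical helper and does not contact the database.

```python
def resolve_refresh_mode(tables, refresh_mode="", min_key=None, max_key=None):
    """Return the effective mode, or raise ValueError if the arguments conflict."""
    # An empty or omitted refresh-mode means UPDATE.
    mode = refresh_mode.strip().upper() or "UPDATE"
    if mode not in ("UPDATE", "REBUILD"):
        raise ValueError(f"unknown refresh mode: {refresh_mode!r}")
    # Multiple tables require REBUILD to be stated explicitly.
    if len(tables) > 1 and mode != "REBUILD":
        raise ValueError("multiple tables require REBUILD mode")
    # Partition keys come as a pair and only qualify REBUILD on one table.
    if (min_key is None) != (max_key is None):
        raise ValueError("specify both partition keys or neither")
    if min_key is not None:
        if mode != "REBUILD":
            raise ValueError("partition keys apply only to REBUILD mode")
        if len(tables) != 1:
            raise ValueError("partition keys allow only one table")
        if max_key < min_key:
            raise ValueError("max-partition-key must not precede min-partition-key")
    return mode


print(resolve_refresh_mode(["t1"]))
print(resolve_refresh_mode(["t1", "t2"], "REBUILD"))
print(resolve_refresh_mode(["public.orderFact"], "rebuild", "2024-01-01", "2024-02-29"))
```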

Privileges

Schemas of queried and flattened tables: USAGE
Queried table: SELECT
Flattened table: SELECT, UPDATE

UPDATE versus REBUILD modes

In general, UPDATE mode is a better choice when changes to SET USING column data are confined to a relatively small number of rows. Use REBUILD mode when a significant amount of SET USING column data is stale and must be updated. It is generally good practice to call REFRESH_COLUMNS with REBUILD on any new SET USING column—for example, to populate a SET USING column after adding it with ALTER TABLE...ADD COLUMN.

REBUILD mode restrictions

If you call REFRESH_COLUMNS on a SET USING column and specify the refresh mode as REBUILD, Vertica returns an error if the column is specified in any of the following:

Table's partition key
Unsegmented projection
Projection with expressions, or any live aggregate projection that invokes a user-defined transform function (UDTF)
Sort order or segmentation of any projection
Any projection that omits an anchor table column that is referenced in the column's SET USING expression
GROUPED clause of any projection

Partition-based REBUILD operations

If a flattened table is partitioned, you can reduce the overhead of calling REFRESH_COLUMNS in REBUILD mode, by specifying one or more partition keys. Doing so limits the rebuild operation to the specified partitions. For example, table public.orderFact is defined with SET USING column cust_name. This table is partitioned on column order_date, where the partition clause invokes Vertica function CALENDAR_HIERARCHY_DAY. Thus, you can call REFRESH_COLUMNS on specific time-delimited partitions of this table—in this case, on orders over the last two months:

=> SELECT REFRESH_COLUMNS ('public.orderFact',
                        'cust_name',
                        'REBUILD',
                        TO_CHAR(ADD_MONTHS(current_date, -2),'YYYY-MM')||'-01',
                        TO_CHAR(LAST_DAY(ADD_MONTHS(current_date, -1))));
      REFRESH_COLUMNS
---------------------------
 refresh_columns completed
(1 row)
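
To see which range the two date expressions in this call select, the following Python sketch computes the same bounds for a given day: the first day of the month two months back, and the last day of the previous month. It mirrors the intent of the SQL expressions rather than Vertica's exact string output; `shift_month` and `rebuild_bounds` are hypothetical helpers.

```python
import calendar
from datetime import date


def shift_month(year, month, delta):
    """Move a (year, month) pair by delta months."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


def rebuild_bounds(today):
    """Return (min_partition_key, max_partition_key) as dates."""
    min_year, min_month = shift_month(today.year, today.month, -2)
    max_year, max_month = shift_month(today.year, today.month, -1)
    last_day = calendar.monthrange(max_year, max_month)[1]
    return date(min_year, min_month, 1), date(max_year, max_month, last_day)


low, high = rebuild_bounds(date(2024, 3, 15))
print(low.isoformat(), high.isoformat())
```

For a call made on 2024-03-15, the range runs from 2024-01-01 through 2024-02-29, which covers the two full months before the current one.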

Rewriting SET USING queries

When you call REFRESH_COLUMNS on a flattened table's SET USING (or DEFAULT USING) column, it executes the SET USING query by joining the target and source tables. By default, the source table is always the inner table of the join. In most cases, cardinality of the source table is less than the target table, so REFRESH_COLUMNS executes the join efficiently.

Occasionally—notably, when you call REFRESH_COLUMNS on a partitioned table—the source table can be larger than the target table. In this case, performance of the join operation can be suboptimal.

You can address this issue by enabling configuration parameter RewriteQueryForLargeDim. When enabled (1), Vertica rewrites the query, by reversing the inner and outer join between the target and source tables.

Important

Enable this parameter only if the SET USING source data is in a table that is larger than the target table. If the source data is in a table smaller than the target table, then enabling RewriteQueryForLargeDim can adversely affect refresh performance.

Examples

See Flattened table example and DEFAULT versus SET USING.

22.9 - START_REFRESH

Refreshes projections in the current schema with the latest data of their respective anchor tables. START_REFRESH runs asynchronously in the background, and updates system table PROJECTION_REFRESHES. This function has no effect if a refresh is already running.

To refresh only projections of a specific table, use REFRESH. When you deploy a design through Database Designer, it automatically refreshes its projections.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

START_REFRESH()

Privileges

None

Requirements

All nodes must be up.

Refresh methods

To determine the method used to refresh a given projection, query REFRESH_METHOD from system table PROJECTION_REFRESHES.

Examples

=> SELECT START_REFRESH();
             START_REFRESH
----------------------------------------
 Starting refresh background process.
(1 row)

23 - Session management functions

This section contains session management functions specific to Vertica.

See also the SQL system table V_MONITOR.SESSIONS.

23.1 - CANCEL_REFRESH

Cancels refresh-related internal operations initiated by START_REFRESH and REFRESH.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CANCEL_REFRESH()

Privileges

None

Notes

Refresh tasks run in a background thread in an internal session, so you cannot use INTERRUPT_STATEMENT to cancel those statements. Instead, use CANCEL_REFRESH to cancel statements that are run by refresh-related internal sessions.
Run CANCEL_REFRESH() on the same node on which START_REFRESH() was initiated.
CANCEL_REFRESH() cancels the refresh operation running on a node, waits for the cancelation to complete, and returns SUCCESS.
Only one set of refresh operations runs on a node at any time.

Examples

Cancel a refresh operation executing in the background.

=> SELECT START_REFRESH();
             START_REFRESH
----------------------------------------
Starting refresh background process.
(1 row)
=> SELECT CANCEL_REFRESH();
              CANCEL_REFRESH
----------------------------------------
Stopping background refresh process.
(1 row)

23.2 - CLOSE_ALL_SESSIONS

Closes all external sessions except the one that issues this function. Call this function before shutting down the Vertica database.

Vertica closes sessions asynchronously, so another session can open before this function returns. In this case, reissue this function. To view the status of all open sessions, query system table SESSIONS.
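
One way to check whether other sessions are still open after calling CLOSE_ALL_SESSIONS is sketched below. It assumes that the CURRENT_SESSION system table exposes the session_id of the issuing session.

```sql
-- List every open session other than the one running this query
SELECT node_name, user_name, session_id, login_timestamp
FROM v_monitor.sessions
WHERE session_id <> (SELECT session_id FROM v_monitor.current_session);
-- If this returns rows, reissue SELECT CLOSE_ALL_SESSIONS();
```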

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLOSE_ALL_SESSIONS()

Privileges

Non-superuser: None to close your own session

Examples

Three user sessions are open, each on a different node:

=> SELECT * FROM sessions;
-[ RECORD 1 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (select * from sessions;)
statement_start            | 2011-01-03 15:36:13.896288
statement_id               | 10
last_statement_duration_us | 14978
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 2 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0002
user_name                  | dbadmin
client_hostname            | 127.0.0.1:57174
client_pid                 | 30117
login_timestamp            | 2011-01-03 15:33:00.842021-05
session_id                 | stress05-27944:0xc1a
client_label               |
transaction_start          | 2011-01-03 15:34:46.538102
transaction_id             | -1
transaction_description    | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
                             DELIMITER '|' NULL '\\n';)
statement_start            | 2011-01-03 15:34:46.538862
statement_id               |
last_statement_duration_us | 26250
current_statement          | COPY Mart_Fact FROM '/data/Mart_Fact.tbl' DELIMITER '|'
                             NULL '\\n';
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 3 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0003
user_name                  | dbadmin
client_hostname            | 127.0.0.1:56367
client_pid                 | 1191
login_timestamp            | 2011-01-03 15:31:44.939302-05
session_id                 | stress06-25663:0xbec
client_label               |
transaction_start          | 2011-01-03 15:34:51.05939
transaction_id             | 54043195528458775
transaction_description    | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
                             DELIMITER '|' NULL '\\n' DIRECT;)
statement_start            | 2011-01-03 15:35:46.436748
statement_id               |
last_statement_duration_us | 1591403
current_statement          | COPY Mart_Fact FROM '/data/Mart_Fact.tbl' DELIMITER '|'
                             NULL '\\n' DIRECT;
ssl_state                  | None
authentication_method      | Trust

Close all sessions:

=> \x
Expanded display is off.
=> SELECT CLOSE_ALL_SESSIONS();
                           CLOSE_ALL_SESSIONS
-------------------------------------------------------------------------
 Close all sessions command sent. Check v_monitor.sessions for progress.
(1 row)

Session contents after issuing CLOSE_ALL_SESSIONS:

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+----------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (SELECT * FROM sessions;)
statement_start            | 2011-01-03 16:19:56.720071
statement_id               | 25
last_statement_duration_us | 15605
current_statement          | SELECT * FROM SESSIONS;
ssl_state                  | None
authentication_method      | Trust

23.3 - CLOSE_SESSION

Interrupts the specified external session, rolls back the current transaction if any, and closes the socket. You can only close your own session.

It might take some time before a session is closed. To view the status of all open sessions, query the system table SESSIONS.

For detailed information about session management options, see Managing sessions.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLOSE_SESSION ( 'sessionid' )

Parameters

sessionid: A string that specifies the session to close. This identifier is unique within the cluster at any point in time but can be reused when the session closes.

Privileges

None

Examples

A user session is open. RECORD 2 shows the user session running a COPY DIRECT statement:

=> SELECT * FROM sessions;
-[ RECORD 1 ]--------------+-----------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (SELECT * FROM sessions;)
statement_start            | 2011-01-03 15:36:13.896288
statement_id               | 10
last_statement_duration_us | 14978
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 2 ]--------------+-----------------------------------------------
node_name                  | v_vmartdb_node0002
user_name                  | dbadmin
client_hostname            | 127.0.0.1:57174
client_pid                 | 30117
login_timestamp            | 2011-01-03 15:33:00.842021-05
session_id                 | stress05-27944:0xc1a
client_label               |
transaction_start          | 2011-01-03 15:34:46.538102
transaction_id             | -1
transaction_description    | user dbadmin (COPY ClickStream_Fact FROM
                             '/data/clickstream/1g/ClickStream_Fact.tbl'
                             DELIMITER '|' NULL '\\n' DIRECT;)
statement_start            | 2011-01-03 15:34:46.538862
statement_id               |
last_statement_duration_us | 26250
current_statement          | COPY ClickStream_Fact FROM '/data/clickstream
                             /1g/ClickStream_Fact.tbl' DELIMITER '|' NULL
                             '\\n' DIRECT;
ssl_state                  | None
authentication_method      | Trust

Close user session stress05-27944:0xc1a:

=> \x
Expanded display is off.
=> SELECT CLOSE_SESSION('stress05-27944:0xc1a');
                           CLOSE_SESSION
--------------------------------------------------------------------
 Session close command sent. Check v_monitor.sessions for progress.
(1 row)

Query the sessions table again for current status, and you can see that the second session has been closed:

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+--------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (select * from SESSIONS;)
statement_start            | 2011-01-03 16:12:07.841298
statement_id               | 20
last_statement_duration_us | 2099
current_statement          | SELECT * FROM SESSIONS;
ssl_state                  | None
authentication_method      | Trust

23.4 - CLOSE_USER_SESSIONS

Stops the session for a user, rolls back any transaction currently running, and closes the connection. To determine the status of the sessions to close, query the SESSIONS table.

Note

Running this function on your own sessions leaves one session running.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLOSE_USER_SESSIONS ( 'user-name' )

Parameters

user-name: Specifies the user whose sessions are to be closed. If you specify your own user name, Vertica closes all sessions except the one in which you issue this function.

Privileges

DBADMIN

Examples

This example closes all active sessions for user u1:

=> SELECT close_user_sessions('u1');
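
A status check of the kind mentioned above could look like the following. The user u1 is taken from the example, and the column names match the SESSIONS output shown elsewhere in this section.

```sql
-- Any rows returned are sessions for u1 that are still open
SELECT node_name, session_id, current_statement
FROM v_monitor.sessions
WHERE user_name = 'u1';
```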

23.5 - GET_NUM_ACCEPTED_ROWS

Returns the number of rows loaded into the database for the last completed load for the current session. GET_NUM_ACCEPTED_ROWS is a meta-function. Do not use it as a value in an INSERT query.

The number of accepted rows is not available for a load that is currently in process. Check the LOAD_STREAMS system table for its status.
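
As a rough illustration, the following query checks LOAD_STREAMS for loads that are still running. The column names are assumed from the LOAD_STREAMS system table and should be verified for your version.

```sql
-- Row counts so far for loads that have not yet completed
SELECT stream_name, table_name, accepted_row_count, rejected_row_count
FROM v_monitor.load_streams
WHERE is_executing;
```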

This meta-function supports loads from STDIN, COPY LOCAL from a Vertica client, or a single file on the initiator. You cannot use GET_NUM_ACCEPTED_ROWS for multi-node loads.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_NUM_ACCEPTED_ROWS();

Privileges

None

Note

The data regarding accepted rows from the last load during the current session does not persist, and is lost when you initiate a new load.

Examples

This example shows the number of accepted rows from the vmart_load_data.sql meta-command.

=> \i vmart_load_data.sql;
=> SELECT GET_NUM_ACCEPTED_ROWS ();
GET_NUM_ACCEPTED_ROWS
-----------------------
300000
(1 row)

23.6 - GET_NUM_REJECTED_ROWS

Returns the number of rows that were rejected during the last completed load for the current session. GET_NUM_REJECTED_ROWS is a meta-function. Do not use it as a value in an INSERT query.

Rejected row information is unavailable for a load that is currently running. Check the LOAD_STREAMS system table for its status.

This meta-function supports loads from STDIN, COPY LOCAL from a Vertica client, or a single file on the initiator. You cannot use GET_NUM_REJECTED_ROWS for multi-node loads.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

GET_NUM_REJECTED_ROWS();

Privileges

None

Note

The data regarding rejected rows from the last load during the current session does not persist, and is dropped when you initiate a new load.

Examples

This example shows the number of rejected rows from the vmart_load_data.sql meta-command.

=> \i vmart_load_data.sql
=> SELECT GET_NUM_REJECTED_ROWS ();
GET_NUM_REJECTED_ROWS
-----------------------
0
(1 row)

23.7 - INTERRUPT_STATEMENT

Interrupts the specified statement in a user session, rolls back the current transaction, and writes a success or failure message to the log file.

Sessions can be interrupted during statement execution. Only statements run by user sessions can be interrupted.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

INTERRUPT_STATEMENT( 'session-id', statement-id )

Parameters

session-id: Identifies the session to interrupt. This identifier is unique within the cluster at any point in time.
statement-id: Identifies the statement to interrupt. If the statement-id is valid, the statement can be interrupted and INTERRUPT_STATEMENT returns a success message. Otherwise the system returns an error.
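
Both arguments can be read from the SESSIONS system table. A minimal sketch, assuming you want to list every session that currently reports a statement ID:

```sql
-- session_id and statement_id are the two arguments to INTERRUPT_STATEMENT
SELECT session_id, statement_id, current_statement
FROM v_monitor.sessions
WHERE statement_id IS NOT NULL;
```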

Privileges

Superuser

Messages

The following list describes messages you might encounter:

Message	Meaning
`Statement interrupt sent. Check SESSIONS for progress.`	This message indicates success.
`Session <id> could not be successfully interrupted: session not found.`	The session ID argument to the interrupt command does not match a running session.
`Session <id> could not be successfully interrupted: statement not found.`	The statement ID does not match (or no longer matches) the ID of a running statement (if any).
`No interruptible statement running`	The statement is DDL or otherwise non-interruptible.
`Internal (system) sessions cannot be interrupted.`	The session is internal, and only statements run by external sessions can be interrupted.

Examples

Two user sessions are open. RECORD 1 shows a user session running SELECT * FROM SESSIONS, and RECORD 2 shows a user session running COPY DIRECT:

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (select * from sessions;)
statement_start            | 2011-01-03 15:36:13.896288
statement_id               | 10
last_statement_duration_us | 14978
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 2 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0003
user_name                  | dbadmin
client_hostname            | 127.0.0.1:56367
client_pid                 | 1191
login_timestamp            | 2011-01-03 15:31:44.939302-05
session_id                 | stress06-25663:0xbec
client_label               |
transaction_start          | 2011-01-03 15:34:51.05939
transaction_id             | 54043195528458775
transaction_description    | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
                             DELIMITER '|' NULL '\\n' DIRECT;)
statement_start            | 2011-01-03 15:35:46.436748
statement_id               | 5
last_statement_duration_us | 1591403
current_statement          | COPY Mart_Fact FROM '/data/Mart_Fact.tbl' DELIMITER '|'
                             NULL '\\n' DIRECT;
ssl_state                  | None
authentication_method      | Trust

Interrupt the COPY DIRECT statement running in session stress06-25663:0xbec:

=> \x
Expanded display is off.
=> SELECT INTERRUPT_STATEMENT('stress06-25663:0xbec', 5);
                       interrupt_statement
------------------------------------------------------------------
 Statement interrupt sent. Check v_monitor.sessions for progress.
(1 row)

Verify that the interrupted statement is no longer active by looking at the current_statement column in the SESSIONS system table. This column becomes blank when the statement is interrupted:

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (select * from sessions;)
statement_start            | 2011-01-03 15:36:13.896288
statement_id               | 10
last_statement_duration_us | 14978
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 2 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0003
user_name                  | dbadmin
client_hostname            | 127.0.0.1:56367
client_pid                 | 1191
login_timestamp            | 2011-01-03 15:31:44.939302-05
session_id                 | stress06-25663:0xbec
client_label               |
transaction_start          | 2011-01-03 15:34:51.05939
transaction_id             | 54043195528458775
transaction_description    | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
                             DELIMITER '|' NULL '\\n' DIRECT;)
statement_start            | 2011-01-03 15:35:46.436748
statement_id               | 5
last_statement_duration_us | 1591403
current_statement          |
ssl_state                  | None
authentication_method      | Trust

23.8 - RELEASE_ALL_JVM_MEMORY

Forces all sessions to release the memory consumed by their Java Virtual Machines (JVM).

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RELEASE_ALL_JVM_MEMORY();

Privileges

Must be a superuser.

Examples

The following example demonstrates viewing the JVM memory use in all open sessions, then calling RELEASE_ALL_JVM_MEMORY() to release the memory:

=> SELECT user_name,external_memory_kb FROM V_MONITOR.SESSIONS;
 user_name | external_memory_kb
-----------+---------------
 dbadmin   |         79705
(1 row)

=> SELECT RELEASE_ALL_JVM_MEMORY();
                           RELEASE_ALL_JVM_MEMORY
-----------------------------------------------------------------------------
 Close all JVM sessions command sent. Check v_monitor.sessions for progress.
(1 row)

=> SELECT user_name,external_memory_kb FROM V_MONITOR.SESSIONS;
 user_name | external_memory_kb
-----------+---------------
 dbadmin   |             0
(1 row)

23.9 - RELEASE_JVM_MEMORY

Terminates a Java Virtual Machine (JVM), making available the memory the JVM was using.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RELEASE_JVM_MEMORY();

Privileges

None.

Examples

Terminate the JVM and release the memory it was using:

=> SELECT RELEASE_JVM_MEMORY();
           release_jvm_memory
-----------------------------------------
Java process killed and memory released
(1 row)

23.10 - RESERVE_SESSION_RESOURCE

Reserves memory resources from the general resource pool for the exclusive use of the Vertica backup and restore process. No other Vertica process can access reserved resources. If insufficient resources are available, Vertica queues the reservation request.

This meta-function makes a session-level reservation. When a session ends, Vertica automatically releases any resources reserved in that session. Because the meta-function operates at the session level, the resource name does not need to be unique across multiple sessions.

You can view reserved resources by querying the SESSIONS table.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RESERVE_SESSION_RESOURCE ( 'name', memory)

Parameters

name: The name of the resource to reserve.
memory: The amount of memory in kilobytes to allocate to the resource.

Privileges

None

Examples

Reserve 1024 kilobytes of memory for the backup and restore process:

=> SELECT reserve_session_resource('VBR_RESERVE',1024);
   -[ RECORD 1 ]------------+----------------
   reserve_session_resource | Grant succeed

23.11 - RESET_SESSION

Applies your default connection string configuration settings to your current session.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RESET_SESSION()

Examples

The following example resets the current client connection string to the default connection string settings:

=> SELECT RESET_SESSION();
    RESET_SESSION
----------------------
 Reset session: done.
(1 row)

24 - Statistics management functions

This section contains Vertica functions for collecting and managing table data statistics.

24.1 - ANALYZE_EXTERNAL_ROW_COUNT

Calculates the exact number of rows in an external table. ANALYZE_EXTERNAL_ROW_COUNT runs in the background.

Note

You cannot calculate row counts on external tables with DO_TM_TASK.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ANALYZE_EXTERNAL_ROW_COUNT ('[[[database.]schema.]table-name ]')

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: Specifies the name of the external table for which to calculate the exact row count. If you supply an empty string, Vertica calculates the exact number of rows for all external tables.

Privileges

Any INSERT/UPDATE/DELETE privilege on the external table

Examples

Calculate the exact row count for all external tables:

=> SELECT ANALYZE_EXTERNAL_ROW_COUNT('');

Calculate the exact row count for table loader_rejects:

=> SELECT ANALYZE_EXTERNAL_ROW_COUNT('loader_rejects');

24.2 - ANALYZE_STATISTICS

Collects and aggregates data samples and storage information from all nodes that store projections associated with the specified table. The function skips columns of complex data types. By default, Vertica analyzes multiple columns in a single-query execution plan, depending on resource limits. Such multi-column analysis facilitates the following objectives:

Reduce plan execution latency.
Speed up analysis of relatively small tables with many columns.

Vertica writes statistics to the database catalog. The query optimizer uses this collected data to create query plans. Without this data, the query optimizer assumes uniform distribution of data values and equal storage usage for all projections.

You can cancel statistics collection with CTRL+C or by calling INTERRUPT_STATEMENT.

ANALYZE_STATISTICS is an alias of the function ANALYZE_HISTOGRAM, which is no longer documented.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ANALYZE_STATISTICS ('[[[database.]schema.]table]' [, 'column-list' [, percent ]] )

Returns

0: Success

If an error occurs, refer to vertica.log for details.

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Table on which to collect data. If set to an empty string, Vertica collects statistics for all database tables and their projections.
column-list: Comma-delimited list of columns in table, typically predicate columns. Vertica narrows the scope of the data collection to the specified columns. Columns of complex types are not supported.
If you alter a table to add a column and populate its contents with either default or other values, call ANALYZE_STATISTICS on this column to get the most current statistics.
percent: A float value between 0 and 100 that specifies what percentage of data to read from disk (not the amount of data to analyze). If you omit this argument, Vertica sets the percentage to 10.
Analyzing more than 10 percent of the data on disk takes proportionally longer to process, but produces a higher level of sampling accuracy.

Privileges

Non-superuser:

Schema: USAGE
Table: One of INSERT, DELETE, or UPDATE

Restrictions

Vertica supports ANALYZE_STATISTICS on local and global temporary tables. In both cases, you can obtain statistics only on tables that are created with the option ON COMMIT PRESERVE ROWS. Otherwise, Vertica deletes table content on committing the current transaction, so no table data is available for analysis.
Vertica collects no statistics from the following projections:
- Live aggregate and Top-K projections
- Projections that are defined to include an SQL function within an expression
Vertica collects no statistics on columns of ARRAY, SET, or ROW types.

Examples

See Collecting table statistics.
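
As a brief illustration of the parameters described above, the calls below collect statistics at increasing levels of specificity. The table store.store_sales_fact appears elsewhere in this section; the column names date_key and store_key are assumptions about that table.

```sql
-- All columns, default 10 percent read from disk
SELECT ANALYZE_STATISTICS('store.store_sales_fact');

-- Only two (assumed) predicate columns
SELECT ANALYZE_STATISTICS('store.store_sales_fact', 'date_key, store_key');

-- Same columns, reading 50 percent of the data from disk
SELECT ANALYZE_STATISTICS('store.store_sales_fact', 'date_key, store_key', 50);
```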

24.3 - ANALYZE_STATISTICS_PARTITION

Collects and aggregates data samples and storage information for a range of partitions in the specified table. Vertica writes the collected statistics to the database catalog.

You can cancel statistics collection with CTRL+C or meta-function INTERRUPT_STATEMENT.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ANALYZE_STATISTICS_PARTITION ('[[database.]schema.]table', 'min-range-value','max-range-value' [, 'column-list' [, percent ]] )

Returns

0: Success

If an error occurs, refer to vertica.log for details.

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Table on which to collect data.
min-range-value max-range-value: Minimum and maximum value of partition keys to analyze, where min-range-value must be ≤ max-range-value. To analyze one partition, min-range-value and max-range-value must be equal.
column-list: Comma-delimited list of columns in table, typically predicate columns. Vertica narrows the scope of the data collection to the specified columns.
percent: Float value between 0 and 100 that specifies what percentage of data to read from disk (not the amount of data to analyze). If you omit this argument, Vertica sets the percentage to 10.
Analyzing more than 10 percent of the data on disk takes proportionally longer to process, but produces a higher level of sampling accuracy.

Privileges

Non-superuser:

Schema: USAGE
Table: One of INSERT, DELETE, or UPDATE

Requirements and restrictions

The following requirements and restrictions apply to ANALYZE_STATISTICS_PARTITION:

The table must be partitioned and cannot contain unpartitioned data.
The table partition expression must specify a single column. The following expressions are supported:
- Expressions that specify only the column—that is, partition on all column values. For example:
```
PARTITION BY ship_date GROUP BY CALENDAR_HIERARCHY_DAY(ship_date, 2, 2)
```
- If the column is a DATE or TIMESTAMP/TIMESTAMPTZ, the partition expression can specify a supported date/time function that returns that column or any portion of it, such as month or year. For example, the following partition expression specifies to partition on the year portion of column order_date:
```
PARTITION BY YEAR(order_date)
```
- Expressions that perform addition or subtraction on the column. For example:
```
PARTITION BY YEAR(order_date) -1
```
The table partition expression cannot coerce the specified column to another data type.
Vertica collects no statistics from the following projections:
- Live aggregate and Top-K projections
- Projections that are defined to include an SQL function within an expression

Examples

See Collecting partition statistics.
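
A minimal sketch of the call forms, assuming a hypothetical table public.store_orders that is partitioned on a DATE column named order_date:

```sql
-- Analyze all partitions with keys from 2020-01-01 through 2020-01-31
SELECT ANALYZE_STATISTICS_PARTITION('public.store_orders', '2020-01-01', '2020-01-31');

-- Analyze a single partition: both range values are equal
SELECT ANALYZE_STATISTICS_PARTITION('public.store_orders', '2020-01-15', '2020-01-15');

-- Restrict collection to one (hypothetical) column and read 25 percent of the data
SELECT ANALYZE_STATISTICS_PARTITION('public.store_orders', '2020-01-01', '2020-01-31', 'order_date', 25);
```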

24.4 - DROP_EXTERNAL_ROW_COUNT

Removes external table row count statistics compiled by ANALYZE_EXTERNAL_ROW_COUNT. DROP_EXTERNAL_ROW_COUNT runs in the background.

Caution

Statistics can be time consuming to regenerate.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DROP_EXTERNAL_ROW_COUNT ('[[[database.]schema.]table-name ]');

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: The external table for which to remove the exact row count. If you specify an empty string, Vertica drops the exact row count statistic for all external tables.

Privileges

INSERT/UPDATE/DELETE privilege on table
USAGE privilege on schema that contains the table

Examples

Drop row count statistics for external table loader_rejects:

=> SELECT DROP_EXTERNAL_ROW_COUNT('loader_rejects');

24.5 - DROP_STATISTICS

Removes statistical data on database projections previously generated by ANALYZE_STATISTICS. When you drop this data, the Vertica optimizer creates query plans using default statistics.

Caution

Regenerating statistics can incur significant overhead.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DROP_STATISTICS ('[[[database.]schema.]table]' [, 'category' [, '[column-list]']] )

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

table

Table on which to drop statistics. If set to an empty string, Vertica drops statistics for all database tables and their projections.

category

Category of statistics to drop, one of the following:

ALL (default): Drop all statistics, including histograms and row counts.
HISTOGRAMS: Drop only histograms. Row count statistics remain.

column-list

Comma-delimited list of columns in table, typically predicate columns. Vertica narrows the scope of dropped statistics to the specified columns. If you omit this parameter or supply an empty string, Vertica drops statistics on all columns.

Privileges

Non-superuser:

Schema: USAGE
Table: One of INSERT, DELETE, or UPDATE

Examples

Drop all base statistics for the table store.store_sales_fact:

=> SELECT DROP_STATISTICS('store.store_sales_fact');
 DROP_STATISTICS
-----------------
               0
(1 row)

Drop statistics for all table projections:

=> SELECT DROP_STATISTICS ('');
 DROP_STATISTICS
-----------------
               0
(1 row)
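
The category and column-list parameters can narrow what is dropped. The following sketch drops only histograms on two columns of store.store_sales_fact. The column names customer_key and product_key are assumed for illustration and might not match your schema:

```sql
-- Drop only histograms (row counts remain) for two assumed predicate columns.
SELECT DROP_STATISTICS('store.store_sales_fact', 'HISTOGRAMS', 'customer_key,product_key');
```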

24.6 - DROP_STATISTICS_PARTITION

Removes statistical data on database projections previously generated by ANALYZE_STATISTICS_PARTITION. When you drop this data, the Vertica optimizer creates query plans using table-level statistics, if available, or default statistics.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DROP_STATISTICS_PARTITION ('[[database.]schema.]table', '[min-range-value]', '[max-range-value]' [, category [, '[column-list]'] ] )

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

table

Table on which to drop statistics.

min-range-value, max-range-value

The minimum and maximum value of partition keys on which to drop statistics, where min-range-value must be ≤ max-range-value. If you supply empty strings for both parameters, Vertica drops all partition-level statistics for this table or the specified columns.

Important

The range of keys to drop must be equal to, or a superset of, the full range of partitions previously analyzed by ANALYZE_STATISTICS_PARTITION. If the range omits any analyzed partition, DROP_STATISTICS_PARTITION drops no statistics.

category

The category of statistics to drop, one of the following:

BASE (default): Drop histograms and row counts (min/max column values, histogram).
HISTOGRAMS: Drop only histograms. Row count statistics remain.
ALL: Drop all statistics.

column-list

A comma-delimited list of columns in table, typically predicate columns. Vertica narrows the scope of dropped statistics to the specified columns. If you omit this parameter or supply an empty string, Vertica drops statistics on all columns.

Privileges

Non-superuser:

Schema: USAGE
Table: One of INSERT, DELETE, or UPDATE
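
Examples

The following sketch assumes a hypothetical table public.store_orders that is partitioned by order date and was previously analyzed with ANALYZE_STATISTICS_PARTITION over the same key range. The table name and key values are illustrative only:

```sql
-- Drop partition-level statistics for the assumed key range.
-- The range must cover every analyzed partition, or no statistics are dropped.
SELECT DROP_STATISTICS_PARTITION('public.store_orders', '2024-01-01', '2024-03-31');

-- Empty strings for both range values drop all partition-level
-- statistics for the table.
SELECT DROP_STATISTICS_PARTITION('public.store_orders', '', '');
```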

24.7 - EXPORT_STATISTICS

Generates statistics in XML format from data previously collected by ANALYZE_STATISTICS. Before you export statistics, collect the latest data by calling ANALYZE_STATISTICS.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

EXPORT_STATISTICS ('[ filename ]' [,'table-spec' [,'column[,...]']])

Arguments

filename

Specifies where to write the generated XML. If filename already exists, EXPORT_STATISTICS overwrites it. If you supply an empty string, EXPORT_STATISTICS writes the XML to standard output.

table-spec

Specifies the table on which to export projection statistics:

  
[[database.]schema.]table

The default schema is public. If you specify a database, it must be the current database.

If table-spec is omitted or set to an empty string, Vertica exports all statistics for the database.

column

The name of a column in table-spec, typically a predicate column. You can specify multiple comma-delimited columns. Vertica narrows the scope of exported statistics to the specified columns.

Privileges

Superuser

Restrictions

EXPORT_STATISTICS does not export statistics for LONG data type columns.

Examples

The following statement exports statistics on the VMart example database to a file:

=> SELECT EXPORT_STATISTICS('/opt/vertica/examples/VMart_Schema/vmart_stats.xml');
        EXPORT_STATISTICS
-----------------------------------
Statistics exported successfully
(1 row)

The next statement exports statistics on a single column (price) from a table named food:

=> SELECT EXPORT_STATISTICS('/opt/vertica/examples/VMart_Schema/price.xml', 'food', 'price');
        EXPORT_STATISTICS
-----------------------------------
Statistics exported successfully
(1 row)

24.8 - EXPORT_STATISTICS_PARTITION

Generates partition-level statistics in XML format from data previously collected by ANALYZE_STATISTICS_PARTITION.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

EXPORT_STATISTICS_PARTITION ('[ filename ]', 'table-spec', 'min-range-value','max-range-value' [, 'column[,...]' ] )

Arguments

filename

Specifies where to write the generated XML. If filename already exists, EXPORT_STATISTICS_PARTITION overwrites it. If you supply an empty string, the function writes to standard output.

table-spec

Specifies the table on which to export partition statistics:

  
[[database.]schema.]table

The default schema is public. If you specify a database, it must be the current database.

min-range-value, max-range-value

The minimum and maximum value of partition keys on which to export statistics, where min-range-value must be ≤ max-range-value.

Important

The range of keys to export must be equal to, or a superset of, the full range of partitions previously analyzed by ANALYZE_STATISTICS_PARTITION. If the range omits any analyzed partition, EXPORT_STATISTICS_PARTITION exports no statistics.

column

The name of a column in table, typically a predicate column. You can specify multiple comma-delimited columns. Vertica narrows the scope of exported statistics to the specified columns.

Privileges

Superuser

Restrictions

EXPORT_STATISTICS_PARTITION does not export statistics for LONG data type columns.
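
Examples

The following sketch assumes a hypothetical table public.store_orders that is partitioned by order date and was previously analyzed with ANALYZE_STATISTICS_PARTITION over the same key range. The table name, key values, and output path are illustrative only:

```sql
-- Export partition-level statistics for the assumed key range to a file.
SELECT EXPORT_STATISTICS_PARTITION('/tmp/store_orders_partition_stats.xml',
                                   'public.store_orders',
                                   '2024-01-01', '2024-03-31');

-- An empty filename writes the XML to standard output instead.
SELECT EXPORT_STATISTICS_PARTITION('', 'public.store_orders',
                                   '2024-01-01', '2024-03-31');
```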

24.9 - IMPORT_STATISTICS

Imports statistics from the XML file that was generated by EXPORT_STATISTICS. Imported statistics override existing statistics for the projections that are referenced in the XML file.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

IMPORT_STATISTICS ( 'filename' )

Parameters

filename: The path and name of an XML input file that was generated by EXPORT_STATISTICS.

Privileges

Superuser

Restrictions

IMPORT_STATISTICS imports only valid statistics. If the source XML file has invalid statistics for a specific column, those statistics are not imported and Vertica throws a warning. If the statistics file has an invalid structure, the import operation fails. To check a statistics file for validity, run VALIDATE_STATISTICS.
IMPORT_STATISTICS returns warnings for LONG data type columns, as the source XML file generated by EXPORT_STATISTICS contains no statistics for columns of that type.

Examples

Import the statistics for the VMart database from an XML file previously created by EXPORT_STATISTICS:

=> SELECT IMPORT_STATISTICS('/opt/vertica/examples/VMart_Schema/vmart_stats.xml');
                     IMPORT_STATISTICS
----------------------------------------------------------------------------
Importing statistics for projection date_dimension_super column date_key failure (stats did not contain row counts)
Importing statistics for projection date_dimension_super column date failure (stats did not contain row counts)
Importing statistics for projection date_dimension_super column full_date_description failure (stats did not contain row counts)
...
(1 row)
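
Because an invalid file structure causes the import to fail, it can help to check the file first. The following sketch validates the exported file and then imports it. It reuses the file path from the example above and assumes the file exists:

```sql
-- Check the exported XML for structural problems and invalid values.
SELECT VALIDATE_STATISTICS('/opt/vertica/examples/VMart_Schema/vmart_stats.xml');

-- Import the file. Columns with invalid statistics are skipped with a warning.
SELECT IMPORT_STATISTICS('/opt/vertica/examples/VMart_Schema/vmart_stats.xml');
```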

24.10 - VALIDATE_STATISTICS

Validates statistics in the XML file generated by EXPORT_STATISTICS.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

VALIDATE_STATISTICS ( 'XML-file' )

Parameters

XML-file: The path and name of the XML file that contains the statistics to validate.

Privileges

Superuser

Reporting valid statistics

The following example shows the results when the statistics are valid:

=> SELECT EXPORT_STATISTICS('cust_dim_stats.xml','customer_dimension');
    EXPORT_STATISTICS
-----------------------------------
 Statistics exported successfully
(1 row)

=> SELECT VALIDATE_STATISTICS('cust_dim_stats.xml');
 VALIDATE_STATISTICS
---------------------
(1 row)

Identifying invalid statistics

If VALIDATE_STATISTICS is unable to read a document's XML, it throws this error:

=> SELECT VALIDATE_STATISTICS('/home/dbadmin/stats.xml');
                       VALIDATE_STATISTICS
----------------------------------------------------------------------------
Error validating statistics file: At line 1:1. Invalid document structure
(1 row)

If some table statistics are invalid, VALIDATE_STATISTICS returns a report that identifies them. In the following example, the function reports that attributes distinct, buckets, rows, count, and distinctCount cannot be negative numbers.

=> SELECT VALIDATE_STATISTICS('/stats.xml');
WARNING 0:  Invalid value '-1' for attribute 'distinct' under column 'public.t.x'.
   Please use a positive value.
WARNING 0:  Invalid value '-1' for attribute 'buckets' under column 'public.t.x'.
   Please use a positive value.
WARNING 0:  Invalid value '-1' for attribute 'rows' under column 'public.t.x'.
   Please use a positive value.
WARNING 0:  Invalid value '-1' for attribute 'count' under bound '1', column 'public.t.x'.
   Please use a positive value.
WARNING 0:  Invalid value '-1' for attribute 'distinctCount' under bound '1', column 'public.t.x'.
   Please use a positive value.
 VALIDATE_STATISTICS
---------------------
 (1 row)

In this case, run ANALYZE_STATISTICS on the table again to create valid statistics.

25 - Storage management functions

This section contains storage management functions specific to Vertica.

25.1 - ALTER_LOCATION_LABEL

Adds a label to a storage location, or changes or removes an existing label. You can change a location label if it is not specified by any storage policy.

Caution

If you label an existing storage location that already contains data, and then include the labeled location in one or more storage policies, existing data could be moved. If the Tuple Mover determines data stored on a labeled location does not comply with a storage policy, it moves the data elsewhere.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ALTER_LOCATION_LABEL ( 'path' , '[node]' , '[location-label]' )

Parameters

path: The storage location path.
node: The node where the label change is applied. If you supply an empty string, Vertica applies the change across all cluster nodes.
location-label: The label to assign to the specified storage location. If you supply an empty string, Vertica removes that storage location's label.

Privileges

Superuser

Restrictions

You can remove a location label only if both of these conditions are true:

The label is not specified in the storage policy of a database object.
The labeled location is not the last available storage for the objects associated with it.

Examples

The following ALTER_LOCATION_LABEL statement applies the label SSD to the storage location /home/dbadmin/SSD/tables on all cluster nodes:

=> SELECT ALTER_LOCATION_LABEL('/home/dbadmin/SSD/tables','', 'SSD');
          ALTER_LOCATION_LABEL
---------------------------------------
 /home/dbadmin/SSD/tables label changed.
(1 row)
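
The node and location-label parameters also accept other combinations. The following sketch labels the location on a single node and then removes the label everywhere. The node name v_vmart_node0001 is assumed for illustration, and the removal succeeds only if the restrictions above are met:

```sql
-- Apply the label SSD on one node only (assumed node name).
SELECT ALTER_LOCATION_LABEL('/home/dbadmin/SSD/tables', 'v_vmart_node0001', 'SSD');

-- An empty label removes the location's label; an empty node applies
-- the change across all cluster nodes.
SELECT ALTER_LOCATION_LABEL('/home/dbadmin/SSD/tables', '', '');
```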

25.2 - ALTER_LOCATION_USE

Alters the type of data that a storage location holds.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ALTER_LOCATION_USE ( 'path' , '[node]' , 'usage' )

Arguments

path

Where the storage location is mounted.

node

The Vertica node on which to alter the storage location. To alter the location on all cluster nodes in a single transaction, use an empty string (''). If the usage is SHARED TEMP or SHARED USER, you must alter it on all nodes.

usage

One of the following:

DATA: The storage location stores only data files.
TEMP: The location stores only temporary files that are created during loads or queries.
DATA,TEMP: The location can store both types of files.

Privileges

Superuser

Restrictions

You cannot change a storage location from a USER usage type if you created the location that way, or to a USER type if you did not. You can change a USER storage location to specify DATA (storing TEMP files is not supported). However, doing so does not affect the primary objective of a USER storage location, to be accessible by non-dbadmin users with assigned privileges.

You cannot change a storage location from SHARED TEMP or SHARED USER to SHARED DATA or the reverse.

Monitoring storage locations

For information about the disk storage used on each node, query the DISK_STORAGE system table.

Examples

The following example alters a storage location across all cluster nodes to store only data:

=> SELECT ALTER_LOCATION_USE ('/thirdSL/' , '' , 'DATA');
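
To change a location on one node rather than the whole cluster, supply the node name. The following sketch lets the same location hold both data and temporary files on a single node. The node name v_vmartdb_node0004 is assumed for illustration:

```sql
-- Allow both data and temp files at /thirdSL/ on one assumed node.
SELECT ALTER_LOCATION_USE('/thirdSL/', 'v_vmartdb_node0004', 'DATA,TEMP');
```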

25.3 - CLEAR_CACHES

Clears the Vertica internal cache files.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_CACHES ( )

Privileges

Superuser

Notes

If you want to run benchmark tests for your queries, in addition to clearing the internal Vertica cache files, clear the Linux file system cache. The kernel uses unallocated memory as a cache to hold clean disk blocks. If you are running version 2.6.16 or later of Linux and you have root access, you can clear the kernel file system cache as follows:

Make sure that all data in the cache is written to disk:
```
# sync
```
Writing to the drop_caches file causes the kernel to drop clean caches, dentries, and inodes from memory, causing that memory to become free, as follows:
- To clear the page cache:
```
# echo 1 > /proc/sys/vm/drop_caches
```
- To clear the dentries and inodes:
```
# echo 2 > /proc/sys/vm/drop_caches
```
- To clear the page cache, dentries, and inodes:
```
# echo 3 > /proc/sys/vm/drop_caches
```

Examples

The following example clears the Vertica internal cache files:

=> SELECT CLEAR_CACHES();
 CLEAR_CACHES
--------------
 Cleared
(1 row)

25.4 - CLEAR_OBJECT_STORAGE_POLICY

Removes a user-defined storage policy from the specified database, schema or table. Storage containers at the previous policy's labeled location are moved to the default location. By default, this move occurs after all pending mergeout tasks return.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CLEAR_OBJECT_STORAGE_POLICY ( 'object-name' [,'key-min', 'key-max'] [, 'enforce-storage-move' ] )

Parameters

object-name

The object to clear, one of the following:

database: Clears database of its storage policy.
[database.]schema: Clears schema of its storage policy.
[[database.]schema.]table: Clears table of its storage policy. If table is in any schema other than public, you must supply the schema name.

In all cases, database must be the name of the current database.

key-min, key-max

Valid only if object-name is a table. Specifies the range of table partition key values stored at the labeled location.

enforce-storage-move

Specifies when the Tuple Mover moves all existing storage containers for the specified object to its default storage location:

false (default): Move storage containers only after all pending mergeout tasks return.
true: Immediately move all storage containers to the new location.

Tip

You can also enforce all storage policies immediately by calling Vertica meta-function ENFORCE_OBJECT_STORAGE_POLICY.

Privileges

Superuser

Examples

The following statement clears the storage policy for table store.store_orders_fact. The true argument specifies that the storage move occurs immediately:

=> SELECT CLEAR_OBJECT_STORAGE_POLICY ('store.store_orders_fact', 'true');
                         CLEAR_OBJECT_STORAGE_POLICY
-----------------------------------------------------------------------------
 Object storage policy cleared.
Task: moving storages
(Table: store.store_orders_fact) (Projection: store.store_orders_fact_b0)
(Table: store.store_orders_fact) (Projection: store.store_orders_fact_b1)

(1 row)

25.5 - DROP_LOCATION

Permanently removes a retired storage location. This operation cannot be undone. You must first retire a storage location with RETIRE_LOCATION before dropping it; you cannot drop a storage location that is in use.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DROP_LOCATION ( 'path', 'node' )

Arguments

path: Where the storage location to drop is mounted.
node: The Vertica node on which to drop the location. To perform this operation on all nodes, use an empty string (''). If the storage location is SHARED, you must perform this operation on all nodes.

Privileges

Superuser

Storage locations with temp and data files

If you use a storage location to store data and then alter it to store only temp files, the location can still contain data files. Vertica does not let you drop a storage location containing data files. You can use the MOVE_RETIRED_LOCATION_DATA function to manually merge out the data files from the storage location, or you can drop partitions. Deleting data files does not work.

Examples

The following example shows how to drop a previously retired storage location on v_vmart_node0003:

=> SELECT DROP_LOCATION('/data', 'v_vmart_node0003');
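
Because a location must be retired before it is dropped, the two calls are typically used together. The following sketch retires the location with the enforce argument of RETIRE_LOCATION set to true, so that its data is moved out, and then drops it. It reuses the path and node name from the example above:

```sql
-- Retire the location and move its data elsewhere right away.
SELECT RETIRE_LOCATION('/data', 'v_vmart_node0003', true);

-- Permanently remove the retired location. This cannot be undone.
SELECT DROP_LOCATION('/data', 'v_vmart_node0003');
```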

25.6 - ENFORCE_OBJECT_STORAGE_POLICY

Enterprise Mode only

Applies storage policies of the specified object immediately. By default, the Tuple Mover enforces object storage policies after all pending mergeout operations are complete. Calling this function is equivalent to setting the enforce argument when using RETIRE_LOCATION. You typically use this function as the last step before dropping a storage location.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ENFORCE_OBJECT_STORAGE_POLICY ( 'object-name' [,'key-min', 'key-max'] )

Arguments

object-name

The database object whose storage policies are to be applied, one of the following:

database: Applies database storage policies.
[database.]schema: Applies schema storage policies.
[[database.]schema.]table: Applies table storage policies. If table is in any schema other than public, you must supply the schema name.

In all cases, database must be the name of the current database.

key-min, key-max

Valid only if object-name is a table. Specifies the range of table partition key values on which to perform the move.

Privileges

One of the following:

Superuser
Object owner and access to its storage location.

Examples

Apply storage policy updates to the test table:

=> SELECT ENFORCE_OBJECT_STORAGE_POLICY ('test');

25.7 - MEASURE_LOCATION_PERFORMANCE

Measures a storage location's disk performance.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

MEASURE_LOCATION_PERFORMANCE ( 'path', 'node' )

Parameters

path: Specifies where the storage location to measure is mounted.
node: The Vertica node where the location to be measured is available. To obtain a list of all node names on the cluster, query system table DISK_STORAGE.

Privileges

Superuser

Notes

If you intend to create a tiered disk architecture in which projections, columns, and partitions are stored on different disks based on predicted or measured access patterns, you need to measure storage location performance for each location in which data is stored. You do not need to measure storage location performance for temp data storage locations because temporary files are stored based on available space.
The method of measuring storage location performance applies only to configured clusters. If you want to measure a disk before configuring a cluster, see Measuring storage performance.
Storage location performance equates to the amount of time it takes to read and write 1MB of data from the disk. This time equates to:
```
IO-time = (time-to-read-write-1MB + time-to-seek) = (1/throughput + 1/latency)
```
Throughput is the average throughput of sequential reads/writes (units in MB per second).

Latency is measured for random reads only, in seeks per second.

Note
The IO time of a faster storage location is less than that of a slower storage location.
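
As a worked instance of the formula, the following query plugs in a throughput of 122 MB per second and a latency of 140 seeks per second, the figures reported in the example for this function. The column aliases are illustrative only:

```sql
-- IO-time = 1/throughput + 1/latency
SELECT 1.0 / 122             AS seconds_per_mb,    -- 1/throughput
       1.0 / 140             AS seconds_per_seek,  -- 1/latency
       1.0 / 122 + 1.0 / 140 AS io_time;           -- roughly 0.0153 seconds
```

A location with higher throughput or more seeks per second yields a smaller IO time.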

Examples

The following example measures the performance of a storage location on v_vmartdb_node0004:

=> SELECT MEASURE_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/' , 'v_vmartdb_node0004');
WARNING:  measure_location_performance can take a long time. Please check logs for progress
           measure_location_performance
--------------------------------------------------
 Throughput : 122 MB/sec. Latency : 140 seeks/sec

25.8 - MOVE_RETIRED_LOCATION_DATA

Moves all data from the specified retired storage location or from all retired storage locations in the database. MOVE_RETIRED_LOCATION_DATA migrates the data to non-retired storage locations according to the storage policies of the objects whose data is stored in the location. This function returns only after it completes migration of all affected storage location data.

Note

The Tuple Mover migrates data of retired storage locations when it consolidates data into larger ROS containers.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

MOVE_RETIRED_LOCATION_DATA( ['location-path'] [, 'node'] )

Arguments

location-path: The path of the storage location as specified in the LOCATION_PATH column of system table STORAGE_LOCATIONS. This storage location must be marked as retired.
If you omit this argument, MOVE_RETIRED_LOCATION_DATA moves data from all retired storage locations.
node: The node on which to move data of the retired storage location. If location-path is undefined on node, this function returns an error.
If you omit this argument, MOVE_RETIRED_LOCATION_DATA moves data from location-path on all nodes.

Privileges

Superuser

Examples

Query system table STORAGE_LOCATIONS to show which storage locations are retired:

=> SELECT node_name, location_path, location_label, is_retired FROM STORAGE_LOCATIONS
   WHERE is_retired = 't';
    node_name     |    location_path     | location_label | is_retired
------------------+----------------------+----------------+------------
 v_vmart_node0001 | /home/dbadmin/SSDLoc | ssd            | t
 v_vmart_node0002 | /home/dbadmin/SSDLoc | ssd            | t
 v_vmart_node0003 | /home/dbadmin/SSDLoc | ssd            | t
(3 rows)

Query system table STORAGE_CONTAINERS for the location of the messages table, which is currently stored in retired storage location ssd:

=> SELECT node_name, total_row_count, location_label FROM STORAGE_CONTAINERS
   WHERE projection_name ILIKE 'messages%';
    node_name     | total_row_count | location_label
------------------+-----------------+----------------
 v_vmart_node0001 |          333514 | ssd
 v_vmart_node0001 |          333255 | ssd
 v_vmart_node0002 |          333255 | ssd
 v_vmart_node0002 |          333231 | ssd
 v_vmart_node0003 |          333231 | ssd
 v_vmart_node0003 |          333514 | ssd
(6 rows)

Call MOVE_RETIRED_LOCATION_DATA to move the data off the ssd storage location.

=> SELECT MOVE_RETIRED_LOCATION_DATA('/home/dbadmin/SSDLoc');
          MOVE_RETIRED_LOCATION_DATA
-----------------------------------------------
 Move data off retired storage locations done

(1 row)

Repeat the previous query to verify the storage location of the messages table:


=> SELECT node_name, total_row_count, location_label FROM STORAGE_CONTAINERS
   WHERE projection_name ILIKE 'messages%';
    node_name     | total_row_count | location_label
------------------+-----------------+----------------
 v_vmart_node0001 |          333255 | base
 v_vmart_node0001 |          333514 | base
 v_vmart_node0003 |          333514 | base
 v_vmart_node0003 |          333231 | base
 v_vmart_node0002 |          333231 | base
 v_vmart_node0002 |          333255 | base
(6 rows)

25.9 - RESTORE_LOCATION

Restores a storage location that was previously retired with RETIRE_LOCATION.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RESTORE_LOCATION ( 'path', 'node' )

Arguments

path: Where to mount the retired storage location.
node: The Vertica node on which to restore the location. To perform this operation on all nodes, use an empty string (''). If the storage location is SHARED, you must perform this operation on all nodes.
The operation fails if you dropped any locations.

Privileges

Superuser

Effects of restoring a previously retired location

After restoring a storage location, Vertica re-ranks all of the cluster storage locations. It uses the newly restored location to process queries as determined by its rank.

Monitoring storage locations

For information about the disk storage used on each node, query the DISK_STORAGE system table.

Examples

Restore a retired storage location on v_vmartdb_node0004:

=> SELECT RESTORE_LOCATION ('/thirdSL/' , 'v_vmartdb_node0004');

25.10 - RETIRE_LOCATION

Deactivates the specified storage location. To obtain a list of all existing storage locations, query the STORAGE_LOCATIONS system table.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

RETIRE_LOCATION ( 'path', 'node' [, enforce ] )

Arguments

path: Where the storage location to retire is mounted.
node: The Vertica node on which to retire the location. To perform this operation on all nodes, use an empty string (''). If the storage location is SHARED, you must perform this operation on all nodes.
enforce: If true, the location label is set to an empty string and the data is moved elsewhere. The location can then be dropped without errors or warnings. Use this argument to expedite dropping a location.

Privileges

Superuser

Effects of retiring a storage location

RETIRE_LOCATION checks that the location is not the only storage for data and temp files. At least one location must exist on each node to store data and temp files. However, you can store both sorts of files in either the same location or separate locations.

If a location is the last available storage for its associated objects, you can retire it only if you set enforce to true.

When you retire a storage location:

No new data is stored at the retired location, unless you first restore it using RESTORE_LOCATION.
By default, if the storage location being retired contains stored data, the data is not moved. Thus, you cannot drop the storage location. Instead, Vertica removes the stored data through one or more mergeouts. To drop the location immediately after retiring it, set enforce to true.
If the storage location being retired is used only for temp files or you use enforce, you can drop the location. See Dropping storage locations and DROP_LOCATION.

Monitoring storage locations

For information about the disk storage used on each node, query the DISK_STORAGE system table.

Examples

The following examples show two approaches to retiring a storage location.

You can retire a storage location and its data will be moved out automatically at a future time:

=> SELECT RETIRE_LOCATION ('/data' , 'v_vmartdb_node0004');

You can specify that data in the storage location be moved immediately, so that you can then drop the location without waiting:

=> SELECT RETIRE_LOCATION ('/data' , 'v_vmartdb_node0004', true);

25.11 - SET_LOCATION_PERFORMANCE

Sets disk performance for a storage location.

Note

Before calling this function, call MEASURE_LOCATION_PERFORMANCE to obtain the location's throughput and average latency.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_LOCATION_PERFORMANCE ( 'path', 'node' , 'throughput', 'average-latency')

Parameters

path: Specifies where the storage location to set is mounted.
node: Specifies the Vertica node where the location to set is available.
throughput: Specifies the throughput for the location, set to a value ≥1.
average-latency: Specifies the average latency for the location, set to a value ≥1.

Privileges

Superuser

Examples

The following example sets the performance of a storage location on node2 to a throughput of 122 megabytes per second and a latency of 140 seeks per second.

=> SELECT SET_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/','node2','122','140');
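
The measure-then-set sequence can be sketched as follows. The node name v_vmartdb_node0004 and the values 122 and 140 are taken from the MEASURE_LOCATION_PERFORMANCE example and are illustrative only. Substitute the figures that your own measurement returns:

```sql
-- Step 1: measure the location. This can take a long time.
SELECT MEASURE_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/', 'v_vmartdb_node0004');

-- Step 2: record the reported throughput (MB/sec) and latency (seeks/sec).
SELECT SET_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/', 'v_vmartdb_node0004', '122', '140');
```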

25.12 - SET_OBJECT_STORAGE_POLICY

Creates or changes the storage policy of a database object by assigning it a labeled storage location. The Tuple Mover uses this location to store new and existing data for this object. If the object already has an active storage policy, calling SET_OBJECT_STORAGE_POLICY sets this object's default storage to the new labeled location. Existing data for the object is moved to the new location.

Note

You cannot create a storage policy on a USER type storage location.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

SET_OBJECT_STORAGE_POLICY (
  '[[database.]schema.]object-name', 'location-label'
   [,'key-min', 'key-max'] [, 'enforce-storage-move' ] )

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

object-name

Identifies the database object assigned to a labeled storage location. The object-name can resolve to a database, schema, or table.

location-label

The label of object-name's storage location.

key-min, key-max

Valid only if object-name is a table. Specifies the range of table partition key values to store at the labeled location.

enforce-storage-move

Specifies when the Tuple Mover moves all existing storage containers for object-name to the labeled storage location:

false (default): Move storage containers only after all pending mergeout tasks return.
true: Immediately move all storage containers to the new location.

Tip

You can also enforce all storage policies immediately by calling Vertica meta-function ENFORCE_OBJECT_STORAGE_POLICY.

Privileges

One of the following:

Superuser
Object owner and access to its storage location.

Examples

See Clearing storage policies
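
As a minimal sketch of the syntax above, the following statements assign table store.store_orders_fact to a storage location labeled SSD. The sketch assumes that a location with that label already exists and is not a USER type location:

```sql
-- Set the table's storage policy. Existing containers move only after
-- all pending mergeout tasks return (the default).
SELECT SET_OBJECT_STORAGE_POLICY('store.store_orders_fact', 'SSD');

-- Same policy, but move all existing storage containers immediately.
SELECT SET_OBJECT_STORAGE_POLICY('store.store_orders_fact', 'SSD', 'true');
```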

26 - Table management functions

This section contains the table management functions in the Vertica library.

26.1 - COPY_TABLE

Copies one table to another. This lightweight, in-memory function copies the DDL and all user-created projections from the source table. Projection statistics for the source table are also copied. Thus, the source and target tables initially have identical definitions and share the same storage.

Note

Although they share storage space, Vertica regards the tables as discrete objects for license capacity purposes. For example, a single-terabyte table and its copy initially consume only one TB of space. However, your Vertica license regards them as separate objects that consume two TB of space.

After the copy operation is complete, the source and copy tables are independent of each other, so you can perform DML operations on one table without impacting the other. These operations can increase the overall storage required for both tables.

Caution

If you create multiple copies of the same table concurrently, one or more of the copy operations is liable to fail. Instead, copy tables sequentially.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

COPY_TABLE (
    '[[database.]schema.]source-table',
    '[[database.]schema.]target-table'
)

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
source-table: The source table to copy. Vertica copies all data from this table to the target table.
target-table: The target table of the copy operation. If the target table already exists, Vertica appends the source data to it.
If the target table does not exist, Vertica creates it from the source table's definition by calling CREATE TABLE with the LIKE and INCLUDING PROJECTIONS clauses. The new table inherits ownership from the source table. For details, see Replicating a table.

Privileges

Non-superuser:

Source table: SELECT
Target schema/table (new): CREATE
Target table (existing): INSERT

Table attribute requirements

The following attributes of both tables must be identical:

Column definitions, including NULL/NOT NULL constraints
Segmentation
Partitioning expression
Number of projections
Projection sort order
Primary and unique key constraints. However, the key constraints do not have to be identically enabled.
Number and definitions of text indices.
If the destination table already exists, the source and destination tables must have identical access policies.

Note

If the target table has primary or unique key constraints enabled and the copy operation would insert duplicate key values into the target table, Vertica rolls back the operation. Enforcing constraints requires disk reads and can slow the copy process.

Additionally, if access policies exist on the source table, the following must be true:

Access policies on both tables must be identical.
One of the following must be true:
- The executing user owns the source table.
- AccessPolicyManagementSuperuserOnly is set to true. See Managing access policies for details.

Table restrictions

The following restrictions apply to the source and target tables:

If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
If the source table contains a sequence, Vertica converts the sequence to an integer before copying it to the target table. If the target table contains auto-increment, identity, or named sequence columns, Vertica cancels the copy and displays an error message.
The following tables cannot be used as sources or targets:
- Temporary tables
- Virtual tables
- System tables
- External tables

Examples

If you call COPY_TABLE and the target table does not exist, the function creates the table automatically. In the following example, COPY_TABLE creates the target table public.newtable. Vertica also copies all the constraints associated with the source table public.product_dimension except foreign key constraints:

=> SELECT COPY_TABLE ( 'public.product_dimension', 'public.newtable');
-[ RECORD 1 ]--------------------------------------------------
copy_table | Created table public.newtable.
Copied table public.product_dimension to public.newtable
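
As described under Parameters, if the target table already exists, COPY_TABLE appends the source data to it. The following sketch assumes a hypothetical existing table, public.product_archive, whose definition matches the source table as described under Table attribute requirements. The second target, public.product_copy2, is also a hypothetical name. The two calls run one after the other, in line with the caution against creating copies of the same table concurrently:

```sql
-- Append the source data to an existing target table
-- (public.product_archive is a hypothetical name):
=> SELECT COPY_TABLE('public.product_dimension', 'public.product_archive');

-- Create another copy only after the first call returns
-- (public.product_copy2 is a hypothetical name):
=> SELECT COPY_TABLE('public.product_dimension', 'public.product_copy2');
```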

26.2 - INFER_EXTERNAL_TABLE_DDL

Deprecated

This function is deprecated and will be removed in a future release. Instead, use INFER_TABLE_DDL.

Inspects a file in Parquet, ORC, or Avro format and returns a CREATE EXTERNAL TABLE AS COPY statement that can be used to read the file. This statement might be incomplete. It could also contain more columns or columns with longer names than what Vertica supports; this function does not enforce Vertica system limits. Always inspect the output and address any issues before using it to create a table.

This function supports partition columns for the ORC and Parquet formats only. Parquet and ORC files contain insufficient information to infer the type of partition columns, so this function shows these columns with a data type of UNKNOWN and emits a warning.

The function handles most data types, including complex types. If an input type is not supported in Vertica, the function emits a warning.

By default, the function uses strong typing for complex types. You can instead treat the column as a flexible complex type by setting the vertica_type_for_complex_type parameter to LONG VARBINARY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

INFER_EXTERNAL_TABLE_DDL( path USING PARAMETERS param=value[,...] )

Arguments

path: Path to a file or directory. Any path that is valid for COPY and uses a file format supported by this function is valid.

Parameters

format: Input format (string), one of 'Parquet', 'ORC', or 'Avro'. This parameter is required.
table_name: The name of the external table to create. This parameter is required.
Do not include a schema name as part of the table name; use the table_schema parameter.
table_schema: The schema in which to create the external table. If omitted, the function does not include a schema in the output.
vertica_type_for_complex_type: Type used to represent all columns of complex types, if you do not want to expand them fully. The only supported value is LONG VARBINARY. For more information, see Flexible complex types.

Privileges

Non-superuser: READ privileges on the USER-accessible storage location.

Examples

In the following example, the input file contains data for a table with two integer columns. The table definition can be fully inferred, and you can use the returned SQL statement as-is.

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/orders/*.orc'
        USING PARAMETERS format = 'orc', table_name = 'orders');

                INFER_EXTERNAL_TABLE_DDL
--------------------------------------------------------------------------------------------------
create external table "orders" (
  "id" int,
  "quantity" int
) as copy from '/data/orders/*.orc' orc;
(1 row)

To create a table in a schema, use the table_schema parameter. Do not add it to the table name; the function treats it as a name with a period in it, not a schema.

The following example shows output with complex types. You can use the definition as-is or modify the VARCHAR sizes:

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/people/*.parquet'
        USING PARAMETERS format = 'parquet', table_name = 'employees');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                    INFER_EXTERNAL_TABLE_DDL
-------------------------------------------------------------------------
 create external table "employees"(
  "employeeID" int,
  "personal" Row(
    "name" varchar,
    "address" Row(
      "street" varchar,
      "city" varchar,
      "zipcode" int
    ),
    "taxID" int
  ),
  "department" varchar
 ) as copy from '/data/people/*.parquet' parquet;
(1 row)
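
To treat complex columns as flexible complex types instead, you can set the vertica_type_for_complex_type parameter described above. The following call is a sketch only: it reuses the path and table name from the previous example, and passing the value as the quoted string 'LONG VARBINARY' is an assumption. The output is not shown here.

```sql
-- Represent all complex-type columns as LONG VARBINARY instead of
-- expanding them into ROW and ARRAY definitions:
=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/people/*.parquet'
        USING PARAMETERS format = 'parquet', table_name = 'employees',
                         vertica_type_for_complex_type = 'LONG VARBINARY');
```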

In the following example, the input file contains a map in the "prods" column. You can read a map as an array of rows:

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/orders.parquet'
    USING PARAMETERS format='parquet', table_name='orders');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                INFER_EXTERNAL_TABLE_DDL
------------------------------------------------------------------------
 create external table "orders"(
  "orderkey" int,
  "custkey" int,
  "prods" Array[Row(
    "key" varchar,
    "value" numeric(12,2)
  )],
  "orderdate" date
 ) as copy from '/data/orders.parquet' parquet;
(1 row)

The following example uses partition columns. The types of partition columns cannot be determined from the data, so you must edit the statement to specify them. In this example, the date and region columns are in the data in addition to being partition columns, and so the table definition shows them twice:

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/sales/*/*/*'
        USING PARAMETERS format = 'parquet', table_name = 'sales');
WARNING 9262: This generated statement is incomplete because of one or more unknown column types.
Fix these data types before creating the table
                INFER_EXTERNAL_TABLE_DDL
------------------------------------------------------------------------
 create external table "sales"(
  "tx_id" int,
  "date" UNKNOWN,
  "region" UNKNOWN
) as copy from '/data/sales/*/*/*' parquet(hive_partition_cols='date,region');
(1 row)

For VARCHAR and VARBINARY columns, this function does not specify a length. The Vertica default length for these types is 80 bytes. If the data values are longer, using this table definition unmodified could cause data to be truncated. Always review VARCHAR and VARBINARY columns to determine if you need to specify a length. This function emits a warning if the input file contains columns of these types:

WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80

26.3 - INFER_TABLE_DDL

Inspects a file in Parquet, ORC, or Avro format and returns a CREATE TABLE or CREATE EXTERNAL TABLE statement based on its contents. If you use a glob to specify more than one file, the function inspects only one.

The returned statement might be incomplete if the file contains ambiguous or unknown data types. It could also contain more columns or columns with longer names than what Vertica supports; this function does not enforce Vertica system limits. Always inspect the output and address any issues before using it to create a table.

This function supports partition columns, inferred from the input path, for the ORC and Parquet formats only. Because partitioning is done through the directory structure, there is insufficient information to infer the type of partition columns. This function shows these columns with a data type of UNKNOWN and emits a warning. Partition columns apply only to external tables.

The function handles most data types, including complex types. If an input type is not supported in Vertica, the function emits a warning.

For VARCHAR and VARBINARY columns, this function does not specify a length. The Vertica default length for these types is 80 bytes. If the data values are longer, using the returned table definition unmodified could cause data to be truncated. Always review VARCHAR and VARBINARY columns to determine if you need to specify a length. This function emits a warning if the input file contains columns of these types:

WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

INFER_TABLE_DDL( path USING PARAMETERS param=value[,...] )

Arguments

path: Path to a file or glob. Any path that is valid for COPY and uses a file format supported by this function is valid. If a glob specifies more than one file, this function reads a single file.

Parameters

format: Input format (string), one of 'Parquet', 'ORC', or 'Avro'. This parameter is required.
table_name: The name of the table to create. This parameter is required.
Do not include a schema name as part of the table name; use the table_schema parameter.
table_schema: The schema in which to create the table. If omitted, the function does not include a schema in the output.
table_type: The type of table to create, either 'native' (the default) or 'external'.
with_copy_statement: For native tables, whether to include a COPY statement in addition to the CREATE TABLE statement. The default is false.

Privileges

Non-superuser: READ privileges on the USER-accessible storage location.

Examples

In the following example, the input path contains data for a table with two integer columns. The external table definition can be fully inferred, and you can use the returned SQL statement as-is. The function reads one file from the input path:

=> SELECT INFER_TABLE_DDL('/data/orders/*.orc'
        USING PARAMETERS format = 'orc', table_name = 'orders', table_type = 'external');

                INFER_TABLE_DDL
-----------------------------------------------------------------------------------
create external table "orders" (
  "id" int,
  "quantity" int
) as copy from '/data/orders/*.orc' orc;
(1 row)

To create a table in a schema, use the table_schema parameter. Do not add it to the table name; the function treats it as a name with a period in it, not a schema.
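
A call that names the schema separately might look like the following sketch. It reuses the path from the previous example; the schema name store is an assumption chosen for illustration, and the output is not shown.

```sql
-- Pass the schema through table_schema rather than as part of table_name:
=> SELECT INFER_TABLE_DDL('/data/orders/*.orc'
        USING PARAMETERS format = 'orc', table_name = 'orders',
                         table_schema = 'store', table_type = 'external');
```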

The following example shows output with complex types. You can use the definition as-is or modify the VARCHAR sizes:

=> SELECT INFER_TABLE_DDL('/data/people/*.parquet'
        USING PARAMETERS format = 'parquet', table_name = 'employees');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                    INFER_TABLE_DDL
-------------------------------------------------------------------------
 create table "employees"(
  "employeeID" int,
  "personal" Row(
    "name" varchar,
    "address" Row(
      "street" varchar,
      "city" varchar,
      "zipcode" int
    ),
    "taxID" int
  ),
  "department" varchar
 );
(1 row)

In the following example, the input file contains a map in the "prods" column. You can read a map as an array of rows:

=> SELECT INFER_TABLE_DDL('/data/orders.parquet'
    USING PARAMETERS format='parquet', table_name='orders');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                INFER_TABLE_DDL
------------------------------------------------------------------------
 create table "orders"(
  "orderkey" int,
  "custkey" int,
  "prods" Array[Row(
    "key" varchar,
    "value" numeric(12,2)
  )],
  "orderdate" date
 );
(1 row)

The following example returns the definition of a native table and the COPY statement:

=> SELECT INFER_TABLE_DDL('/data/orders/*.orc'
        USING PARAMETERS format = 'orc', table_name = 'orders',
                         table_type = 'native', with_copy_statement = true);

                INFER_TABLE_DDL
--------------------------------------------------------------------------------------------------
create table "orders" (
  "id" int,
  "quantity" int
);
copy "orders" from '/data/orders/*.orc' orc;

In the following example, the data contains one materialized column and two partition columns. The date and region columns are in the data in addition to being partition columns, and so the table definition shows them twice. Partition columns are always of unknown type:

=> SELECT INFER_TABLE_DDL('/data/sales/*/*/*'
        USING PARAMETERS format = 'parquet', table_name = 'sales', table_type = 'external');
WARNING 9262: This generated statement is incomplete because of one or more unknown column types.
Fix these data types before creating the table
                INFER_TABLE_DDL
------------------------------------------------------------------------
 create external table "sales"(
  "tx_id" int,
  "date" UNKNOWN,
  "region" UNKNOWN
) as copy from '/data/sales/*/*/*' parquet(hive_partition_cols='date,region');
(1 row)
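
Before you can create the table, you must replace each UNKNOWN type in the returned statement. The following sketch shows one possible edit of the statement above. The DATE and VARCHAR(20) types are assumptions chosen for illustration; use the types that match how your data is actually partitioned.

```sql
-- Returned statement with the UNKNOWN partition column types replaced
-- by hand (the chosen types are illustrative assumptions):
=> CREATE EXTERNAL TABLE "sales"(
     "tx_id" INT,
     "date" DATE,
     "region" VARCHAR(20)
   ) AS COPY FROM '/data/sales/*/*/*' PARQUET(hive_partition_cols='date,region');
```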

26.4 - PURGE_TABLE

Note

This function was formerly named PURGE_TABLE_PROJECTIONS(). Vertica still supports the former function name.

Permanently removes deleted data from physical storage so disk space can be reused. You can purge historical data up to and including the Ancient History Mark epoch.

Purges all projections of the specified table. You cannot use this function to purge temporary tables.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

PURGE_TABLE ( '[[database.]schema.]table' )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: The table to purge.

Privileges

Table owner
USAGE privilege on schema

Caution

PURGE_TABLE could temporarily take up significant disk space while the data is being purged.

Examples

The following example purges all projections of the store_sales_fact table in the store schema:

=> SELECT PURGE_TABLE('store.store_sales_fact');
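
Because a purge reaches only up to the Ancient History Mark (AHM) epoch, it can help to check the current AHM first. The following sketch assumes the Vertica function GET_AHM_EPOCH is available in your version; it is not described in this section.

```sql
-- Check the current Ancient History Mark epoch (assumed helper function):
=> SELECT GET_AHM_EPOCH();

-- Purge deleted data up to and including that epoch:
=> SELECT PURGE_TABLE('store.store_sales_fact');
```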

26.5 - REBALANCE_TABLE

Synchronously rebalances data in the specified table.

A rebalance operation performs the following tasks:

Distributes data based on:
- User-defined fault groups, if specified
- Large cluster automatic fault groups
Redistributes database projection data across all nodes.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

REBALANCE_TABLE('[[database.]schema.]table-name')

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: The table to rebalance.

Privileges

Superuser

When to rebalance

Rebalancing is useful or even necessary after you perform the following tasks:

Mark one or more nodes as ephemeral in preparation for removing them from the cluster.
Add one or more nodes to the cluster so that Vertica can populate the empty nodes with data.
Change the scaling factor of an elastic cluster, which determines the number of storage containers used to store a projection across the database.
Set the control node size or realign control nodes on a large cluster layout.
Add nodes to or remove nodes from a fault group.

Examples

The following command shows how to rebalance data on the specified table.

=> SELECT REBALANCE_TABLE('online_sales.online_sales_fact');
REBALANCE_TABLE
-------------------
 REBALANCED
(1 row)

27 - Tuple mover functions

This section contains tuple mover functions specific to Vertica.

27.1 - DO_TM_TASK

Runs a Tuple Mover (TM) operation and commits current transactions. You can limit this operation to a specific table or projection. When started using this function, the TM uses the GENERAL resource pool instead of the TM resource pool.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

DO_TM_TASK('task'[, '[[database.]schema.]{ table | projection }' ] )

Parameters

task

Specifies one of the following tuple mover operations:

mergeout: Consolidates ROS containers and purges deleted records. For details, see Mergeout.
analyze_row_count: Collects a minimal set of statistics and aggregate row counts for the specified projections, and saves them in the database catalog. If you specify a table name, DO_TM_TASK returns the row counts for all projections of that table. For details, see Analyzing row counts.
update_storage_catalog (recommended only for Eon Mode): Updates the catalog with metadata on bundled table data. For details, see Writing bundle metadata to the catalog.

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

table | projection

Applies task to the specified table or projection. If you specify a projection and it is not found, DO_TM_TASK looks for a table with that name and, if found, applies the task to it and all projections associated with it.

If you specify no table or projection, the task is applied to all database tables and their projections.

Privileges

Schema: USAGE
Table: One of INSERT, UPDATE, or DELETE

Examples

Perform mergeout on all projections of table t1:

=> SELECT DO_TM_TASK('mergeout', 't1');
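
The other documented variations follow the same pattern. The calls below are a sketch based only on the syntax and parameter descriptions above; they reuse table t1 from the previous example, and the output is not shown.

```sql
-- Collect row counts for all projections of table t1:
=> SELECT DO_TM_TASK('analyze_row_count', 't1');

-- Omit the second argument to apply the task to all database tables
-- and their projections:
=> SELECT DO_TM_TASK('mergeout');
```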

28 - Workload management functions

This section contains workload management functions specific to Vertica.

28.1 - ANALYZE_WORKLOAD

Runs Workload Analyzer, a utility that analyzes system information held in system tables.

Workload Analyzer intelligently monitors the performance of SQL queries and workload history, resources, and configurations to identify the root causes for poor query performance. ANALYZE_WORKLOAD returns tuning recommendations for all events within the scope and time that you specify, from system table TUNING_RECOMMENDATIONS.

Tuning recommendations are based on a combination of statistics, system and data collector events, and database-table-projection design. Workload Analyzer recommendations can help you quickly and easily tune query performance.

See Workload analyzer recommendations for the common triggering conditions and recommendations.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

ANALYZE_WORKLOAD ( '[ scope ]' [, 'since-time' | save-data ] );

Parameters

scope

Specifies the catalog objects to analyze, as follows:

[[database.]schema.]table

If set to an empty string, Vertica returns recommendations for all database objects.

If you specify a database, it must be the current database.

since-time

Specifies the start time for the analysis time span, which continues up to the current system status, inclusive. If you omit this parameter, ANALYZE_WORKLOAD returns recommendations on events since the last time you called this function.

Note

You must explicitly cast strings to TIMESTAMP or TIMESTAMPTZ. For example:

SELECT ANALYZE_WORKLOAD('T1', '2010-10-04 11:18:15'::TIMESTAMPTZ);
SELECT ANALYZE_WORKLOAD('T1', TIMESTAMPTZ '2010-10-04 11:18:15');

save-data

Specifies whether to save returned values from ANALYZE_WORKLOAD:

false (default): Results are discarded.
true: Saves the results returned by ANALYZE_WORKLOAD. Subsequent calls to ANALYZE_WORKLOAD return results that start from the last invocation when results were saved. Object events preceding that invocation are ignored.

Return values

Returns aggregated tuning recommendations from TUNING_RECOMMENDATIONS.

Privileges

Superuser

Examples

See Getting tuning recommendations.
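
As a quick orientation, the calls below sketch the three forms that the syntax above allows. Table T1 and the timestamp are the illustrative values used in the since-time note, not objects that exist in your database.

```sql
-- Recommendations for all database objects since the last call:
=> SELECT ANALYZE_WORKLOAD('');

-- Recommendations for table T1 since the specified time:
=> SELECT ANALYZE_WORKLOAD('T1', TIMESTAMPTZ '2010-10-04 11:18:15');

-- Recommendations for all database objects, saving the results so that
-- later calls start from this invocation:
=> SELECT ANALYZE_WORKLOAD('', true);
```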

28.2 - CHANGE_CURRENT_STATEMENT_RUNTIME_PRIORITY

Changes the run-time priority of an active query.

Note

This function replaces deprecated function CHANGE_RUNTIME_PRIORITY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CHANGE_CURRENT_STATEMENT_RUNTIME_PRIORITY(transaction-id, 'value')

Parameters

transaction-id: Identifies the transaction, obtained from the system table SESSIONS.
value: The RUNTIMEPRIORITY value: HIGH, MEDIUM, or LOW.

Privileges

Superuser: None
Non-superusers can only change the runtime priority of their own queries, and cannot raise the runtime priority of a query to a level higher than that of the resource pool.

Examples

See Changing runtime priority of a running query.
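
A typical sequence might look like the following sketch. It looks up the transaction ID in system table SESSIONS and then passes it to the function. The transaction ID shown is a placeholder, not a value from a real session.

```sql
-- Find the transaction ID of the active query:
=> SELECT transaction_id, current_statement FROM sessions;

-- Lower the run-time priority of that query (placeholder transaction ID):
=> SELECT CHANGE_CURRENT_STATEMENT_RUNTIME_PRIORITY(45035996273705748, 'low');
```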

28.3 - CHANGE_RUNTIME_PRIORITY

Changes the run-time priority of a query that is actively running. Note that, while this function is still valid, you should instead use CHANGE_CURRENT_STATEMENT_RUNTIME_PRIORITY to change run-time priority. CHANGE_RUNTIME_PRIORITY will be deprecated in a future release of Vertica.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

CHANGE_RUNTIME_PRIORITY(TRANSACTION_ID,STATEMENT_ID, 'value')

Parameters

TRANSACTION_ID

An identifier for the transaction within the session.

TRANSACTION_ID cannot be NULL.

You can find the transaction ID in the Sessions table.

STATEMENT_ID

A unique numeric ID assigned by the Vertica catalog, which identifies the currently executing statement.

You can find the statement ID in the Sessions table.

You can specify NULL to change the run-time priority of the currently running query within the transaction.

'value'

The RUNTIMEPRIORITY value. Can be HIGH, MEDIUM, or LOW.

Privileges

No special privileges required. However, non-superusers can change the run-time priority of their own queries only. In addition, non-superusers can never raise the run-time priority of a query to a level higher than that of the resource pool.

Examples

=> SELECT CHANGE_RUNTIME_PRIORITY(45035996273705748, NULL, 'low');

28.4 - MOVE_STATEMENT_TO_RESOURCE_POOL

Attempts to move the specified query to the specified target pool.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Syntax

MOVE_STATEMENT_TO_RESOURCE_POOL (session_id , transaction_id, statement_id, target_resource_pool_name)

Parameters

session_id: Identifier for the session where the query you want to move is currently executing.
transaction_id: Identifier for the transaction within the session.
statement_id: Unique numeric ID for the statement you want to move.
target_resource_pool_name: Name of the existing resource pool to which you want to move the specified query.

Outputs

The function may return the following results:

MOV_REPLAN: Target pool does not have sufficient resources. See v_monitor.resource_pool_move for details. Vertica will attempt to replan the statement on target pool.

MOV_REPLAN: Target pool has priority HOLD. Vertica will attempt to replan the statement on target pool.

MOV_FAILED: Statement not found.

MOV_NO_OP: Statement already on target pool.

MOV_REPLAN: Statement is in queue. Vertica will attempt to replan the statement on target pool.

MOV_SUCC: Statement successfully moved to target pool.

Privileges

Superuser

Examples

The following example shows how you can move a specific statement to a resource pool called my_target_pool:

=> SELECT MOVE_STATEMENT_TO_RESOURCE_POOL ('v_vmart_node0001.example.-31427:0x82fbm', 45035996273711993, 1, 'my_target_pool');
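
The identifiers passed to the function can be looked up before the call, and the outcome can be checked afterward. The queries below are a sketch. They assume that the session, transaction, and statement identifiers are read from system table SESSIONS, and they use v_monitor.resource_pool_move, the table that the MOV_REPLAN result above refers to.

```sql
-- Look up the identifiers of currently running statements:
=> SELECT session_id, transaction_id, statement_id, current_statement
   FROM sessions;

-- After a move attempt, inspect the details of the outcome:
=> SELECT * FROM v_monitor.resource_pool_move;
```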

28.5 - SLEEP

Waits a specified number of seconds before executing another statement or command.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type