Drops a license key from the global catalog. Dropping expired keys is optional. Vertica automatically ignores expired license keys if a valid, alternative license key is installed.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DROP_LICENSE( 'license-name' )

Parameters

license-name: The name of the license to drop. Use the name (or long license key) in the NAME column of system table LICENSES.

Privileges

Superuser

Examples

=> SELECT DROP_LICENSE('9b2d81e2-aab1-4cfb-bc07-fa9a696e8f5e');

1.2 - DUMP_CATALOG

Returns an internal representation of the Vertica catalog.

Returns an internal representation of the Vertica catalog. This function is used for diagnostic purposes.

DUMP_CATALOG returns only the objects that are visible to the user.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DUMP_CATALOG()

Privileges

None

Examples

The following query obtains an internal representation of the Vertica catalog:

=> SELECT DUMP_CATALOG();

The output is written to the specified file:

\o /tmp/catalog.txt
SELECT DUMP_CATALOG();
\o

1.3 - EXPORT_CATALOG

This function and EXPORT_OBJECTS return equivalent output.

Note

This function and EXPORT_OBJECTS return equivalent output.

Generates a SQL script for recreating a physical schema design on another cluster.

The SQL script conforms to the following requirements:

Only includes objects to which the user has access.
Orders CREATE statements according to object dependencies so they can be recreated in the correct sequence. For example, if a table is in a non-PUBLIC schema, the required CREATE SCHEMA statement precedes the CREATE TABLE statement. Similarly, a table's CREATE ACCESS POLICY statement follows the table's CREATE TABLE statement.
If possible, creates projections with their KSAFE clause, if any, otherwise with their OFFSET clause.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

EXPORT_CATALOG ( [ '[destination]' [, 'scope'] ] )

Arguments

Note

If you omit all arguments, this function exports to standard output all objects to which you have access.

destination

Specifies where to send output, one of the following:

Empty string, writes the script to standard output.
Path and name of an SQL output file, valid only for superusers. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.

scope

Determines what to export. Within the specified scope, EXPORT_CATALOG exports all the objects to which you have access:

DESIGN: Exports all catalog objects, including schemas, tables, constraints, views, access policies, projections, SQL macros, and stored procedures.
DESIGN_ALL: Deprecated.
TABLES: Exports all tables and their access policies. See also EXPORT_TABLES.
DIRECTED_QUERIES: Exports all directed queries that are stored in the database. For details, see Managing directed queries.

Default: DESIGN

Privileges

None

Examples

See Exporting the catalog.

1.4 - EXPORT_OBJECTS

This function and EXPORT_CATALOG return equivalent output.

Note

This function and EXPORT_CATALOG return equivalent output.

Generates a SQL script you can use to recreate non-virtual catalog objects on another cluster.

The SQL script conforms to the following requirements:

Only includes objects to which the user has access.
Orders CREATE statements according to object dependencies so they can be recreated in the correct sequence. For example, if a table is in a non-PUBLIC schema, the required CREATE SCHEMA statement precedes the CREATE TABLE statement. Similarly, a table's CREATE ACCESS POLICY statement follows the table's CREATE TABLE statement.
If possible, creates projections with their KSAFE clause, if any, otherwise with their OFFSET clause.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

EXPORT_OBJECTS( ['[destination]' [, '[scope]'] [, 'mark-ksafe']] )

Parameters

Note

If you omit all parameters, this function exports to standard output all objects to which you have access.

destination

Specifies where to send output, one of the following:

An empty string ('') writes the script to standard output.
The path and name of a SQL output file. This option is valid only for superusers. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.

scope

Specifies one or more objects to export as a comma-delimited list:

{ [database.]schema[.object] | [[database.]schema.]object }[,...]

If set to an empty string, Vertica exports all objects to which the user has access.
If you specify a schema only, Vertica exports all objects in that schema.
If you specify a database, it must be the current database.

For stored procedures with the same name but different formal parameters, you can export all implementations by exporting its parent schema:

mydb.myschema

Specifying the types or both the names and types of a particular implementation's formal parameters exports that implementation:

mydb.myschema.my_procedure()  -- no formal parameters
mydb.myschema.my_procedure(int, int) -- formal parameter types (parameter names are optional)

mark-ksafe

Boolean argument, specifies whether the generated script calls the Vertica function MARK_DESIGN_KSAFE . If set to true (default), MARK_DESIGN_KSAFE uses the correct K-safe argument for the current database.

Privileges

None

Examples

See Exporting objects.

1.5 - EXPORT_TABLES

Generates a SQL script that can be used to recreate a logical schema—schemas, tables, constraints, and views—on another cluster.

Generates a SQL script that can be used to recreate a logical schema—schemas, tables, constraints, and views—on another cluster. EXPORT_TABLES only exports objects to which the user has access.

The SQL script conforms to the following requirements:

Only includes objects to which the user has access.
Orders CREATE statements according to object dependencies so they can be recreated in the correct sequence. For example, if a table references a named sequence, a CREATE SEQUENCE statement precedes the CREATE TABLE statement. Similarly, a table's CREATE ACCESS POLICY statement follows the table's CREATE TABLE statement.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

EXPORT_TABLES( ['[destination]' [, '[scope]']] )

Note

If you omit all parameters, EXPORT_CATALOG exports to standard output all tables to which you have access.

Parameters

destination

Specifies where to send output, one of the following:

An empty string ('') writes the script to standard output.
The path and name of a SQL output file. This option is valid only for superusers. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.

scope

Specifies one or more tables to export, as follows:

[database.]schema[.table][,...]

If set to an empty string, Vertica exports all non-virtual table objects to which you have access, including table schemas, sequences, and constraints.
If you specify a schema, Vertica exports all non-virtual table objects in that schema.
If you specify a database, it must be the current database.

Privileges

None

Examples

See Exporting tables.

1.6 - INSTALL_LICENSE

Installs the license key in the global catalog.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

INSTALL_LICENSE( 'filename' )

Parameters

filename: The absolute path name of a valid license file.

Privileges

Superuser

Examples

=> SELECT INSTALL_LICENSE('/tmp/vlicense.dat');

1.7 - MARK_DESIGN_KSAFE

Enables or disables high availability in your environment, in case of a failure.

Enables or disables high availability in your environment, in case of a failure. Before enabling recovery, MARK_DESIGN_KSAFE queries the catalog to determine whether a cluster's physical schema design meets the following requirements:

Small, unsegmented tables are replicated on all nodes.
Large table superprojections are segmented with each segment on a different node.
Each large table projection has at least one buddy projection for K-safety=1 (or two buddy projections for K-safety=2).

Buddy projections are also segmented across database nodes, but the distribution is modified so segments that contain the same data are distributed to different nodes. See High availability with projections.

MARK_DESIGN_KSAFE does not change the physical schema.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

MARK_DESIGN_KSAFE ( k )

Parameters

k

Specifies the level of K-safety, one of the following:

2: Enables high availability if the schema design meets requirements for K-safety=2
1: Enables high availability if the schema design meets requirements for K-safety=1
0: Disables high availability

Privileges

Superuser

Return messages

If you specify a k value of 1 or 2, Vertica returns one of the following messages.

Success:

 Marked design n-safe

Failure:

 The schema does not meet requirements for K=n.
 Fact table projection projection-name
 has insufficient "buddy" projections.

where n is a K-safety setting.

Notes

The database's internal recovery state persists across database restarts but it is not checked at startup time.
When one node fails on a system marked K-safe=1, the remaining nodes are available for DML operations.

Examples

=> SELECT MARK_DESIGN_KSAFE(1);
  mark_design_ksafe
----------------------
 Marked design 1-safe
(1 row)

If the physical schema design is not K-safe, messages indicate which projections do not have a buddy:

=> SELECT MARK_DESIGN_KSAFE(1);
The given K value is not correct;
the schema is 0-safe
Projection pp1 has 0 buddies,
which is smaller that the given K of 1
Projection pp2 has 0 buddies,
which is smaller that the given K of 1
.
.
.
(1 row)

1.8 - RELOAD_ADMINTOOLS_CONF

Updates the admintools.conf on each UP node in the cluster.

Updates the admintools.conf on each UP node in the cluster. Updates include:

IP addresses and catalog paths
Node names for all nodes in the current database

This function provides a manual method to instruct the server to update admintools.conf on all UP nodes. For example, if you restart a node, call this function to confirm its admintools.conf file is accurate.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RELOAD_ADMINTOOLS_CONF()

Privileges

Superuser

Examples

Update admintools.conf on each UP node in the cluster:

=> SELECT RELOAD_ADMINTOOLS_CONF();
  RELOAD_ADMINTOOLS_CONF
--------------------------
 admintools.conf reloaded
(1 row)

2 - Cloud functions

This section contains functions for managing cloud integrations.

This section contains functions for managing cloud integrations. See also Hadoop functions for HDFS.

2.1 - AZURE_TOKEN_CACHE_CLEAR

Clears the cached access token for Azure.

Clears the cached access token for Azure. Call this function after changing the configuration of Azure managed identities.

An Azure object store can support and manage multiple identities. If multiple identities are in use, Vertica looks for an Azure tag with a key of VerticaManagedIdentityClientId, the value of which must be the client_id attribute of the managed identity to be used. If the Azure configuration changes, use this function to clear the cache.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

AZURE_TOKEN_CACHE_CLEAR ( )

Privileges

Superuser

3 - Cluster functions

This section contains functions that manage deployment on large, distributed database clusters and functions that control how the cluster organizes data for rebalancing.

This section contains functions that manage spread deployment on large, distributed database clusters and functions that control how the cluster organizes data for rebalancing.

3.1 - CANCEL_REBALANCE_CLUSTER

Stops any rebalance task that is currently in progress or is waiting to execute.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CANCEL_REBALANCE_CLUSTER()

Privileges

Superuser

Examples

=> SELECT CANCEL_REBALANCE_CLUSTER();
 CANCEL_REBALANCE_CLUSTER
--------------------------
 CANCELED
(1 row)

3.2 - DISABLE_LOCAL_SEGMENTS

Disables local data segmentation, which breaks projections segments on nodes into containers that can be easily moved to other nodes.

Disables local data segmentation, which breaks projections segments on nodes into containers that can be easily moved to other nodes. See Local data segmentation for details.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DISABLE_LOCAL_SEGMENTS()

Privileges

Superuser

Examples

=> SELECT DISABLE_LOCAL_SEGMENTS();
 DISABLE_LOCAL_SEGMENTS
------------------------
 DISABLED
(1 row)

3.3 - ENABLE_ELASTIC_CLUSTER

Enables elastic cluster scaling, which makes enlarging or reducing the size of your database cluster more efficient by segmenting a node's data into chunks that can be easily moved to other hosts.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ENABLE_ELASTIC_CLUSTER()

Privileges

Superuser

Examples

=> SELECT ENABLE_ELASTIC_CLUSTER();
 ENABLE_ELASTIC_CLUSTER
------------------------
 ENABLED
(1 row)

3.4 - ENABLE_LOCAL_SEGMENTS

Enables local storage segmentation, which breaks projections segments on nodes into containers that can be easily moved to other nodes.

Enables local storage segmentation, which breaks projections segments on nodes into containers that can be easily moved to other nodes. See Local data segmentation for more information.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ENABLE_LOCAL_SEGMENTS()

Privileges

Superuser

Examples

=> SELECT ENABLE_LOCAL_SEGMENTS();
 ENABLE_LOCAL_SEGMENTS
-----------------------
 ENABLED
(1 row)

3.5 - REALIGN_CONTROL_NODES

Causes Vertica to re-evaluate which nodes in the cluster or subcluster are and which nodes are assigned to them as dependents when large cluster is enabled.

Causes Vertica to re-evaluate which nodes in the cluster or subcluster are control nodes and which nodes are assigned to them as dependents when large cluster is enabled. Call this function after altering fault groups in an Enterprise Mode database, or changing the number of control nodes in either database mode. After calling this function, query the V_CATALOG.CLUSTER_LAYOUT system table to see the proposed new layout for nodes in the cluster. You must also take additional steps before the new control node assignments take effect. See Changing the number of control nodes and realigning for details.

Note

In Vertica versions prior to 10.0.1, control node assignments weren't restricted to be within the same Eon Mode subcluster. If you attempt to realign control nodes in a subcluster whose control nodes have dependents in other subclusters, this function returns an error. In this case, you must realign the control nodes in those other subclusters first. Realigning the other subclusters fixes the cross-subcluster dependencies, allowing you to realign the control nodes in the original subcluster you attempted to realign.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

In Enterprise Mode:

REALIGN_CONTROL_NODES()

In Eon Mode:

REALIGN_CONTROL_NODES('subcluster_name')

Parameters

subcluster_name: The name of the subcluster where you want to realign control nodes. Only the nodes in this subcluster are affected. Other subclusters are unaffected. Only allowed when the database is running in Eon Mode.

Privileges

Superuser

Examples

In an Enterprise Mode database, choose control nodes from all nodes and assign the remaining nodes to a control node:

=> SELECT REALIGN_CONTROL_NODES();

In an Eon Mode database, re-evaluate the control node assignments in the subcluster named analytics:

=> SELECT REALIGN_CONTROL_NODES('analytics');

3.6 - REBALANCE_CLUSTER

Rebalances the database cluster synchronously as a session foreground task.

Rebalances the database cluster synchronously as a session foreground task. REBALANCE_CLUSTER returns only after the rebalance operation is complete. If the current session ends, the operation immediately aborts. To rebalance the cluster as a background task, call START_REBALANCE_CLUSTER.

On large cluster arrangements, you typically call REBALANCE_CLUSTER in a flow (see Changing the number of control nodes and realigning). After you change the number and distribution of control nodes (spread hosts), run REBALANCE_CLUSTER to achieve fault tolerance.

For detailed information about rebalancing tasks, see Rebalancing data across nodes.

Tip

By default, before performing a rebalance, Vertica queries system tables to compute the size of all projections involved in the rebalance task. This query can add significant overhead to the rebalance operation. To disable this query, set projection configuration parameter RebalanceQueryStorageContainers to 0.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

REBALANCE_CLUSTER()

Privileges

Superuser

Examples

=> SELECT REBALANCE_CLUSTER();
REBALANCE_CLUSTER
-------------------
 REBALANCED
(1 row)

3.7 - RELOAD_SPREAD

Updates cluster changes to the catalog's Spread configuration file.

Updates cluster changes to the catalog's Spread configuration file. These changes include:

New or realigned control nodes
New Spread hosts or fault group
New or dropped cluster nodes

This function is often used in a multi-step process for large and elastic cluster arrangements. Calling it might require you to restart the database. You must then rebalance the cluster to realize fault tolerance. For details, see Defining and Realigning Control Nodes.

Caution

In an Eon Mode database, using this function could result in the database becoming read-only. Nodes may become disconnected after you call this function. If the database no longer has primary shard coverage without these nodes, it goes into read-only mode to maintain data integrity. Once the nodes rejoin the cluster, the database will resume normal operation. See Maintaining Shard Coverage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RELOAD_SPREAD( true )

Parameters

true: Updates cluster changes related to control message responsibilities to the Spread configuration file.

Privileges

Superuser

Examples

Update the cluster with changes to control messaging:

=> SELECT reload_spread(true);
 reload_spread
---------------
 reloaded
(1 row)

3.8 - SET_CONTROL_SET_SIZE

Sets the number of that participate in the spread service when large cluster is enabled.

Sets the number of control nodes that participate in the spread service when large cluster is enabled. If the database is running in Enterprise Mode, this function sets the number of control nodes for the entire database cluster. If the database is running in Eon Mode, this function sets the number of control nodes in the subcluster you specify. See Large cluster for more information.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

In Enterprise Mode:

SET_CONTROL_SET_SIZE( control_nodes )

In Eon Mode:

SET_CONTROL_SET_SIZE('subcluster_name', control_nodes )

Parameters

subcluster_name

The name of the subcluster where you want to set the number of control nodes. Only allowed when the database is running in Eon Mode.

control_nodes

The number of control nodes to assign to the cluster (when in Enterprise Mode) or subcluster (when in Eon Mode). Value can be one of the following:

Positive integer value: Vertica assigns the number of control nodes you specify to the cluster or subcluster. This value can be larger than the current node count. This value cannot be larger than 120 (the maximum number of control nodes for a database). In Eon Mode, the total of this value plus the number of control nodes set for all other subclusters cannot be more than 120.
-1: Makes every node in the cluster or subcluster into control nodes. This value effectively disables large cluster for the cluster or subcluster.

Privileges

Superuser

Examples

In an Enterprise Mode database, set the number of control nodes for the entire cluster to 5:

=> SELECT set_control_set_size(5);
 SET_CONTROL_SET_SIZE
----------------------
 Control size set
(1 row)

3.9 - SET_SCALING_FACTOR

Sets the scaling factor that determines the number of storage containers used when rebalancing the database and when using local data segmentation is enabled.

Sets the scaling factor that determines the number of storage containers used when rebalancing the database and when using local data segmentation is enabled. See Cluster Scaling for details.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_SCALING_FACTOR( factor )

Parameters

factor: An integer value between 1 and 32. Vertica uses this value to calculate the number of storage containers each projection is broken into when rebalancing or when local data segmentation is enabled.

Privileges

Superuser

Best practices

The scaling factor determines the number of storage containers that Vertica uses to store each projection across the database during rebalancing when local segmentation is enabled. When setting the scaling factor, follow these guidelines:

The number of storage containers should be greater than or equal to the number of partitions multiplied by the number of local segments:

num-storage-containers>= (num-partitions*num-local-segments )
Set the scaling factor high enough so rebalance can transfer local segments to satisfy the skew threshold, but small enough so the number of storage containers does not result in too many ROS containers, and cause ROS pushback. The maximum number of ROS containers (by default 1024) is set by configuration parameter ContainersPerProjectionLimit.

Examples

=> SELECT SET_SCALING_FACTOR(12);
 SET_SCALING_FACTOR
--------------------
 SET
(1 row)

3.10 - START_REBALANCE_CLUSTER

Asynchronously rebalances the database cluster as a background task.

Asynchronously rebalances the database cluster as a background task. This function returns immediately after the rebalancing operation is complete. Rebalancing persists until the operation is complete, even if you close the current session or the database shuts down. In the case of shutdown, rebalancing resumes after the cluster restarts. To stop the rebalance operation, call CANCEL_REBALANCE_CLUSTER.

For detailed information about rebalancing tasks, see Rebalancing data across nodes.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

START_REBALANCE_CLUSTER()

Privileges

Superuser

Examples

=> SELECT START_REBALANCE_CLUSTER();
 START_REBALANCE_CLUSTER
-------------------------
 REBALANCING
(1 row)

4 - Data collector functions

The Vertica Data Collector is a utility that extends system table functionality by providing a framework for recording events.

The Vertica Data Collector is a utility that extends system table functionality by providing a framework for recording events. It gathers and retains monitoring information about your database cluster and makes that information available in system tables, requiring few configuration parameter tweaks, and having negligible impact on performance.

Collected data is stored on disk in the DataCollector directory under the Vertica /catalog path. You can use the information the Data Collector retains to query the past state of system tables and extract aggregate information, as well as do the following:

See what actions users have taken
Locate performance bottlenecks
Identify potential improvements to Vertica configuration

Data Collector works in conjunction with an advisor tool called Workload Analyzer, which intelligently monitors the performance of SQL queries and workloads and recommends tuning actions based on observations of the actual workload history.

By default, Data Collector is on and retains information for all sessions. If performance issues arise, a superuser can disable Data Collector by setting set configuration parameter EnableDataCollector to 0.

4.1 - CLEAR_DATA_COLLECTOR

Clears all memory and disk records from Data Collector tables and logs, and resets collection statistics in system table DATA_COLLECTOR.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_DATA_COLLECTOR( [ 'component' ] )

Parameters

component

Clears memory and disk records for the specified component. If you provide no argument, the function clears memory and disk records for all components.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Privileges

Superuser

Examples

The following command clears memory and disk records for the ResourceAcquisitions component:

=> SELECT clear_data_collector('ResourceAcquisitions');
 clear_data_collector
----------------------
 CLEAR
(1 row)

The following command clears data collection for all components:

=> SELECT clear_data_collector();
 clear_data_collector
----------------------
 CLEAR
(1 row)

4.2 - DATA_COLLECTOR_HELP

Returns online usage instructions about the Data Collector, the V_MONITOR.DATA_COLLECTOR system table, and the Data Collector control functions.

Returns online usage instructions about the Data Collector, the DATA_COLLECTOR system table, and the Data Collector control functions.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DATA_COLLECTOR_HELP()

Privileges

None

Returns

The DATA_COLLECTOR_HELP() function returns the following information:

=> SELECT DATA_COLLECTOR_HELP();

-----------------------------------------------------------------------------
Usage Data Collector
The data collector retains history of important system activities.
   This data can be used as a reference of what actions have been taken
      by users, but it can also be used to locate performance bottlenecks,
      or identify potential improvements to the Vertica configuration.
   This data is queryable via Vertica system tables.
Acccess a list of data collector components, and some statistics, by running:
   SELECT * FROM v_monitor.data_collector;

The amount of data retained by size and time can be controlled with several
functions.
   To just set the size amount:
      set_data_collector_policy(<component>,
                                <memory retention (KB)>,
                                <disk retention (KB)>);

   To set both the size and time amounts (the smaller one will dominate):
      set_data_collector_policy(<component>,
                                <memory retention (KB)>,
                                <disk retention (KB)>,
                                <interval>);

   To set just the time amount:
      set_data_collector_time_policy(<component>,
                                     <interval>);

   To set the time amount for all tables:
      set_data_collector_time_policy(<interval>);

The current retention policy for a component can be queried with:
   get_data_collector_policy(<component>);

Data on disk is kept in the "DataCollector" directory under the Vertica
\catalog path. This directory also contains instructions on how to load
the monitoring data into another Vertica database.

To move the data collector logs and instructions to other storage locations,
create labeled storage locations using add_location and then use:

   set_data_collector_storage_location(<storage_label>);

Additional commands can be used to configure the data collection logs.
The log can be cleared with:
clear_data_collector([<optional component>]);
The log can be synchronized with the disk storage using:
flush_data_collector([<optional component>]);

4.3 - FLUSH_DATA_COLLECTOR

Waits until memory logs are moved to disk and then flushes the Data Collector, synchronizing the log with disk storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

FLUSH_DATA_COLLECTOR( [ 'component' ] )

Parameters

component

Flushes data for the specified component. If you omit this argument, the function flushes data for all components.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Privileges

Superuser

Examples

The following command flushes the Data Collector for the ResourceAcquisitions component:

=> SELECT flush_data_collector('ResourceAcquisitions');
 flush_data_collector
----------------------
 FLUSH
(1 row)

The following command flushes data collection for all components:

=> SELECT flush_data_collector();
 flush_data_collector
----------------------
 FLUSH
(1 row)

4.4 - GET_DATA_COLLECTOR_POLICY

Retrieves a brief statement about the retention policy for the specified component.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_DATA_COLLECTOR_POLICY( 'component' )

Parameters

component

Returns the retention policy of the specified component.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Privileges

None

Examples

The following query returns the history of all resource acquisitions by specifying the ResourceAcquisitions component:

=> SELECT get_data_collector_policy('ResourceAcquisitions');
          get_data_collector_policy
----------------------------------------------
 1000KB kept in memory, 10000KB kept on disk.
(1 row)

4.5 - SET_DATA_COLLECTOR_POLICY

Updates the following retention policy properties for the specified component:.

Updates the following retention policy properties for the specified component:

MEMORY_BUFFER_SIZE_KB
DISK_SIZE_KB
INTERVAL_TIME

Before you change a retention policy, you can view its current settings by querying system table DATA_COLLECTOR or by calling meta-function GET_DATA_COLLECTOR_POLICY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DATA_COLLECTOR_POLICY('component', 'memory-buffer-size', 'disk-size' [,'interval-time']  )

Parameters

component

Specifies the retention policy to update.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

memory-buffer-size

Specifies in kilobytes the maximum amount of data that is buffered in memory before moving it to disk. The policy retention policy property MEMORY_BUFFER_SIZE_KB is set from this value.

Caution

If you set this parameter to 0, the function returns with a warning that the Data Collector cannot retain any data for this component in memory or on disk.

Consider setting this parameter to a high value in the following cases:

Unusually high levels of data collection. If memory-buffer-size is set too low, the Data Collector might be unable to flush buffered data to disk fast enough to keep up with the activity level, which can lead to loss of in-memory data.
Very large data collector records—for example, records with very long query strings. The Data Collector uses double-buffering, so it cannot retain in memory records that are more than 50 percent larger than memory-buffer-size.

disk-size

Specifies in kilobytes the maximum disk space allocated for this component's Data Collector table. The policy retention policy property DISK_SIZE_KB is set from this value. If set to 0, the Data Collector retains only as much component data as it can buffer in memory, as specified by memory-buffer-size.

interval-time

INTERVAL data type that specifies how long data of a given component is retained in that component's Data Collector table. The retention policy property INTERVAL_TIME is set from this value. If you set this parameter to a positive value, it also changes the policy property INTERVAL_SET to t (true).

For example, if you specify component TupleMoverEvents and set interval-time to an interval of two days ('2 days'::interval), the Data Collector table dc_tuple_mover_events retains records of Tuple Mover activity over the last 48 hours. Older Tuple Mover data are automatically dropped from this table.

Note

Setting a component's policy's INTERVAL_TIME property has no effect on how much data storage the Data Collector retains on disk for that component. Maximum disk storage capacity is determined by the DISK_SIZE_KB property. Setting the INTERVAL_TIME property only affects how long data is retained by the component's Data Collector table. For details, see Configuring data retention policies.

To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:

INTERVAL_SET: f
INTERVAL_TIME: 0

With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.

Privileges

Superuser

Examples

See Configuring data retention policies.

4.6 - SET_DATA_COLLECTOR_TIME_POLICY

Updates the retention policy property INTERVAL_TIME for the specified component.

Updates the retention policy property INTERVAL_TIME for the specified component. Calling this function has no effect on other properties of the same component. You can use this function to update the INTERVAL_TIME property of all component retention policies.

To set other retention policy properties, call SET_DATA_COLLECTOR_POLICY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DATA_COLLECTOR_TIME_POLICY( ['component',] 'interval-time' )

Parameters

component

Specifies the retention policy to update. If you omit this argument, Vertica updates the retention policy of all Data Collector components.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

interval-time

Note

To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:

INTERVAL_SET: f
INTERVAL_TIME: 0

With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.

Privileges

Superuser

Examples

See Configuring data retention policies.

5 - Database functions

This section contains the database management functions specific to Vertica.

5.1 - CLEAR_RESOURCE_REJECTIONS

Clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables.

Clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables. Normally, these tables are only cleared during a node restart. This function lets you clear the tables whenever you need. For example, you might want to clear the system tables after you resolved a disk space issue that was causing disk resource rejections.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Immutable

Syntax

CLEAR_RESOURCE_REJECTIONS();

Privileges

Superuser

Examples

The following command clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables:

=> SELECT clear_resource_rejections();
clear_resource_rejections
---------------------------
 OK
(1 row)

5.2 - COMPACT_STORAGE

Bundles existing data (.fdb) and index (.pidx) files into the .gt file format.

Bundles existing data (.fdb) and index (.pidx) files into the .gt file format. The .gt format is enabled by default for data files created version 7.2 or later. If you upgrade a database from an earlier version, use COMPACT_STORAGE to bundle storage files into the .gt format. Your database can continue to operate with a mix of file storage formats.

If the settings you specify for COMPACT_STORAGE vary from the limit specified in configuration parameter MaxBundleableROSSizeKB, Vertica does not change the size of the automatically created bundles.

Note

Run this function during periods of low demand.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SELECT COMPACT_STORAGE ('[[[database.]schema.]object-name]', min-ros-filesize-kb, 'small-or-all-files', 'simulate');

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

object-name

Specifies the table or projection to bundle. If set to an empty string, COMPACT_STORAGE evaluates the data of all projections in the database for bundling.

min-ros-filesize-kb

Integer ≥ 1, specifies in kilobytes the minimum size of an independent ROS file. COMPACT_STORAGE bundles storage container ROS files below this size into a single file.

small-or-all-files

One of the following:

small: Bundles only files smaller than the limit specified in min-ros-filesize-kb
all: Bundles files smaller than the limit specified in min-ros-filesize-kb and bundles the .fdb and .pidx files for larger storage containers.

simulate

Specifies whether to simulate the storage settings and produce a report describing the impact of those settings.

true: Produces a report on the impact of the specified bundle settings without actually bundling storage files.
false: Performs the bundling as specified.

Privileges

Superuser

Storage and performance impact

Bundling reduces the number of files in your file system by at least fifty percent and improves the performance of file-intensive operations. Improved operations include backups, restores, and mergeout.

Vertica creates small files for the following reasons:

Tables contain hundreds of columns.
Partition ranges are small (partition by minute).
Local segmentation is enabled and your factor is set to a high value.

Examples

The following example describes the impact of bundling the table EMPLOYEES:

=> SELECT COMPACT_STORAGE('employees', 1024,'small','true');
Task: compact_storage

On node v_vmart_node0001:
Projection Name :public.employees_b0 | selected_storage_containers :0 |
selected_files_to_compact :0 | files_after_compact : 0 | modified_storage_KB :0

On node v_vmart_node0002:
Projection Name :public.employees_b0 | selected_storage_containers :1 |
selected_files_to_compact :6 | files_after_compact : 1 | modified_storage_KB :0

On node v_vmart_node0003:
Projection Name :public.employees_b0 | selected_storage_containers :2 |
selected_files_to_compact :12 | files_after_compact : 2 | modified_storage_KB :0

On node v_vmart_node0001:
Projection Name :public.employees_b1 | selected_storage_containers :2 |
selected_files_to_compact :12 | files_after_compact : 2 | modified_storage_KB :0

On node v_vmart_node0002:
Projection Name :public.employees_b1 | selected_storage_containers :0 |
selected_files_to_compact :0 | files_after_compact : 0 | modified_storage_KB :0

On node v_vmart_node0003:
Projection Name :public.employees_b1 | selected_storage_containers :1 |
selected_files_to_compact :6 | files_after_compact : 1 | modified_storage_KB :0

Success

(1 row)

5.3 - DUMP_LOCKTABLE

Returns information about deadlocked clients and the resources they are waiting for.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DUMP_LOCKTABLE()

Privileges

None

Notes

Use DUMP_LOCKTABLE if Vertica becomes unresponsive:

Open an additional vsql connection.
Execute the query:
```
=> SELECT DUMP_LOCKTABLE();
```
The output is written to vsql. See Monitoring the Log Files.

You can also see who is connected using the following command:

=> SELECT * FROM SESSIONS;

Close all sessions using the following command:

=> SELECT CLOSE_ALL_SESSIONS();

Close a single session using the following command:

=> SELECT CLOSE_SESSION('session_id');

You get the session_id value from the V_MONITOR.SESSIONS system table.

5.4 - DUMP_PARTITION_KEYS

Dumps the partition keys of all projections in the system.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DUMP_PARTITION_KEYS( )

Note

The ROS objects of partitioned tables without partition keys are ignored by the tuple mover and are not merged during automatic tuple mover operations.

Privileges

User must have select privileges on the table or usage privileges on the schema.

Examples

=> SELECT DUMP_PARTITION_KEYS( );
Partition keys on node v_vmart_node0001
  Projection 'states_b0'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: NH
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: MA
  Projection 'states_b1'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: VT
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: ME
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: CT

5.5 - GET_CONFIG_PARAMETER

Gets the value of a configuration parameter at the specified level.

Gets the value of a configuration parameter at the specified level. If no value is set at that level, the function returns an empty row.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_CONFIG_PARAMETER( 'parameter-name' [, 'level' | NULL] )

Parameters

parameter-name

Name of the configuration parameter value to get.

level

Level at which to get parameter-name's setting, one of the following string values:

user: Current user
session: Current session
node-name: Name of database node

If level is omitted or set to NULL, GET_CONFIG_PARAMETER returns the database setting.

Privileges

None

Examples

Get the AnalyzeRowCountInterval parameter at the database level:

=> SELECT GET_CONFIG_PARAMETER ('AnalyzeRowCountInterval');
 GET_CONFIG_PARAMETER
----------------------
 3600

Get the MaxSessionUDParameterSize parameter at the session level:

=> SELECT GET_CONFIG_PARAMETER ('MaxSessionUDParameterSize','session');
 GET_CONFIG_PARAMETER
----------------------
 2000
(1 row)

Get the UseDepotForReads parameter at the user level:

=> SELECT GET_CONFIG_PARAMETER ('UseDepotForReads', 'user');
 GET_CONFIG_PARAMETER
----------------------
 1
(1 row)

5.6 - KERBEROS_CONFIG_CHECK

Tests the Kerberos configuration of a Vertica cluster.

Tests the Kerberos configuration of a Vertica cluster. The function succeeds if it can kinit with both the keytab file and the current user's credential, and reports errors otherwise.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

KERBEROS_CONFIG_CHECK( )

Parameters

This function has no parameters.

Privileges

This function does not require privileges.

Examples

The following example shows the results when the Kerberos configuration is valid.

=> SELECT KERBEROS_CONFIG_CHECK();
    kerberos_config_check
-----------------------------------------------------------------------------
 ok: krb5 exists at [/etc/krb5.conf]
 ok: Vertica Keytab file is set to [/etc/vertica.keytab]
 ok: Vertica Keytab file exists at [/etc/vertica.keytab]
[INFO] KerberosCredentialCache [/tmp/vertica_D4/vertica450676899262134963.cc]
 Kerberos configuration parameters set in the database
        KerberosServiceName : [vertica]
        KerberosHostname : [data.hadoop.com]
        KerberosRealm : [EXAMPLE.COM]
        KerberosKeytabFile : [/etc/vertica.keytab]
 Vertica Principal: [vertica/data.hadoop.com@EXAMPLE.COM]
 [OK] Vertica can kinit using keytab file
 [OK] User [bob] has valid client authentication for kerberos principal [bob@EXAMPLE.COM]]

(1 row)

5.7 - MEMORY_TRIM

Calls glibc function malloc_trim() to reclaim free memory from malloc and return it to the operating system.

Calls glibc function malloc_trim() to reclaim free memory from malloc and return it to the operating system. Details on the trim operation are written to system table MEMORY_EVENTS.

Unless you turn off memory polling, Vertica automatically detects when glibc accumulates an excessive amount of free memory in its allocation arena. When this occurs, Vertica consolidates much of this memory and returns it to the operating system. Call this function if you disable memory polling and wish to reduce glibc-allocated memory manually.

For more information, see Memory trimming.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

MEMORY_TRIM()

Privileges

Superuser

Examples

=> SELECT memory_trim();
                           memory_trim
-----------------------------------------------------------------
 Pre-RSS: [378822656] Post-RSS: [372129792] Benefit: [0.0176675]
(1 row)

5.8 - PURGE

Permanently removes delete vectors from ROS storage containers so disk space can be reused.

Permanently removes delete vectors from ROS storage containers so disk space can be reused. PURGE removes all historical data up to and including the Ancient History Mark epoch.

PURGE does not delete temporary tables.

Caution

PURGE can temporarily use significant disk space.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SELECT PURGE()

Privileges

Table owner
USAGE privilege on schema

Examples

After you delete data from a Vertica table, that data is marked for deletion. To see the data that is marked for deletion, query system table DELETE_VECTORS.

Run PURGE to remove the delete vectors from ROS containers.

=> SELECT * FROM test1;
 number
--------
      3
     12
     33
     87
     43
     99
(6 rows)
=> DELETE FROM test1 WHERE number > 50;
 OUTPUT
--------
      2
(1 row)
=> SELECT * FROM test1;
 number
--------
     43
      3
     12
     33
(4 rows)
=> SELECT node_name, projection_name, deleted_row_count FROM DELETE_VECTORS;
    node_name     | projection_name | deleted_row_count
------------------+-----------------+-------------------
 v_vmart_node0002 | test1_b1        |                 1
 v_vmart_node0001 | test1_b1        |                 1
 v_vmart_node0001 | test1_b0        |                 1
 v_vmart_node0003 | test1_b0        |                 1
(4 rows)
=> SELECT PURGE();
...
(Table: public.test1) (Projection: public.test1_b0)
(Table: public.test1) (Projection: public.test1_b1)
...
(4 rows)

After the ancient history mark (AHM) advances:

=> SELECT * FROM DELETE_VECTORS;
 (No rows)

5.9 - RUN_INDEX_TOOL

Runs the Index tool on a Vertica database to perform one of these tasks:.

Runs the Index tool on a Vertica database to perform one of these tasks:

Run a per-block cyclic redundancy check (CRC) on data storage to verify data integrity.
Check that the sort order in ROS containers is correct.

The function writes summary information about its operation to standard output; detailed information on results is logged in vertica.log on the current node. For more about evaluating tool output, see:

You can also run the Index tool on a database that is down, from the Linux command line. For details, see CRC and sort order check.

Caution

Use this function only under guidance from Vertica Support.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RUN_INDEX_TOOL ( 'taskType', global, '[projFilter]' [, numThreads ] );

Parameters

taskType

Specifies the operation to run, one of the following:

checkcrc: Run a cyclic redundancy check (CRC) on each block of existing data storage to check the data integrity of ROS data blocks.
checksort: Evaluate each ROS row to determine whether it is sorted correctly. If ROS data is not sorted correctly in the projection's order, query results that rely on sorted data will be incorrect.

global

Boolean, specifies whether to run the specified task on all nodes (true), or the current one (false).

projFilter

Specifies the scope of the operation:

Empty string (''): Run the check on all projections.
A string that specifies one or more projections as follows:
- projection-name: Run the check on this projection
- projection-prefix*: Run the check on all projections that begin with the string projection-prefix.

numThreads

An unsigned (positive) or signed (negative) integer that specifies the number of threads used to run this operation:

n: Number of threads, ≥ 1
-n: Negative integer, denotes a fraction of all CPU cores as follows:
```
num-cores / n
```
Thus, -1 specifies all cores, -2, half the cores, -3, a third of all cores, and so on.

Default: 1

Privileges

Superuser

Optimizing performance

You can optimize meta-function performance by setting two parameters:

projFilter: Narrows the scope of the operation to one or more projections.
numThreads: Specifies the number of threads used to execute the function.

5.10 - SECURITY_CONFIG_CHECK

Returns the status of various security-related parameters.

Returns the status of various security-related parameters. Use this function to verify completeness of your TLS configuration.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SECURITY_CONFIG_CHECK( 'db-component' )

Parameters

db-component: The component to check. Currently, NETWORK is the only supported component.
NETWORK: Returns the status and parameters for spread encryption, internode TLS, and client-server TLS.

Examples

In this example, SECURITY_CONFIG_CHECK shows that spread encryption and data channel TLS are disabled because EncryptSpreadComm is disabled and the data_channel TLS Configuration is not configured.

Similarly, client-server TLS is disabled because the TLS Configuration "server" has a server certificate, but its TLSMODE is disabled. Setting TLSMODE to 'Enable' enables server mode client-server TLS. See TLS protocol for details.

=> SELECT SECURITY_CONFIG_CHECK('NETWORK');
                                            SECURITY_CONFIG_CHECK
----------------------------------------------------------------------------------------------------------------------
Spread security details:
* EncryptSpreadComm = []
Spread encryption is disabled
It is NOT safe to set/change other security config parameters while spread is not encrypted!
Please set EncryptSpreadComm to enable spread encryption first

Data Channel security details:
 TLS Configuration 'data_channel' TLSMODE is DISABLE
TLS on the data channel is disabled
Please set EncryptSpreadComm and configure TLS Configuration 'data_channel' to enable TLS on the data channel

Client-Server network security details:
* TLS Configuration 'server' TLSMODE is DISABLE
* TLS Configuration 'server' has a certificate set
Client-Server TLS is disabled
To enable Client-Server TLS set a certificate on TLS Configuration 'server' and/or set the tlsmode to 'ENABLE' or higher

(1 row)

5.11 - SET_CONFIG_PARAMETER

Sets or clears a configuration parameter at the specified level.

Important

You can only use this function to set configuration parameters with string or integer values. To set configuration parameters that accept other data types, use the appropriate ALTER statement.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_CONFIG_PARAMETER( 'param-name', { param-value | NULL}, ['level'| NULL])

Arguments

param-name

Name of the configuration parameter to set.

param-value

Value to set for param-name, either a string or integer. If a string, enclose in single quotes; if an integer, single quotes are optional.

To clear param-name at the specified level, set to NULL.

level

Level at which to set param-name, one of the following string values:

user: Current user.
session: Current session, overrides the database setting.
node-name: Name of database node, overrides session and database settings.

If level is omitted or set to NULL, param-name is set at the database level.

Note

Some parameters require restart for the value to take effect.

Privileges

Superuser

Examples

Set the AnalyzeRowCountInterval parameter to 3600 at the database level:

=> SELECT SET_CONFIG_PARAMETER('AnalyzeRowCountInterval',3600);
    SET_CONFIG_PARAMETER
----------------------------
 Parameter set successfully
(1 row)

Note

You can achieve the same result with ALTER DATABASE:

ALTER DATABASE DEFAULT SET PARAMETER AnalyzeRowCountInterval = 3600;

Set the MaxSessionUDParameterSize parameter to 2000 at the session level.

=> SELECT SET_CONFIG_PARAMETER('MaxSessionUDParameterSize',2000,'SESSION');
    SET_CONFIG_PARAMETER
----------------------------
 Parameter set successfully
(1 row)

5.12 - SET_SPREAD_OPTION

Changes daemon settings.

Changes spread daemon settings. This function is mainly used to set the timeout before spread assumes a node has gone down.

Note

Changing Spread settings with SET_SPREAD_OPTION has minor impact on your cluster as it pauses while the new settings are propagated across the cluster. Because of this delay, changes to the Spread timeout are not immediately visible in system table SPREAD_STATE.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_SPREAD_OPTION( option-name, option-value )

Parameters

option-name: String containing the spread daemon setting to change.
Currently, this function supports only one option: TokenTimeout. This setting controls how long spread waits for a node to respond to a message before assuming it is lost. See Adjusting Spread Daemon timeouts for virtual environments for more information.
option-value: The new setting for option-name.

Examples

=> SELECT SET_SPREAD_OPTION( 'TokenTimeout', '35000');
NOTICE 9003:  Spread has been notified about the change
                   SET_SPREAD_OPTION
--------------------------------------------------------
 Spread option 'TokenTimeout' has been set to '35000'.

(1 row)

=> SELECT * FROM V_MONITOR.SPREAD_STATE;
    node_name     | token_timeout
------------------+---------------
 v_vmart_node0001 |         35000
 v_vmart_node0002 |         35000
 v_vmart_node0003 |         35000
(3 rows);

5.13 - SHUTDOWN

Shuts down a Vertica database.

Shuts down a Vertica database. By default, the shutdown fails if any users are connected. You can check the status of the shutdown operation in the vertica.log file.

In Eon Mode, you can call SHUTDOWN_WITH_DRAIN to perform a graceful shutdown that drains client connections and then shuts down the database.

Tip

Before calling SHUTDOWN, you can close all current user connections and prevent further connection attempts as follows:

Temporarily set configuration parameter MaxClientSessions to 0.
Call CLOSE_ALL_SESSIONS to close all non-dbamin connections.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SHUTDOWN ( [ 'false' | 'true' ] )

Parameters

false: Default, returns a message if users are connected and aborts the shutdown.
true: Forces the database to shut down, disallowing further connections.

Privileges

Superuser

Examples

The following command attempts to shut down the database. Because users are connected, the command fails:

=> SELECT SHUTDOWN('false');
NOTICE:  Cannot shut down while users are connected
          SHUTDOWN
-----------------------------
 Shutdown: aborting shutdown
(1 row)

6 - Eon Mode functions

The following functions are meant to be used in Eon Mode.

6.1 - ALTER_LOCATION_SIZE

Resizes on one node, all nodes in a subcluster, or all nodes in the database.

Eon Mode only

Resizes the depot on one node, all nodes in a subcluster, or all nodes in the database.

Important

Reducing the size of the depot is liable to increase contention over depot usage and require frequent evictions. This behavior can increase the number of queries and load operations that are routed to communal storage for processing, which can incur slower performance and increased access charges.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Immutable

Syntax

ALTER_LOCATION_SIZE( 'location', '[target]', 'size')

Parameters

location

Specifies the location to resize, one of the following:

depot: Resizes the node's current depot.
The depot's absolute path in the Linux filesystem. If you change the depot size on multiple nodes and specify a path, the path must be identical on all affected nodes . By default, this is not the case, as the node's name is typically this path. For example, the default depot path for node 1 in the verticadb database is /vertica/data/verticadb/v_verticadb_node0001_depot.

target

The node or nodes on which to change the depot, one of the following:

Node name: Resize the specified node.
Subcluster name: Resize depots of all nodes in the specified subcluster.
Empty string: Resize all depots in the database.

size

Valid only if the storage location usage type is set to DEPOT, specifies the maximum amount of disk space that the depot can allocate from the storage location's file system.

You can specify size in two ways:

integer%: Percentage of storage location disk size.
integer{K|M|G|T}: Amount of storage location disk size in kilobytes, megabytes, gigabytes, or terabytes.

Important

The depot size cannot exceed 80 percent of the file system disk space where the depot is stored. If you specify a value that is too large, Vertica issues a warning and automatically changes the value to 80 percent of the file system size.

Privileges

Superuser

Examples

Increase depot size on all nodes to 80 percent of file system:

=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
    node_name     | location_label  |      location_path      |  max_size   | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
 v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
 v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 36059377664 | 70%
 v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
(3 rows)

=> SELECT alter_location_size('depot', '','80%');
 alter_location_size
---------------------
 depotSize changed.
(1 row)

=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
    node_name     | location_label  |      location_path      |  max_size   | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
 v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
 v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 41210717184 | 80%
 v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
(3 rows)

Change the depot size to 75% of the filesystem size for all nodes in the analytics subcluster:

=> SELECT subcluster_name, subclusters.node_name, storage_locations.max_size, storage_locations.disk_percent FROM subclusters INNER JOIN storage_locations ON subclusters.node_name = storage_locations.node_name WHERE storage_locations.location_usage='DEPOT';
  subcluster_name   |      node_name       |   max_size  | disk_percent
--------------------+----------------------+----------------------------
 default_subcluster | v_verticadb_node0001 | 25264737485 | 60%
 default_subcluster | v_verticadb_node0002 | 25264737485 | 60%
 default_subcluster | v_verticadb_node0003 | 25264737485 | 60%
 analytics          | v_verticadb_node0004 | 25264737485 | 60%
 analytics          | v_verticadb_node0005 | 25264737485 | 60%
 analytics          | v_verticadb_node0006 | 25264737485 | 60%
 analytics          | v_verticadb_node0007 | 25264737485 | 60%
 analytics          | v_verticadb_node0008 | 25264737485 | 60%
 analytics          | v_verticadb_node0009 | 25264737485 | 60%
(9 rows)

=> SELECT ALTER_LOCATION_SIZE('depot','analytics','75%');
 ALTER_LOCATION_SIZE
---------------------
 depotSize changed.
(1 row)

=> SELECT subcluster_name, subclusters.node_name, storage_locations.max_size, storage_locations.disk_percent FROM subclusters INNER JOIN storage_locations ON subclusters.node_name = storage_locations.node_name WHERE storage_locations.location_usage='DEPOT';
  subcluster_name   |      node_name       |   max_size  | disk_percent
--------------------+----------------------+----------------------------
 default_subcluster | v_verticadb_node0001 | 25264737485 | 60%
 default_subcluster | v_verticadb_node0002 | 25264737485 | 60%
 default_subcluster | v_verticadb_node0003 | 25264737485 | 60%
 analytics          | v_verticadb_node0004 | 31580921856 | 75%
 analytics          | v_verticadb_node0005 | 31580921856 | 75%
 analytics          | v_verticadb_node0006 | 31580921856 | 75%
 analytics          | v_verticadb_node0007 | 31580921856 | 75%
 analytics          | v_verticadb_node0008 | 31580921856 | 75%
 analytics          | v_verticadb_node0009 | 31580921856 | 75%
(9 rows)

6.2 - BACKGROUND_DEPOT_WARMING

Vertica version 10.0.0 removes support for foreground depot warming.

Eon Mode only

Deprecated

Vertica version 10.0.0 removes support for foreground depot warming. When enabled, depot warming always happens in the background. Because foreground depot warming no longer exists, this function serves no purpose and has been deprecated. Calling it has no effect.

Forces a node that is warming its depot to start processing queries while continuing to warm its depot in the background. Depot warming only occurs when a node is joining the database and is activating its subscriptions. This function only has an effect if:

The database is running in Eon Mode.
The node is currently warming its depot.
The node is warming its depot from communal storage. This is the case when the UseCommunalStorageForBatchDepotWarming configuration parameter is set to the default value of 1. See Eon Mode parameters for more information about this parameter.

After calling this function, the node warms its depot in the background while taking part in queries.

This function has no effect on a node that is not warming its depot.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

BACKGROUND_DEPOT_WARMING('node-name' [, 'subscription-name'])

Arguments

node-name: The name of the node that you want to warm its depot in the background.
subscription-name: The name of a shard that the node subscribes to that you want the node to warm in the background. You can find the names of the shards a node subscribes to in the SHARD_NAME column of the NODE_SUBSCRIPTIONS system table.

Note
When you supply the name of a specific shard subscription to warm in the background, the node may not immediately begin processing queries. It continues to warm any other shard subscriptions in the foreground if they are not yet warm. The node does not begin taking part in queries until it finishes warming the other subscriptions.

Return value

A message indicating that the node's warming will continue in the background.

Privileges

The user must be a superuser .

Examples

The following example demonstrates having node 6 of the verticadb database warm its depot in the background:


=> SELECT BACKGROUND_DEPOT_WARMING('v_verticadb_node0006');
                          BACKGROUND_DEPOT_WARMING
----------------------------------------------------------------------------
 Depot warming running in background. Check monitoring tables for progress.
(1 row)

6.3 - CANCEL_DEPOT_WARMING

Cancels depot warming on a node.

Eon Mode only

Cancels depot warming on a node. Depot warming only occurs when a node is joining the database and is activating its subscriptions. You can choose to cancel all warming on the node, or cancel the warming of a specific shard's subscription. The node finishes whatever data transfers it is currently carrying out to warm its depot and removes pending warming-related transfers from its queue. It keeps any data it has already loaded into its depot. If you cancel warming for a specific subscription, it stops warming its depot if all of its other subscriptions are warmed. If they aren't warmed, the node continues to warm those other subscriptions.

This function only has an effect if:

The database is running in Eon Mode.
The node is currently warming its depot.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CANCEL_DEPOT_WARMING('node-name' [, 'subscription-name'])

Arguments

'node-name': The name of the node whose depot warming you want canceled.
'subscription-name': The name of a shard that the node subscribes to that you want the node to stop warming. You can find the names of the shards a node subscribes to in the SHARD_NAME column of the NODE_SUBSCRIPTIONS system table.

Return value

Returns a message indicating warming has been canceled.

Privileges

The user must be a superuser.

Usage considerations

Canceling depot warming can negatively impact the performance of your queries. A node with a cold depot may have to retrieve much of its data from communal storage, which is slower than accessing the depot.

Examples

The following demonstrates canceling the depot warming taking place on node 7:


=> SELECT CANCEL_DEPOT_WARMING('v_verticadb_node0007');
   CANCEL_DEPOT_WARMING
--------------------------
 Depot warming cancelled.
(1 row)

6.4 - CANCEL_DRAIN_SUBCLUSTER

Cancels the draining of a subcluster or subclusters.

Eon Mode only

Cancels the draining of a subcluster or subclusters. This function can cancel draining operations that were started by either START_DRAIN_SUBCLUSTER or the draining portion of the SHUTDOWN_WITH_DRAIN function. CANCEL_DRAIN_SUBCLUSTER marks all nodes in the designated subclusters as not draining. The previously draining nodes again accept new client connections and connections redirected from load-balancing.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CANCEL_DRAIN_SUBCLUSTER( 'subcluster-name' )

Arguments

subcluster-name: Name of the subcluster whose draining operation to cancel. Enter an empty string to cancel the draining operation on all subclusters.

Privileges

Superuser

Examples

The following example demonstrates how to cancel a draining operation on a subcluster.

First, you can query the DRAINING_STATUS system table to view which subclusters are currently draining:


=> SELECT node_name, subcluster_name, is_draining FROM draining_status ORDER BY 1;
node_name          | subcluster_name    | is_draining
-------------------+--------------------+-------
verticadb_node0001 | default_subcluster | f
verticadb_node0002 | default_subcluster | f
verticadb_node0003 | default_subcluster | f
verticadb_node0004 | analytics          | t
verticadb_node0005 | analytics          | t
verticadb_node0006 | analytics          | t

The following function call cancels the draining of the analytics subcluster:


=> SELECT CANCEL_DRAIN_SUBCLUSTER('analytics');
             CANCEL_DRAIN_SUBCLUSTER
--------------------------------------------------------
Targeted subcluster: 'analytics'
Action: CANCEL DRAIN

(1 row)

To confirm that the subcluster is no longer draining, you can again query the DRAINING_STATUS system table:


=> SELECT node_name, subcluster_name, is_draining FROM draining_status ORDER BY 1;
node_name          | subcluster_name    | is_draining
-------------------+--------------------+-------
verticadb_node0001 | default_subcluster | f
verticadb_node0002 | default_subcluster | f
verticadb_node0003 | default_subcluster | f
verticadb_node0004 | analytics          | f
verticadb_node0005 | analytics          | f
verticadb_node0006 | analytics          | f
(6 rows)

6.5 - CLEAN_COMMUNAL_STORAGE

Marks for deletion invalid data in communal storage, often data that leaked due to an event where Vertica cleanup mechanisms failed.

Eon Mode only

Marks for deletion invalid data in communal storage, often data that leaked due to an event where Vertica cleanup mechanisms failed. Events that require calling this function include:

Node failure
Interrupted migration of an Enterprise database to Eon
Restoring objects from backup

Tip

It is generally good practice to call CLEAN_COMMUNAL_STORAGE soon after completing an Enterprise-to-Eon migration, and reviving the migrated Eon database.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAN_COMMUNAL_STORAGE ( ['actually-delete'] )

Parameters

actually-delete

BOOLEAN, specifies whether to queue data files for deletion:

true (default): Add files to the reaper queue and return immediately. The queued files are removed automatically by the reaper service, or can be removed manually by calling FLUSH_REAPER_QUEUE.
false: Report information about extra files but do not queue them for deletion.

Privileges

Superuser

Examples

=> SELECT CLEAN_COMMUNAL_STORAGE('true')
CLEAN_COMMUNAL_STORAGE
------------------------------------------------------------------
CLEAN COMMUNAL STORAGE
Task was canceled.
Total leaked files: 9265
Total size: 4236501526
Files have been queued for deletion.
Check communal_cleanup_records for more information.
(1 row)

6.6 - CLEAR_DATA_DEPOT

Deletes the specified depot data.

Eon Mode only

Deletes the specified depot data. You can clear depot data of a single table or all tables, from one subcluster, a single node, or the entire database cluster. Clearing depot data can incur extra processing time for any subsequent queries that require that data and must now fetch it from communal storage. Clearing depot data has no effect on communal storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_DATA_DEPOT( [ '[table-name]' [, '[target-depots]'] ] )

Arguments

To clear all depot data from the database cluster, call this function with no arguments.

table-name

Name of the table to delete from the target depots. If you omit a table name or supply an empty string, data of all tables is deleted from the target depots.

target-depots

The depots to clear, one of the following:

subcluster-name: Name of the depot subcluster, default_subcluster to specify the default database subcluster.
node-name: Clears depot data from the specified node. Depot data on other nodes in the same subcluster are unaffected.

This argument optionally qualifies the argument for table-name. If you omit this argument or supply an empty string, Vertica clears all depot data from the database cluster.

Privileges

Superuser

Examples

Clear the cached data of one table from the specified subcluster depot:

=> SELECT CLEAR_DATA_DEPOT('t1', 'subcluster_1');
 clear_data_depot
------------------
 Depot cleared
(1 row)

Clear all depot data that is cached on the specified subcluster:.

=> SELECT CLEAR_DATA_DEPOT('', 'subcluster_1');
 clear_data_depot
------------------
 Depot cleared
(1 row)

Clear all depot data that is cached on the specified node:

=> select clear_data_depot('','v_vmart_node0001');
 clear_data_depot
------------------
 Depot cleared
(1 row)

Clear all data of the specified table from the depots of all cluster nodes:

=> SELECT CLEAR_DATA_DEPOT('t1');
 clear_data_depot
------------------
 Depot cleared
(1 row)

Clear all depot data from the database cluster:

=> SELECT CLEAR_DATA_DEPOT();
 clear_data_depot
------------------
 Depot cleared
(1 row)

6.7 - CLEAR_DEPOT_ANTI_PIN_POLICY_PARTITION

Removes an anti-pinning policy from the specified partition.

Eon Mode only

Removes an anti-pinning policy from the specified partition.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_DEPOT_ANTI_PIN_POLICY_PARTITION( '[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, 'subcluster'] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
object-name: Table or projection with a partition anti-pinning policy to clear.
min-range-value, max-range-value: Range of partition keys in table from which to clear an anti-pinning policy, where min‑range‑value must be ≤ max‑range‑value. To specify a single partition key, min‑range‑value and max‑range‑value must be equal.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.

Privileges

Superuser

6.8 - CLEAR_DEPOT_ANTI_PIN_POLICY_PROJECTION

Removes an anti-pinning policy from the specified projection.

Eon Mode only

Removes an anti-pinning policy from the specified projection.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_DEPOT_ANTI_PIN_POLICY_PROJECTION( '[[database.]schema.]projection' [, 'subcluster' ] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: Projection with the anti-pinning policy to clear.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.

Privileges

Superuser

6.9 - CLEAR_DEPOT_ANTI_PIN_POLICY_TABLE

Removes an anti-pinning policy from the specified table.

Eon Mode only

Removes an anti-pinning policy from the specified table.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_DEPOT_PIN_POLICY_TABLE( '[[database.]schema.]table' [, 'subcluster' ] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Table with the anti-pinning policy to clear.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.

Privileges

Superuser

6.10 - CLEAR_DEPOT_PIN_POLICY_PARTITION

Clears a depot pinning policy from the specified table or projection partitions.

Eon Mode only

Clears a depot pinning policy from the specified table or projection partitions. After the object is unpinned, it can be evicted from the depot by any unpinned or pinned object.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_DEPOT_PIN_POLICY_PARTITION( '[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, subcluster ] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
object-name: Table or projection with a partition pinning policy to clear.
min-range-value, max-range-value: Range of partition keys in table from which to clear a pinning policy, where min‑range‑value must be ≤ max‑range‑value. To specify a single partition key, min‑range‑value and max‑range‑value must be equal.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.

Privileges

Superuser

6.11 - CLEAR_DEPOT_PIN_POLICY_PROJECTION

Clears a depot pinning policy from the specified projection.

Eon Mode only

Clears a depot pinning policy from the specified projection. After the object is unpinned, it can be evicted from the depot by any unpinned or pinned object.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_DEPOT_PIN_POLICY_PROJECTION( '[[database.]schema.]projection' [, 'subcluster' ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: Projection with a pinning policy to clear.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.

Privileges

Superuser

6.12 - CLEAR_DEPOT_PIN_POLICY_TABLE

Clears a depot pinning policy from the specified table.

Eon Mode only

Clears a depot pinning policy from the specified table. After the object is unpinned, it can be evicted from the depot by any unpinned or pinned object.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_DEPOT_PIN_POLICY_TABLE( '[[database.]schema.]table' [, 'subcluster' ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Table with a pinning policy to clear.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.

Privileges

Superuser

6.13 - CLEAR_FETCH_QUEUE

Removes all entries or entries for a specific transaction from the queue of fetch requests of data from the communal storage.

Eon Mode only

Removes all entries or entries for a specific transaction from the queue of fetch requests of data from the communal storage. You can view the fetch queue by querying the DEPOT_FETCH_QUEUE system table. This function removes all of the queued requests synchronously. It returns after all the fetches have been removed from the queue.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_FETCH_QUEUE([transaction_id])

Parameters

*transaction_id*: The id of the transaction whose fetches will be cleared from the queue. If this value is not specified, all fetches are removed from the fetch queue.

Examples

This example clears all of the queued fetches for all transactions.

=> SELECT CLEAR_FETCH_QUEUE();

    CLEAR_FETCH_QUEUE

--------------------------

Cleared the fetch queue.

(1 row)

This example clears the fetch queue for a specific transaction.

=> SELECT node_name,transaction_id FROM depot_fetch_queue;
      node_name       |  transaction_id
----------------------+-------------------
 v_verticadb_node0001 | 45035996273719510
 v_verticadb_node0003 | 45035996273719510
 v_verticadb_node0002 | 45035996273719510
 v_verticadb_node0001 | 45035996273719777
 v_verticadb_node0003 | 45035996273719777
 v_verticadb_node0002 | 45035996273719777

(6 rows)

=> SELECT clear_fetch_queue(45035996273719510);
    clear_fetch_queue
--------------------------
 Cleared the fetch queue.
(1 row)

=> SELECT node_name,transaction_id from depot_fetch_queue;
      node_name       |  transaction_id
----------------------+-------------------
 v_verticadb_node0001 | 45035996273719777
 v_verticadb_node0003 | 45035996273719777
 v_verticadb_node0002 | 45035996273719777

(3 rows)

6.14 - DEMOTE_SUBCLUSTER_TO_SECONDARY

Converts a to a .

Eon Mode only

Converts a primary subcluster to a secondary subcluster.

Vertica will not allow you to demote a primary subcluster if any of the following are true:

The subcluster contains a critical node.
The subcluster is the only primary subcluster in the database. You must have at least one primary subcluster.
The initiator node is a member of the subcluster you are trying to demote. You must call DEMOTE_SUBCLUSTER_TO_SECONDARY from another subcluster.

Important

This function call can take a long time to complete because all the nodes in the subcluster you are promoting or demoting take a global catalog lock, write a checkpoint, and then commit. This global catalog lock can cause other database tasks to fail with errors.

Schedule calls to this function when other database activity is low.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DEMOTE_SUBCLUSTER_TO_SECONDARY('subcluster-name')

Parameters

subcluster-name: The name of the primary subcluster to demote to a secondary subcluster.

Privileges

Superuser

Examples

The following example demotes the subcluster analytics_cluster to a secondary subcluster:

=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | t
 load_subcluster   | t
(2 rows)

=> SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('analytics_cluster');
 DEMOTE_SUBCLUSTER_TO_SECONDARY
--------------------------------
 DEMOTE SUBCLUSTER TO SECONDARY
(1 row)

=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | f
 load_subcluster   | t
(2 rows)

Attempting to demote the subcluster that contains the initiator node results in an error:

=> SELECT node_name FROM sessions WHERE user_name = 'dbadmin'
   AND client_type = 'vsql';
      node_name
----------------------
 v_verticadb_node0004
(1 row)

=> SELECT node_name, is_primary FROM subclusters WHERE subcluster_name = 'analytics';
      node_name       | is_primary
----------------------+------------
 v_verticadb_node0004 | t
 v_verticadb_node0005 | t
 v_verticadb_node0006 | t
(3 rows)

=> SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('analytics');
ERROR 9204:  Cannot promote or demote subcluster including the initiator node
HINT:  Run this command on another subcluster

6.15 - FINISH_FETCHING_FILES

Fetches to the depot all files that are queued for download from communal storage.

Eon Mode only

Fetches to the depot all files that are queued for download from communal storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

FINISH_FETCHING_FILES()

Privileges

Superuser

Examples

Get all files queued for download:

=> SELECT FINISH_FETCHING_FILES();
      FINISH_FETCHING_FILES
---------------------------------
 Finished fetching all the files
(1 row)

6.16 - FLUSH_REAPER_QUEUE

Deletes all data marked for deletion in the database.

Eon Mode only

Deletes all data marked for deletion in the database. Use this function to remove all data marked for deletion before the reaper service deletes disk files.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

FLUSH_REAPER_QUEUE( [sync-catalog] )

Parameters

*sync-catalog*

Specifies to sync metadata in the database catalog on all nodes before the function executes:

true (default): Sync the database catalog
false: Run without syncing.

Privileges

Superuser

Examples

Remove all files that are marked for deletion:

=> SELECT FLUSH_REAPER_QUEUE();
                 FLUSH_REAPER_QUEUE
-----------------------------------------------------
 Sync'd catalog and deleted all files in the reaper queue.
(1 row)

6.17 - MIGRATE_ENTERPRISE_TO_EON

Migrates an Enterprise database to an Eon Mode database.

Enterprise Mode only

Migrates an Enterprise database to an Eon Mode database. MIGRATE_ENTERPRISE_TO_EON runs in the foreground; until it returns—either with success or an error—it blocks all operations in the same session on the source Enterprise database. If successful, MIGRATE_ENTERPRISE_TO_EON returns with a list of nodes in the migrated database.

If migration is interrupted before the meta-function returns—for example, the client disconnects, or a network outage occurs—the migration returns an error. In this case, call MIGRATE_ENTERPRISE_TO_EON again to restart migration. For details, see Handling Interrupted Migration.

You can repeat migration multiple times to the same communal storage location—for example, to capture changes that occurred in the source database during the previous migration. For details, see Repeating Migration.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

MIGRATE_ENTERPRISE_TO_EON ( 'communal-storage-location', 'depot-location' [, is-dry-run] )

communal-storage-location

URI of communal storage location. For URI syntax examples for each supported schema, see File systems and object stores.

depot-location

Path of Eon depot location, typically:

/vertica/depot

Important

Management Console requires this convention to enable access to depot data and activity.

is-dry-run: Boolean. If set to true, MIGRATE_ENTERPRISE_TO_EON only checks whether the Enterprise source database complies with all migration prerequisites. If the meta-function discovers any compliance issues, it writes these to the migration error log migrate_enterprise_to_eon_error.log in the database directory.
Default: false

Privileges

Superuser

Examples

Migrate an Enterprise database to Eon Mode on AWS:

=> SELECT MIGRATE_ENTERPRISE_TO_EON ('s3://verticadbbucket', '/vertica/depot');
                      migrate_enterprise_to_eon
---------------------------------------------------------------------
 v_vmart_node0001,v_vmart_node0002,v_vmart_node0003,v_vmart_node0004
(1 row)

6.18 - PROMOTE_SUBCLUSTER_TO_PRIMARY

Converts a secondary subcluster to a.

Eon Mode only

Converts a secondary subcluster to a primary subcluster. You cannot use this function to promote the subcluster that contains the initiator node. You must call it while connected to a node in another subcluster.

Important

Schedule calls to this function when other database activity is low.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

PROMOTE_SUBCLUSTER_TO_PRIMARY('subcluster-name')

Parameters

subcluster-name: The name of the secondary cluster to promote to a primary subcluster.

Privileges

Superuser

Examples

The following example promotes the subcluster named analytics_cluster to a primary cluster:

=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | f
 load_subcluster   | t
(2 rows)


=> SELECT PROMOTE_SUBCLUSTER_TO_PRIMARY('analytics_cluster');
 PROMOTE_SUBCLUSTER_TO_PRIMARY
-------------------------------
 PROMOTE SUBCLUSTER TO PRIMARY
(1 row)


=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | t
 load_subcluster   | t
(2 rows)

6.19 - REBALANCE_SHARDS

Rebalances shard assignments in a subcluster or across the entire cluster in Eon Mode.

Eon Mode only

Rebalances shard assignments in a subcluster or across the entire cluster in Eon Mode. If the current session ends, the operation immediately aborts. The amount of time required to rebalance shards scales in a roughly linear fashion based on the number of objects in your database.

Run REBALANCE_SHARDS after you modify your cluster using ALTER NODE or when you add nodes to a subcluster.

Note

Vertica rebalances shards in a subcluster automatically when you:

Remove a node from a subcluster.
Add a new subcluster with the admintools command db_add_subcluster with the -s option followed by a list of hosts.

After you rebalance shards, you will no longer be able to restore objects from a backup taken before the rebalancing. (Full backups are always possible.) After you rebalance, make another full backup so you will be able to restore objects from it in the future.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

REBALANCE_SHARDS(['subcluster-name'])

Parameters

subcluster-name: The name of the subcluster where shards will be rebalanced. If you do not supply this parameter, all subclusters in the database rebalance their shards.

Privileges

Superuser

Examples

The following shows that the nodes in the in the newly-added analytics subcluster do not yet have shard subscriptions. It then calls REBALANCE_SHARDS to update the node's subscriptions:

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;

   subcluster_name    |      node_name       | shard_name  | subscription_state
----------------------+----------------------+-------------+--------------------
 analytics_subcluster | v_verticadb_node0004 |             |
 analytics_subcluster | v_verticadb_node0005 |             |
 analytics_subcluster | v_verticadb_node0006 |             |
 default_subcluster   | v_verticadb_node0001 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0003 | ACTIVE
(12 rows)


=> SELECT REBALANCE_SHARDS('analytics_subcluster');
 REBALANCE_SHARDS
-------------------
 REBALANCED SHARDS
(1 row)

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;

   subcluster_name    |      node_name       | shard_name  | subscription_state
----------------------+----------------------+-------------+--------------------
 analytics_subcluster | v_verticadb_node0004 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0004 | segment0001 | ACTIVE
 analytics_subcluster | v_verticadb_node0004 | segment0003 | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | segment0001 | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | segment0002 | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | segment0002 | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0003 | ACTIVE
(18 rows)

6.20 - RESHARD_DATABASE

Changes the number of shards in a database.

Eon Mode only

Changes the number of shards in a database. This function requires a global catalog lock (GCLX) during runtime. The runtime depends on the size of your catalog.

RESHARD_DATABASE does not immediately affect the storage containers in communal storage. After re-sharding, the new shards still point to the existing containers. If you increase the number of shards in your database, multiple shards will point to the same storage containers. Eventually, the Tuple Mover (TM) mergeout tasks will realign the storage containers with the new shard segmentation bounds. If you want the TM to immediately realign storage containers, call DO_TM_TASK to run a 'RESHARDMERGEOUT' task.

This function does not disrupt most queries. However, the global catalog lock might affect data loads and DDL statements.

Important

RESHARD_DATABASE might be rolled back if you call REBALANCE_SHARDS during runtime. In some cases, rollback is caused by down nodes or nodes that fail during the re-shard process.

Syntax

RESHARD_DATABASE(shard-count)

Arguments

shard-count: A positive integer, the number of shards in the re-sharded database. For information about choosing a suitable shard-count, see Choosing the Number of Shards and the Initial Node Count.

Privileges

Superuser

Examples

See Change the number of shards in the database.

6.21 - SANDBOX_SUBCLUSTER

Creates a sandbox for a secondary subcluster.

Note

Vertica recommends using the admintools sandbox_subcluster command to create sandboxes. This command includes additional sanity checks and validates that the sandboxed nodes are UP after sandbox creation. However, you must use the SANDBOX_SUBCLUSTER function to add additional subclusters to an existing sandbox.

If sandboxing the first subcluster in a sandbox, the nodes in the specified subcluster create a checkpoint of the catalog at function runtime. When these nodes auto-restart in the sandbox cluster, they form a primary subcluster that uses the data and catalog checkpoint from the main cluster. After the nodes successfully restart, the sandbox cluster and the main cluster are mutually isolated and can diverge.

While the nodes in the main cluster sync their metadata to /path-to-communal-storage/metadata/db_name, the nodes in the sandbox sync to /path-to-communal-storage/metadata/sandbox_name.

You can perform standard database operations and queries, such as loading data or creating new tables, in either cluster without affecting the other cluster. For example, dropping a table in the sandbox cluster does not drop the table in the main cluster, and vice versa.

Because both clusters reference the same data files, neither cluster can delete files that existed at the time of sandbox creation. However, files that are created in the sandbox can be removed. Files in the main cluster can be queued for removal, but they are not processed until all active sandboxes are removed.

You cannot nest sandboxes, but you can have more than one subcluster in a sandbox and multiple sandboxes active at the same time. To add an additional secondary subcluster to an existing sandbox, you must first call SANDBOX_SUBCLUSTER in the sandbox cluster and then in the main cluster. For details, see Adding subclusters to existing sandboxes.

This is a meta-function. You must call-meta-functions in a top-level SELECT statement. The function also requires a global catalog lock (GCLX) during runtime.

Behavior type

Volatile

Syntax

SANDBOX_SUBCLUSTER( 'sandbox-name', 'subcluster-name', 'options' )

Arguments

sandbox-name

Name of the sandbox. The name must conform to the following rules:

Consist of at most 30 characters, all of which must have an ASCII code between 36 and 126
Begin with a letter
Unique among all existing databases and sandboxes

subcluster-name

Name of the secondary subcluster to sandbox. Attempting to sandbox a primary subcluster or a subcluster that is already sandboxed results in an error. The nodes in the subcluster must all have a status of UP and provide full subscription coverage for all shards.

options

Currently, there are no options for this function.

Privileges

Superuser

Examples

The following example sandboxes the sc02 secondary subcluster into a sandbox named sand:

=> SELECT SANDBOX_SUBCLUSTER('sand', 'sc_02', '');
                                SANDBOX_SUBCLUSTER
-----------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been sandboxed to 'sand'. It is going to auto-restart and re-form.
(1 row)

If you query the NODES system table from the main cluster, you can see that the nodes of sc_02 have a status of UNKNOWN and are listed as member of the sand sandbox:

=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
      node_name       |  subcluster_name   | node_state | sandbox
----------------------+--------------------+------------+---------
 v_verticadb_node0001 | default_subcluster | UP         |
 v_verticadb_node0002 | default_subcluster | UP         |
 v_verticadb_node0003 | default_subcluster | UP         |
 v_verticadb_node0004 | sc_02              | UNKNOWN    | sand
 v_verticadb_node0005 | sc_02              | UNKNOWN    | sand
 v_verticadb_node0006 | sc_02              | UNKNOWN    | sand
(6 rows)

When you issue the same query on one of the sandboxed nodes, the table shows that the sandboxed nodes are UP and the nodes from the main cluster are UNKNOWN, confirming that the cluster is successfully sandboxed:

=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
      node_name       |  subcluster_name   | node_state | sandbox
----------------------+--------------------+------------+---------
 v_verticadb_node0001 | default_subcluster | UNKNOWN    |
 v_verticadb_node0002 | default_subcluster | UNKNOWN    |
 v_verticadb_node0003 | default_subcluster | UNKNOWN    |
 v_verticadb_node0004 | sc_02              | UP         | sand
 v_verticadb_node0005 | sc_02              | UP         | sand
 v_verticadb_node0006 | sc_02              | UP         | sand
(6 rows)

You can now perform standard database operations in either cluster without impacting the other cluster. For instance, if you create a machine learning dataset named train_data in the sandboxed subcluster, the new table does not propagate to the main cluster:

--In the sandboxed subcluster
=> CREATE TABLE train_data(time timestamp, Temperature float);
CREATE TABLE

=> COPY train_data FROM LOCAL 'daily-min-temperatures.csv' DELIMITER ',';
 Rows Loaded
-------------
    3650
(1 row)

=> SELECT * FROM train_data LIMIT 5;
        time         | Temperature
---------------------+-------------
 1981-01-27 00:00:00 |        19.4
 1981-02-20 00:00:00 |        15.7
 1981-02-27 00:00:00 |        17.5
 1981-03-04 00:00:00 |          16
 1981-04-24 00:00:00 |        11.5
(5 rows)

--In the main cluster
=> SELECT * FROM train_data LIMIT 5;
ERROR 4566:  Relation "train_data" does not exist

6.22 - SET_DEPOT_ANTI_PIN_POLICY_PARTITION

Assigns the highest depot eviction priority to a partition.

Eon Mode only

Assigns the highest depot eviction priority to a partition. Among other depot-cached objects, objects with an anti-pinning policy are the most susceptible to eviction from the depot. After eviction, the object must be read directly from communal storage the next time it is needed.

If the table has another partition-level eviction policy already set on it, then Vertica combines the policies based on policy type.

If you alter or remove table partitioning, Vertica automatically clears all eviction policies previously set on partitions of that table. The table's eviction policy, if any, is unaffected.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DEPOT_ANTI_PIN_POLICY_PARTITION (
   '[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, 'subcluster' ] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
object-name: Target of this policy.
min-range-value, max-range-value: Minimum and maximum value of partition keys in object-name to anti-pin, where min‑range‑value must be ≤ max‑range‑value. To specify a single partition key, min‑range‑value and max‑range‑value must be equal.
If the new policy's partition key range overlaps the range of an existing partition-level eviction policy, Vertica gives precedence to the new policy, as described in Overlapping Policies.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.

Privileges

Superuser

6.23 - SET_DEPOT_ANTI_PIN_POLICY_PROJECTION

Assigns the highest depot eviction priority to a projection.

Eon Mode only

Assigns the highest depot eviction priority to a projection. Among other depot-cached objects, objects with an anti-pinning policy are the most susceptible to eviction from the depot. After eviction, the object must be read directly from communal storage the next time it is needed.

If the projection has another eviction policy already set on it, the new policy supersedes it.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DEPOT_ANTI_PIN_POLICY_PROJECTION ( '[[database.]schema.]projection' [, 'subcluster' ] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: Target of this policy.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.

Privileges

Superuser

6.24 - SET_DEPOT_ANTI_PIN_POLICY_TABLE

Assigns the highest depot eviction priority to a table.

Eon Mode only

Assigns the highest depot eviction priority to a table. Among other depot-cached objects, objects with an anti-pinning policy are the most susceptible to eviction from the depot. After eviction, the object must be read directly from communal storage the next time it is needed.

If the table has another eviction policy already set on it, the new policy supersedes it.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DEPOT_ANTI_PIN_POLICY_TABLE ( '[[database.]schema.]table' [, 'subcluster' ] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Target of this policy.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.

Privileges

Superuser

6.25 - SET_DEPOT_PIN_POLICY_PARTITION

Pins the specified partitions of a table or projection to a subcluster depot, or all database depots, to reduce exposure to depot eviction.

Eon Mode only

Pins the specified partitions of a table or projection to a subcluster depot, or all database depots, to reduce exposure to depot eviction.

If the table has another partition-level eviction policy already set on it, then Vertica combines the policies based on policy type.

If you alter or remove table partitioning, Vertica automatically clears all eviction policies previously set on partitions of that table. The table's eviction policy, if any, is unaffected.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DEPOT_PIN_POLICY_PARTITION (
   '[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, 'subcluster' ] [, 'download' ] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
object-name: Table or projection to pin. If you specify a projection, it must store the partition keys.
min-range-value, max-range-value: Minimum and maximum value of partition keys in object-name to pin, where min‑range‑value must be ≤ max‑range‑value. To specify a single partition key, min‑range‑value and max‑range‑value must be equal.
If the new policy's partition key range overlaps the range of an existing partition-level eviction policy, Vertica gives precedence to the new policy, as described in Overlapping Policies below.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
download: Boolean, if true, SET_DEPOT_PIN_POLICY_PARTITION immediately queues the specified partitions for download from communal storage.
Default: false

Privileges

Superuser

Overlapping policies

If a new partition pinning policy overlaps the partition key range of an existing eviction policy, Vertica determines how to apply the policy based on the type of the new and existing policies.

Both policies are pinning policies

If both the new and existing policies are pining policies, then Vertica collates the two ranges. For example, if you create two partition pinning policies with key ranges of 1-3 and 2-10, Vertica creates a single policy with a key range of 1-10.

Partition pinning policy overlaps anti-pinning policy

If the new partition pinning policy overlaps an anti-pinning policy, then Vertica issues a warning and informational message that it reassigned the range of overlapping keys from the anti-pinning policy to the new pinning policy.

For example, if you create an anti-partition pinning policy and then a pinning policy with key ranges of 1-10 and 5-20, respectively, Vertica truncates the earlier anti-pinning policy's key range:

policy_type	min_value	max_value
PIN	5	20
ANTI_PIN	1	4

If the new pinning policy's partition range falls inside the range of an older anti-pinning policy, Vertica splits the anti-pinning policy. So, given an existing partition anti-pinning policy with a key range of 1-20, a new partition pinning policy with a key range of 5-10 splits the anti-pinning policy:

policy_type	min_value	max_value
ANTI_PIN	1	4
PIN	5	10
ANTI_PIN	11	20

Precedence of pinning policies

In general, partition management functions that involve two partitioned tables give precedence to the target table's pinning policy, as follows:

COPY_PARTITIONS_TO_TABLE: Partition-level pinning is reliable if the source and target tables have pinning policies on the same partition keys. If the two tables have different pinning policies, then the partition pinning policies of the target table apply.
MOVE_PARTITIONS_TO_TABLE: Partition-level pinning policies of the target table apply.
SWAP_PARTITIONS_BETWEEN_TABLES: Partition-level pinning policies of the target table apply.

For example, the following statement copies partitions from table t1 to table t2:

=> SELECT COPY_PARTITIONS_TO_TABLE('t1', '1', '5', 't2');

In this case, the following logic applies:

If the two tables have different partition pinning policies, then the pinning policy of target table t2 for partition keys 1-5 applies.
If table t2 does not exist, then Vertica creates it from table t1, and copies t1's policy on partition keys 1-5. Subsequently, if you clear the partition pinning policy from either table, it is also cleared from the other.

6.26 - SET_DEPOT_PIN_POLICY_PROJECTION

Pins a projection to a subcluster depot, or all database depots, to reduce its exposure to depot eviction.

Eon Mode only

Pins a projection to a subcluster depot, or all database depots, to reduce its exposure to depot eviction. For details on pinning policies and usage guidelines, see Pinning Policies.

If the projection has another eviction policy already set on it, the new policy supersedes it.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DEPOT_PIN_POLICY_PROJECTION ( '[[database.]schema.]projection' [, 'subcluster' ] [, download ] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: Projection to pin.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
download: Boolean, if true SET_DEPOT_PIN_POLICY_PROJECTION immediately queues the specified projection for download from communal storage.
Default: false

Privileges

Superuser

6.27 - SET_DEPOT_PIN_POLICY_TABLE

Pins a table to a subcluster depot, or all database depots, to reduce its exposure to depot eviction.

Eon Mode only

Pins a table to a subcluster depot, or all database depots, to reduce its exposure to depot eviction.

If the table has another eviction policy already set on it, the new policy supersedes it. After you pin a table to a subcluster depot, you cannot subsequently pin any of its partitions and projections in that depot.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DEPOT_PIN_POLICY_TABLE ( '[[database.]schema.]table' [, 'subcluster' ] [, download ] )

Arguments

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Table to pin.
subcluster: Name of a depot subcluster, default_subcluster to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
download: Boolean, if true, SET_DEPOT_PIN_POLICY_TABLE immediately queues the specified table for download from communal storage.
Default: false

Privileges

Superuser

6.28 - SHUTDOWN_SUBCLUSTER

Shuts down a subcluster.

Eon Mode only

Shuts down a subcluster. This function shuts down the subcluster synchronously, returning when shutdown is complete with the message Subcluster shutdown. If the subcluster is already down, the function returns with no error.

Stopping a subcluster does not warn you if there are active user sessions connected to the subcluster. This behavior is the same as stopping an individual node. Before stopping a subcluster, verify that no users are connected to it.

If you want to drain client connections before shutting down a subcluster, you can gracefully shutdown the subcluster using SHUTDOWN_WITH_DRAIN.

Caution

This function does not test whether the target subcluster is critical (a subcluster whose loss would cause the database to shut down). Using this function to shut down a critical subcluster results in the database shutting down. Always verify that the subcluster you want to shut down is not critical by querying the CRITICAL_SUBCLUSTERS system table before calling this function.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SHUTDOWN_SUBCLUSTER('subcluster-name')

Arguments

subcluster-name: Name of the subcluster to shut down.

Privileges

Superuser

Examples

The following example demonstrates shutting down the subcluster analytics:

=> SELECT subcluster_name, node_name, node_state FROM nodes order by 1,2;
  subcluster_name   |      node_name       | node_state
--------------------+----------------------+------------
 analytics          | v_verticadb_node0004 | UP
 analytics          | v_verticadb_node0005 | UP
 analytics          | v_verticadb_node0006 | UP
 default_subcluster | v_verticadb_node0001 | UP
 default_subcluster | v_verticadb_node0002 | UP
 default_subcluster | v_verticadb_node0003 | UP
(6 rows)

=> SELECT SHUTDOWN_SUBCLUSTER('analytics');
WARNING 4539:  Received no response from v_verticadb_node0004 in stop subcluster
WARNING 4539:  Received no response from v_verticadb_node0005 in stop subcluster
WARNING 4539:  Received no response from v_verticadb_node0006 in stop subcluster
 SHUTDOWN_SUBCLUSTER
---------------------
 Subcluster shutdown
(1 row)

=> SELECT subcluster_name, node_name, node_state FROM nodes order by 1,2;
  subcluster_name   |      node_name       | node_state
--------------------+----------------------+------------
 analytics          | v_verticadb_node0004 | DOWN
 analytics          | v_verticadb_node0005 | DOWN
 analytics          | v_verticadb_node0006 | DOWN
 default_subcluster | v_verticadb_node0001 | UP
 default_subcluster | v_verticadb_node0002 | UP
 default_subcluster | v_verticadb_node0003 | UP
(6 rows)

Note

The "WARNING 4539" messages after calling SHUTDOWN_SUBCLUSTER occur because the nodes are in the process of shutting down. They are expected.

6.29 - SHUTDOWN_WITH_DRAIN

Gracefully shuts down a subcluster or subclusters.

Eon Mode only

Gracefully shuts down a subcluster or subclusters. The function drains client connections on the subcluster's nodes and then shuts down the subcluster. This is synchronous function that returns when the shutdown message has been sent to the subcluster.

Work from existing user sessions continues on draining nodes, but the nodes refuse new client connections and are excluded from load-balancing operations. dbadmin can still connect to draining nodes.

The nodes drain until either the existing connections complete their work and close or the user-specified timeout is reached. When one of these conditions is met, the function proceeds to shut down the subcluster.

For more information about the graceful shutdown process, see Graceful Shutdown.

Caution

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SHUTDOWN_WITH_DRAIN( 'subcluster-name', timeout-seconds )

Arguments

subcluster-name

Name of the subcluster to shutdown. Enter an empty string to shutdown all subclusters in a database.

timeout-seconds

Number of seconds to wait before forcefully closing subcluster-name's client connections and shutting down. The behavior depends on the sign of timeout-seconds:

Positive integer: The function waits until either the runtime reaches timeout-seconds or the client connections finish their work and close. As soon as one of these conditions is met, the function immediately proceeds to shut down the subcluster.
Zero: The function immediately closes any open client connections and shuts down the subcluster.
Negative integer: The function marks the subcluster as draining and waits indefinitely to shut down the subcluster until all active user sessions disconnect.

Privileges

Superuser

Examples

In the following example, the function marks the subcluster named analytics as draining and then shuts it down as soon as either the existing client connections close or 300 seconds pass:

=> SELECT SHUTDOWN_WITH_DRAIN('analytics', 120);
NOTICE 0:  Draining has started on subcluster (analytics)
NOTICE 0:  Begin shutdown of subcluster (analytics)
                              SHUTDOWN_WITH_DRAIN
--------------------------------------------------------------------------------------------------------------------
Set subcluster (analytics) to draining state
Waited for 3 nodes to drain
Shutdown message sent to subcluster (analytics)

(1 row)

You can query the DC_DRAINING_EVENTS table to see more information about draining and shutdown events, such as whether any user sessions were forcibly closed. This subcluster had one active user session when the shutdown began, but it closed before the timeout was reached:

=> SELECT event_type, event_type_name, event_description, event_result, event_result_name FROM dc_draining_events;
 event_type |       event_type_name        |                          event_description                          | event_result | event_result_name
------------+------------------------------+---------------------------------------------------------------------+--------------+-------------------
          0 | START_DRAIN_SUBCLUSTER       | START_DRAIN for SHUTDOWN of subcluster (analytics)                  |            0 | SUCCESS
          2 | START_WAIT_FOR_NODE_DRAIN    | Wait timeout is 120 seconds                                         |            4 | INFORMATIONAL
          4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 0 seconds                                   |            4 | INFORMATIONAL
          4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 30 seconds                                  |            4 | INFORMATIONAL
          3 | END_WAIT_FOR_NODE_DRAIN      | Wait for drain ended with 0 sessions remaining                      |            0 | SUCCESS
          5 | BEGIN_SHUTDOWN_AFTER_DRAIN   | Starting shutdown of subcluster (analytics) following drain         |            4 | INFORMATIONAL
(6 rows)

6.30 - START_DRAIN_SUBCLUSTER

Drains a subcluster or subclusters.

Eon Mode only

Drains a subcluster or subclusters. The function marks all nodes in the designated subcluster as draining. Work from existing user sessions continues on draining nodes, but the nodes refuse new client connections and are excluded from load balancing operations. dbadmin can still connect to draining nodes.

To drain connections on a subcluster as part of a graceful shutdown process, you can call SHUTDOWN_WITH_DRAIN. For details, see Graceful Shutdown.

To cancel a draining operation on a subcluster, call CANCEL_DRAIN_SUBCLUSTER. If all draining nodes in a subcluster are stopped, they are marked as not draining upon restart.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

START_DRAIN_SUBCLUSTER( 'subcluster-name' )

Arguments

subcluster-name: Name of the subcluster to drain. Enter an empty string to drain all subclusters in the database.

Privileges

Superuser

Examples

The following example demonstrates how to drain a subcluster named analytics:

=> SELECT subcluster_name, node_name, node_state FROM nodes;
subcluster_name    |      node_name     | node_state
-------------------+--------------------+------------
default_subcluster | verticadb_node0001 | UP
default_subcluster | verticadb_node0002 | UP
default_subcluster | verticadb_node0003 | UP
analytics          | verticadb_node0004 | UP
analytics          | verticadb_node0005 | UP
analytics          | verticadb_node0006 | UP
(6 rows)

=> SELECT START_DRAIN_SUBCLUSTER('analytics');
               START_DRAIN_SUBCLUSTER
-------------------------------------------------------
Targeted subcluster: 'analytics'
Action: START DRAIN
(1 row)

You can confirm that the subcluster is draining by querying the DRAINING_STATUS system table:


=> SELECT node_name, subcluster_name, is_draining FROM draining_status ORDER BY 1;
node_name          | subcluster_name    | is_draining
-------------------+--------------------+-------
verticadb_node0001 | default_subcluster | f
verticadb_node0002 | default_subcluster | f
verticadb_node0003 | default_subcluster | f
verticadb_node0004 | analytics          | t
verticadb_node0005 | analytics          | t
verticadb_node0006 | analytics          | t

6.31 - START_REAPING_FILES

Starts the disk file deletion in the background as an asynchronous function.

Eon Mode only

Starts the disk file deletion in the background as an asynchronous function. By default, this meta-function syncs the catalog before beginning deletion. Disk file deletion is handled in the foreground by FLUSH_REAPER_QUEUE.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

START_REAPING_FILES( [sync-catalog] )

Parameters

*sync-catalog*

Specifies to sync metadata in the database catalog on all nodes before the function executes:

true (default): Sync the database catalog
false: Run without syncing.

Privileges

Superuser

Examples

Start the reaper service:

=> SELECT START_REAPING_FILES();

Start the reaper service and skip the initial catalog sync:

=> SELECT START_REAPING_FILES(false);

6.32 - SYNC_CATALOG

Synchronizes the catalog to communal storage to enable reviving the current catalog version in the case of an imminent crash.

Eon Mode only

Synchronizes the catalog to communal storage to enable reviving the current catalog version in the case of an imminent crash. Vertica synchronizes all pending checkpoint and transaction logs to communal storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SYNC_CATALOG( [ 'node-name' ] )

Parameters

node-name: The node to synchronize. If you omit this argument, Vertica synchronizes the catalog on all nodes.

Privileges

Superuser

Examples

Synchronize the catalog on all nodes:

=> SELECT SYNC_CATALOG();

Synchronize the catalog on one node:

=> SELECT SYNC_CATALOG( 'node001' );

6.33 - UNSANDBOX_SUBCLUSTER

Removes a subcluster from a sandbox.

Note

Vertica recommends using the admintools unsandbox_subcluster command to remove the sandbox's primary subcluster. This command automatically stops the sandboxed nodes, wipes the node's catalog subdirectories, and restarts the nodes. If you use the UNSANDBOX_SUBCLUSTER function, these steps must be completed manually.

After stopping the nodes in the sandboxed subcluster, you must run this function in the main cluster from which the sandboxed subcluster was spun-off. The function changes the metadata in the main cluster that designates the specified subcluster as sandboxed, but does not restart the subcluster and rejoin it to the main cluster.

If you are unsandboxing a secondary subcluster from the sandbox, Vertica recommends that you also call the UNSANDBOX_SUBCLUSTER function in the sandbox cluster. This makes sure that both clusters are aware of the state of the subcluster and that relevant system tables accurately reflect the subcluster's status.

To rejoin the subcluster to the main cluster and return the nodes to their normal state, you must complete the following tasks:

Wipe the catalog subdirectory from the sandboxed nodes. The main cluster provides the current catalog information on node restart.
Restart the nodes. On successful restart, the nodes should rejoin the main cluster.
If unsandboxing the last subcluster in a sandbox, remove the sandbox metadata prefix from the shared communal storage location. This helps avoid problems that might arise form reusing the same sandbox name.

Note

If you upgraded the Vertica version of the sandboxed subcluster, you must downgrade the version of the subcluster before rejoining it to the main cluster.

If there are no more active sandboxes, you can run CLEAN_COMMUNAL_STORAGE to remove any data created in the sandbox. The main cluster can also resume processing data queued for deletion.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

UNSANDBOX_SUBCLUSTER( 'subcluster-name', 'options' )

Arguments

subcluster-name: Identifies the subcluster to unsandbox. This must be a currently sandboxed subcluster.
options: Currently, there are no options for this function.

Privileges

Superuser

Examples

In the following example, the function unsandboxes the sc_02 secondary subcluster from the sand sandbox. After stopping the nodes in the subcluster, you can unsandbox the subcluster by calling the UNSANDBOX_SUBCLUSTER function from the main cluster:

=> SELECT UNSANDBOX_SUBCLUSTER('sc_02', '');
                                          UNSANDBOX_SUBCLUSTER
---------------------------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been unsandboxed. If wiped out and restarted, it should be able to rejoin the cluster.
(1 row)

To rejoin the nodes to the main cluster, you must wipe the local catalog from each of the previously sandboxed nodes—whose catalog location can be found by querying NODES—and then restart the nodes:


$ rm -rf paths-to-node-catalogs

$ admintools -t restart_node -s list-of-nodes -p password

After the nodes restart, you can query the NODES system table to confirm that the previously sandboxed nodes are UP and are no longer a member of sand:

=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
      node_name       |  subcluster_name   | node_state | sandbox
----------------------+--------------------+------------+---------
 v_verticadb_node0001 | default_subcluster | UP         |
 v_verticadb_node0002 | default_subcluster | UP         |
 v_verticadb_node0003 | default_subcluster | UP         |
 v_verticadb_node0004 | sc_01              | UNKNOWN    | sand
 v_verticadb_node0005 | sc_01              | UNKNOWN    | sand
 v_verticadb_node0006 | sc_01              | UNKNOWN    | sand
 v_verticadb_node0007 | sc_02              | UP         | 
 v_verticadb_node0008 | sc_02              | UP         | 
 v_verticadb_node0009 | sc_02              | UP         | 
(9 rows)

Because sc_02 was a secondary subcluster in the sandbox, you should also call the UNSANDBOX_SUBCLUSTER function in the sandbox cluster. This makes sure that both clusters are aware of the state of the subcluster and that relevant system tables accurately reflect the subcluster's status:

=> SELECT UNSANDBOX_SUBCLUSTER('sc_02', '');
                                               UNSANDBOX_SUBCLUSTER                                                
-------------------------------------------------------------------------------------------------------------------
 Subcluster 'sc_02' has been unsandboxed from 'sand'. This command should be executed in the main cluster as well.
(1 row)

If there are no more active sandboxes, you can run the CLEAN_COMMUNAL_STORAGE function to remove any data created in the sandbox. You should also remove the sandbox's metadata from the shared communal storage location, which can be found at /path-to-communal-storage/metadata/sandbox_name.

The following example removes the sandbox's metadata from an S3 bucket and then calls CLEAN_COMMUNAL_STORAGE to cleanup any data from the sandbox:


$ aws s3 rm /path-to-communal/metadata/sandbox_name

=> SELECT CLEAN_COMMUNAL_STORAGE('true');
                     CLEAN_COMMUNAL_STORAGE
-----------------------------------------------------------------
 CLEAN COMMUNAL STORAGE
Total leaked files: 143
Files have been queued for deletion.
Check communal_cleanup_records for more information.

(1 row)

7 - Epoch functions

This section contains the epoch management functions specific to Vertica.

7.1 - ADVANCE_EPOCH

Manually closes the current epoch and begins a new epoch.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ADVANCE_EPOCH ( [ integer ] )

Parameters

integer: Specifies the number of epochs to advance.

Privileges

Superuser

Notes

This function is primarily maintained for backward compatibility with earlier versions of Vertica.

Examples

The following command increments the epoch number by 1:

=> SELECT ADVANCE_EPOCH(1);

7.2 - GET_AHM_EPOCH

Returns the number of the in which the is located.

Returns the number of the epoch in which the Ancient History Mark is located. Data deleted up to and including the AHM epoch can be purged from physical storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_AHM_EPOCH()

Note

The AHM epoch is 0 (zero) by default (purge is disabled).

Privileges

None

Examples

=> SELECT GET_AHM_EPOCH();
    GET_AHM_EPOCH
----------------------
 Current AHM epoch: 0
(1 row)

7.3 - GET_AHM_TIME

Returns a TIMESTAMP value representing the.

Returns a TIMESTAMP value representing the Ancient History Mark. Data deleted up to and including the AHM epoch can be purged from physical storage.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_AHM_TIME()

Privileges

None

Examples

=> SELECT GET_AHM_TIME();
                  GET_AHM_TIME
-------------------------------------------------
 Current AHM Time: 2010-05-13 12:48:10.532332-04
(1 row)

7.4 - GET_CURRENT_EPOCH

Returns the number of the current epoch.

The epoch into which data (COPY, INSERT, UPDATE, and DELETE operations) is currently being written.

Returns the number of the current epoch.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_CURRENT_EPOCH()

Privileges

None

Examples

=> SELECT GET_CURRENT_EPOCH();
 GET_CURRENT_EPOCH
-------------------
               683
(1 row)

7.5 - GET_LAST_GOOD_EPOCH

Returns the number.

Returns the last good epoch number. If the database has no projections, the function returns an error.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_LAST_GOOD_EPOCH()

Privileges

None

Examples

=> SELECT GET_LAST_GOOD_EPOCH();
 GET_LAST_GOOD_EPOCH
---------------------
                 682
(1 row)

7.6 - MAKE_AHM_NOW

Sets the (AHM) to the greatest allowable value.

Sets the Ancient History Mark (AHM) to the greatest allowable value. This lets you purge all deleted data.

Caution

After running this function, you cannot query historical data that precedes the current epoch. Only database administrators should use this function.

MAKE_AHM_NOW performs the following operations:

Advances the epoch.
Sets the AHM to the last good epoch (LGE) — at least to the epoch that is current when you execute MAKE_AHM_NOW.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

MAKE_AHM_NOW ( [ true ] )

Parameters

true

Allows AHM to advance when one of the following conditions is true:

One or more nodes are down.
One projection is being refreshed from another (retentive refresh).

In both cases , you must supply this argument to MAKE_AHM_NOW, otherwise Vertica returns an error. If you execute MAKE_AHM_NOW(true) during retentive refresh, Vertica rolls back the refresh operation and advances the AHM.

Caution

If the function advances AHM beyond the last good epoch of the down nodes, those nodes must recover all data from scratch.

Privileges

Superuser

Setting AHM when nodes are down

If any node in the cluster is down, you must call MAKE_AHM_NOW with an argument of true; otherwise, the function returns an error.

Note

This requirement applies only to Enterprise mode; in Eon mode, it is ignored.

In the following example, MAKE_AHM_NOW advances the AHM even though a node is down:

=> SELECT MAKE_AHM_NOW(true);
WARNING:  Received no response from v_vmartdb_node0002 in get cluster LGE
WARNING:  Received no response from v_vmartdb_node0002 in get cluster LGE
WARNING:  Received no response from v_vmartdb_node0002 in set AHM
         MAKE_AHM_NOW
------------------------------
 AHM set (New AHM Epoch: 684)
(1 row)

7.7 - SET_AHM_EPOCH

Sets the (AHM) to the specified epoch.

Sets the Ancient History Mark (AHM) to the specified epoch. This function allows deleted data up to and including the AHM epoch to be purged from physical storage.

SET_AHM_EPOCH is normally used for testing purposes. Instead, consider using SET_AHM_TIME which is easier to use.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_AHM_EPOCH ( epoch, [ true ] )

Parameters

epoch

Specifies one of the following:

The number of the epoch in which to set the AHM
Zero (0) (the default) disables PURGE

Important

The number of the specified epoch must be:

Greater than the current AHM epoch
Less than the current epoch

Query the SYSTEM table to view current epoch values relative to the AHM.

true

Allows the AHM to advance when nodes are down.

Caution

If you advance AHM beyond the last good epoch of the down nodes, those nodes must recover all data from scratch.

Privileges

Superuser

Setting AHM when nodes are down

If any node in the cluster is down, you must call SET_AHM_EPOCH with an argument of true; otherwise, the function returns an error.

Note

This requirement applies only to Enterprise mode; in Eon mode, it is ignored.

Examples

The following command sets the AHM to a specified epoch of 12:

=> SELECT SET_AHM_EPOCH(12);

The following command sets the AHM to a specified epoch of 2 and allows the AHM to advance despite a failed node:

=> SELECT SET_AHM_EPOCH(2, true);

7.8 - SET_AHM_TIME

Sets the (AHM) to the epoch corresponding to the specified time on the initiator node.

Sets the Ancient History Mark (AHM) to the epoch corresponding to the specified time on the initiator node. This function allows historical data up to and including the AHM epoch to be purged from physical storage. SET_AHM_TIME returns a TIMESTAMPTZ that represents the end point of the AHM epoch.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_AHM_TIME ( time, [ true ] )

Parameters

time: A TIMESTAMP/TIMESTAMPTZ value that is automatically converted to the appropriate epoch number.
true: Allows the AHM to advance when nodes are down.

Caution
If you advance AHM beyond the last good epoch of the down nodes, those nodes must recover all data from scratch.

Privileges

Superuser

Setting AHM when nodes are down

If any node in the cluster is down, you must call SET_AHM_TIME with an argument of true; otherwise, the function returns an error.

Note

This requirement applies only to Enterprise mode; in Eon mode, it is ignored.

Examples

Epochs depend on a configured epoch advancement interval. If an epoch includes a three-minute range of time, the purge operation is accurate only to within minus three minutes of the specified timestamp:

=> SELECT SET_AHM_TIME('2008-02-27 18:13');
           set_ahm_time
------------------------------------
 AHM set to '2008-02-27 18:11:50-05'
(1 row)

Note

The –05 part of the output string is a time zone value, an offset in hours from UTC (Universal Coordinated Time, traditionally known as Greenwich Mean Time, or GMT).

In the previous example, the actual AHM epoch ends at 18:11:50, roughly one minute before the specified timestamp. This is because SET_AHM_TIME selects the epoch that ends at or before the specified timestamp. It does not select the epoch that ends after the specified timestamp because that would purge data deleted as much as three minutes after the AHM.

For example, using only hours and minutes, suppose that epoch 9000 runs from 08:50 to 11:50 and epoch 9001 runs from 11:50 to 15:50. SET_AHM_TIME('11:51') chooses epoch 9000 because it ends roughly one minute before the specified timestamp.

In the next example, suppose that a node went down at 11:00:00 AM on January 1st 2017. At noon, you want to advance the AHM to 11:15:00, but the node is still down.

Suppose you try to set the AHM using this command:

=> SELECT SET_AHM_TIME('2017-01-01 11:15:00');

Then you will receive an error message. Vertica prevents you from moving the AHM past the point where a node went down. Vertica returns this error to prevent the AHM from advancing past the down node's last good epoch. You can force the AHM to advance by supplying the optional second parameter:

=> SELECT SET_AHM_TIME('2017-01-01 11:15:00', true);

However, if you force the AHM past the last good epoch, the failed node will have to recover from scratch.

8 - LDAP link functions

This section contains the functions associated with the Vertica LDAP Link service.

8.1 - LDAP_LINK_DRYRUN_CONNECT

Takes a set of LDAP Link connection parameters as arguments and begins a dry run connection between the LDAP server and Vertica.

By providing an empty string for the LDAPLinkBindPswd argument, you can also perform an anonymous bind if your LDAP server allows unauthenticated binds.

The dryrun and LDAP_LINK_SYNC_START functions must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:

=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
    node_name     | dbclerk
------------------+---------
 v_vmart_node0001 | t
(1 row)

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

LDAP_LINK_DRYRUN_CONNECT (
    'LDAPLinkURL',
    'LDAPLinkBindDN',
    'LDAPLinkBindPswd'
)

Privileges

Superuser

Examples

This tests the connection to an LDAP server at ldap://example.dc.com with the DN CN=amir,OU=QA,DC=dc,DC=com.

=> SELECT LDAP_LINK_DRYRUN_CONNECT('ldap://example.dc.com','CN=amir,OU=QA,DC=dc,DC=com','password');

                ldap_link_dryrun_connect
---------------------------------------------------------------------------------
Dry Run Connect Completed. Query v_monitor.ldap_link_dryrun_events for results.

To check the results of the bind, query the system table LDAP_LINK_DRYRUN_EVENTS.

=> SELECT event_timestamp, event_type, entry_name, role_name, link_scope, search_base from LDAP_LINK_DRYRUN_EVENTS;
        event_timestamp       |       event_type      |      entry_name      | link_scope | search_base
------------------------------+-----------------------+----------------------+------------+-------------
2019-12-09 15:41:43.589398-05 | BIND_STARTED          | -------------------- | ---------- | -----------
2019-12-09 15:41:43.590504-05 | BIND_FINISHED         | -------------------- | ---------- | -----------

8.2 - LDAP_LINK_DRYRUN_SEARCH

Takes a set of LDAP Link connection and search parameters as arguments and begins a dry run search for users and groups that would get imported from the LDAP server.

By providing an empty string for the LDAPLinkBindPswd argument, you can also perform an anonymous search if your LDAP server's Access Control List (ACL) is configured to allow unauthenticated searches. The settings for allowing anonymous binds are different from the ACL settings for allowing anonymous searches.

The dryrun and LDAP_LINK_SYNC_START functions must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:

=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
    node_name     | dbclerk
------------------+---------
 v_vmart_node0001 | t
(1 row)

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

LDAP_LINK_DRYRUN_SEARCH (
    'LDAPLinkURL',
    'LDAPLinkBindDN',
    'LDAPLinkBindPswd',
    'LDAPLinkSearchBase',
    'LDAPLinkScope',
    'LDAPLinkFilterUser',
    'LDAPLinkFilterGroup',
    'LDAPLinkUserName',
    'LDAPLinkGroupName',
    'LDAPLinkGroupMembers',
    [LDAPLinkSearchTimeout],
    ['LDAPLinkJoinAttr']
)

Privileges

Superuser

Examples

This searches for users and groups in the LDAP server. In this case, the LDAPLinkSearchBase parameter specifies the dc.com domain and a sub scope, which replicates the entire subtree under the DN.

To further filter results, the function checks for users and groups with the person and group objectClass attributes. It then searches the group attribute cn, identifying members of that group with the member attribute, and then identifying those individual users with the attribute uid.

=> SELECT LDAP_LINK_DRYRUN_SEARCH('ldap://example.dc.com','CN=amir,OU=QA,DC=dc,DC=com','$vertica$','dc=DC,dc=com','sub',
'(objectClass=person)','(objectClass=group)','uid','cn','member',10,'dn');

                ldap_link_dryrun_search
--------------------------------------------------------------------------------
Dry Run Search Completed. Query v_monitor.ldap_link_dryrun_events for results.

To check the results of the search, query the system table LDAP_LINK_DRYRUN_EVENTS.

=> SELECT event_timestamp, event_type, entry_name, ldapurihash, link_scope, search_base from LDAP_LINK_DRYRUN_EVENTS;
        event_timestamp          |    event_type    |       entry_name       | ldapurihash | link_scope | search_base
---------------------------------+------------------+------------------------+-------------+------------+--------------
2020-01-03 21:03:26.411753+05:30 | BIND_STARTED     | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:26.422188+05:30 | BIND_FINISHED    | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:26.422223+05:30 | SYNC_STARTED     | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:26.422229+05:30 | SEARCH_STARTED   | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.043107+05:30 | LDAP_GROUP_FOUND | Account Operators      |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.04312+05:30  | LDAP_GROUP_FOUND | Administrators         |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.043182+05:30 | LDAP_USER_FOUND  | user1                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.043186+05:30 | LDAP_USER_FOUND  | user2                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:03:32.04319+05:30  | SEARCH_FINISHED  | **********             |           0 | sub        | dc=DC,dc=com

8.3 - LDAP_LINK_DRYRUN_SYNC

Takes a set of LDAP Link connection and search parameters as arguments and begins a dry run synchronization between the database and the LDAP server, which maps and synchronizes the LDAP server's users and groups with their equivalents in Vertica.

The dryrun and LDAP_LINK_SYNC_START functions must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:

=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
    node_name     | dbclerk
------------------+---------
 v_vmart_node0001 | t
(1 row)

You can view the results of the dry run in the system table LDAP_LINK_DRYRUN_EVENTS.

To cancel an in-progress synchronization, use LDAP_LINK_SYNC_CANCEL.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

LDAP_LINK_DRYRUN_SYNC (
    'LDAPLinkURL',
    'LDAPLinkBindDN',
    'LDAPLinkBindPswd',
    'LDAPLinkSearchBase',
    'LDAPLinkScope',
    'LDAPLinkFilterUser',
    'LDAPLinkFilterGroup',
    'LDAPLinkUserName',
    'LDAPLinkGroupName',
    'LDAPLinkGroupMembers',
    [LDAPLinkSearchTimeout],
    ['LDAPLinkJoinAttr']
)

Privileges

Superuser

Examples

To perform a dry run to map the users and groups returned from LDAP_LINK_DRYRUN_SEARCH, pass the same parameters as arguments to LDAP_LINK_DRYRUN_SYNC.

=> SELECT LDAP_LINK_DRYRUN_SYNC('ldap://example.dc.com','CN=amir,OU=QA,DC=dc,DC=com','$vertica$','dc=DC,dc=com','sub',
'(objectClass=person)','(objectClass=group)','uid','cn','member',10,'dn');

                          LDAP_LINK_DRYRUN_SYNC
------------------------------------------------------------------------------------------
Dry Run Connect and Sync Completed. Query v_monitor.ldap_link_dryrun_events for results.

To check the results of the sync, query the system table LDAP_LINK_DRYRUN_EVENTS.

=> SELECT event_timestamp, event_type, entry_name, ldapurihash, link_scope, search_base from LDAP_LINK_DRYRUN_EVENTS;
        event_timestamp          |     event_type      |       entry_name       | ldapurihash | link_scope | search_base
---------------------------------+---------------------+------------------------+-------------+------------+--------------
2020-01-03 21:08:30.883783+05:30 | BIND_STARTED        | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:30.890574+05:30 | BIND_FINISHED       | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:30.890602+05:30 | SYNC_STARTED        | ---------------------- |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:30.890605+05:30 | SEARCH_STARTED      | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939369+05:30 | LDAP_GROUP_FOUND    | Account Operators      |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939395+05:30 | LDAP_GROUP_FOUND    | Administrators         |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939461+05:30 | LDAP_USER_FOUND     | user1                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939463+05:30 | LDAP_USER_FOUND     | user2                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939468+05:30 | SEARCH_FINISHED     | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939718+05:30 | PROCESSING_STARTED  | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939887+05:30 | USER_CREATED        | user1                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939895+05:30 | USER_CREATED        | user2                  |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939949+05:30 | ROLE_CREATED        | Account Operators      |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.939959+05:30 | ROLE_CREATED        | Administrators         |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.940603+05:30 | PROCESSING_FINISHED | **********             |           0 | sub        | dc=DC,dc=com
2020-01-03 21:08:31.940613+05:30 | SYNC_FINISHED       | ---------------------- |           0 | sub        | dc=DC,dc=com

8.4 - LDAP_LINK_SYNC_CANCEL

Cancels in-progress LDAP Link synchronizations (including those started by LDAP_LINK_DRYRUN_SYNC) between the LDAP server and Vertica.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ldap_link_sync_cancel()

Privileges

Superuser

Examples

=> SELECT ldap_link_sync_cancel();

8.5 - LDAP_LINK_SYNC_START

Begins the synchronization between the LDAP server and Vertica immediately rather than waiting for the interval set in LDAPLinkInterval.

The dryrun and LDAP_LINK_SYNC_START functions must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:

=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
    node_name     | dbclerk
------------------+---------
 v_vmart_node0001 | t
(1 row)

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ldap_link_sync_start()

Privileges

Superuser

Examples

=> SELECT ldap_link_sync_start();

9 - License functions

This section contains functions that monitor Vertica license status and compliance.

9.1 - AUDIT

Returns the raw data size (in bytes) of a database, schema, or table as it is counted in an audit of the database size.

Returns the raw data size (in bytes) of a database, schema, or table as it is counted in an audit of the database size. Unless you specify zero error tolerance and 100 percent confidence level, AUDIT returns only approximate results that can vary over multiple iterations.

Important

The data size returned by AUDIT should not be compared with the compressed data size of objects reported in the USED_BYTES column of system tables like STORAGE_CONTAINERS and PROJECTION_STORAGE.

AUDIT estimates the size for data in Vertica tables using the same data sampling method that Vertica uses to determine if a database complies with the licensed database size allowance. Vertica does not use these results to determine whether the size of the database complies with the Vertica license's data allowance. For details, see Auditing database size.

For data stored in external tables based on ORC or Parquet format, AUDIT uses the total size of the data files. This value is never estimated—it is read from the file system storing the ORC or Parquet files (either the Vertica node's local file system, S3, or HDFS).

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

AUDIT('[[[database.]schema.]scope ]'[, 'granularity'] [, error-tolerance[, confidence-level]] )

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

scope

Specifies the extent of the audit:

Empty string ('') audits the entire database.
The name of the schema or table to audit.

The schema or table to audit. To audit the database, set this parameter to an empty string.

granularity

The level at which the audit reports its results, one of the following strings:

database
schema
table

The level of granularity must be equal to or less than the granularity of scope. If you omit this parameter, granularity is set to the same level as scope. Thus, if online_sales is a schema, the following statements are identical:

AUDIT('online_sales', 'schema');
AUDIT('online_sales');

If AUDIT sets granularity to a level lower than the target object, it returns with a message that refers you to system table USER_AUDITS. For details, see Querying V_CATALOG.USER_AUDITS, below.

error-tolerance

Specifies the percentage margin of error allowed in the audit estimate. Enter the tolerance value as a decimal number, between 0 and 100. The default value is 5, for a 5% margin of error.

This argument has no effect on audits of external tables based on ORC or Parquet files. Audits of these tables always returns the actual size of the underlying data files.

Setting this value to 0 results in a full database audit, which is very resource intensive, as AUDIT analyzes the entire database. A full database audit significantly impacts performance, so Vertica does not recommend it for a production database.

Caution

Due to the iterative sampling that the auditing process uses, setting the error tolerance to a small fraction of a percent (for example, 0.00001) can cause AUDIT to run for a longer period than a full database audit. The lower you specify this value, the more resources the audit uses, as it performs more data sampling.

confidence-level

Specifies the statistical confidence level percentage of the estimate. Enter the confidence value as a decimal number, between 0 and 100. The default value is 99, indicating a confidence level of 99%.

This argument has no effect on audits of external tables based on ORC or Parquet files. Audits of these tables always returns the actual size of the underlying data files.

The higher the confidence value, the more resources the function uses, as it performs more data sampling. Setting this value to 100 results in a full audit of the database, which is very resource intensive, as the function analyzes all of the database. A full database audit significantly impacts performance, so Vertica does not recommend it for a production database.

Privileges

Superuser, or the following privileges:

SELECT privilege on the target tables
USAGE privilege on the target schemas

Note

If you audit a schema or the database, Vertica only returns the size of all objects that you have privileges to access within the audited object, as described above.

Querying V_CATALOG.USER_AUDITS

If AUDIT sets granularity to a level lower than the target object, it returns with a message that refers you to system table USER_AUDITS. To obtain audit data on objects of the specified granularity, query this table. For example, the following query seeks to audit all tables in the store schema:

=> SELECT AUDIT('store', 'table');
                           AUDIT
-----------------------------------------------------------
 See table sizes in v_catalog.user_audits for schema store
(1 row)

The next query queries USER_AUDITS and obtains the latest audits on those tables:

=> SELECT object_name, AVG(size_bytes)::int size_bytes, MAX(audit_start_timestamp::date) audit_start
      FROM user_audits WHERE object_schema='store'
      GROUP BY rollup(object_name) HAVING GROUPING_ID(object_name) < 1 ORDER BY GROUPING_ID();
    object_name    | size_bytes | audit_start
-------------------+------------+-------------
 store_dimension   |      22067 | 2017-10-26
 store_orders_fact |   27201312 | 2017-10-26
 store_sales_fact  |  301260170 | 2017-10-26
(3 rows)

Examples

See Auditing database size.

9.2 - AUDIT_FLEX

Returns the estimated ROS size of raw columns, equivalent to the export size of the flex data in the audited objects.

Returns the estimated ROS size of __raw__ columns, equivalent to the export size of the flex data in the audited objects. You can audit all flex data in the database, or narrow the audit scope to a specific flex table, projection, or schema. Vertica stores the audit results in system table USER_AUDITS.

The audit excludes the following:

Flex keys
Other columns in the audited tables.
Temporary flex tables

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

AUDIT_FLEX ('[scope]')

Parameters

scope

Specifies the extent of the audit:

Empty string ('') audits all flexible tables in the database.
The name of a schema, projection, or flex table.

Privileges

Superuser, or the following privileges:

SELECT privilege on the target tables
USAGE privilege on the target schemas

Note

If you audit a schema or the database, Vertica only returns the size of all objects that you have privileges to access within the audited object, as described above.

Examples

Audit all flex tables in the current database:

dbs=> select audit_flex('');
 audit_flex
------------
 8567679
(1 row)

Audit the flex tables in schema public:


dbs=> select audit_flex('public');
audit_flex
------------
8567679
(1 row)

Audit the flex data in projection bakery_b0:

dbs=> select audit_flex('bakery_b0');
 audit_flex
------------
 8566723
(1 row)

Audit flex table bakery:

dbs=> select audit_flex('bakery');
 audit_flex
------------
 8566723
(1 row)

To report the results of all audits saved in the USER_AUDITS, the following shows part of an extended display from the system table showing an audit run on a schema called test, and the entire database, dbs:

dbs=> \x
Expanded display is on.

dbs=> select * from user_audits;
-[ RECORD 1 ]-------------------------+------------------------------
size_bytes                            | 0
user_id                               | 45035996273704962
user_name                             | release
object_id                             | 45035996273736664
object_type                           | SCHEMA
object_schema                         |
object_name                           | test
audit_start_timestamp                 | 2014-02-04 14:52:15.126592-05
audit_end_timestamp                   | 2014-02-04 14:52:15.139475-05
confidence_level_percent              | 99
error_tolerance_percent               | 5
used_sampling                         | f
confidence_interval_lower_bound_bytes | 0
confidence_interval_upper_bound_bytes | 0
sample_count                          | 0
cell_count                            | 0
-[ RECORD 2 ]-------------------------+------------------------------
size_bytes                            | 38051
user_id                               | 45035996273704962
user_name                             | release
object_id                             | 45035996273704974
object_type                           | DATABASE
object_schema                         |
object_name                           | dbs
audit_start_timestamp                 | 2014-02-05 13:44:41.11926-05
audit_end_timestamp                   | 2014-02-05 13:44:41.227035-05
confidence_level_percent              | 99
error_tolerance_percent               | 5
used_sampling                         | f
confidence_interval_lower_bound_bytes | 38051
confidence_interval_upper_bound_bytes | 38051
sample_count                          | 0
cell_count                            | 0
-[ RECORD 3 ]-------------------------+------------------------------
...

9.3 - AUDIT_LICENSE_SIZE

Triggers an immediate audit of the database size to determine if it is in compliance with the raw data storage allowance included in your Vertica licenses.

If you use ORC or Parquet data stored in HDFS, results are only accurate if you run this function as a user who has access to all HDFS data. Either run the query with a principal that has read access to all such data, or use a Hadoop delegation token that grants this access. For more information about using delegation tokens, see Accessing kerberized HDFS data.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

AUDIT_LICENSE_SIZE()

Privileges

Superuser

Examples

=> SELECT audit_license_size();
 audit_license_size
--------------------
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization  : 0%
Audit Time   : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.

License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
(1 row)

9.4 - AUDIT_LICENSE_TERM

Triggers an immediate audit to determine if the Vertica license has expired.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

AUDIT_LICENSE_TERM()

Privileges

Superuser

Examples

=> SELECT audit_license_term();
 audit_license_term
--------------------
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization  : 0%
Audit Time   : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.

License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
(1 row)

9.5 - DISPLAY_LICENSE

Returns the terms of your Vertica license.

Returns the terms of your Vertica license. The information this function displays is:

The start and end dates for which the license is valid (or "Perpetual" if the license has no expiration).
The number of days you are allowed to use Vertica after your license term expires (the grace period)
The amount of data your database can store, if your license includes a data allowance.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DISPLAY_LICENSE()

Privileges

None

Examples

=> SELECT DISPLAY_LICENSE();
                  DISPLAY_LICENSE
---------------------------------------------------
 Vertica Systems, Inc.
2007-08-03
Perpetual
500GB

(1 row)

9.6 - GET_AUDIT_TIME

Reports the time when the automatic audit of database size occurs.

Reports the time when the automatic audit of database size occurs. Vertica performs this audit if your Vertica license includes a data size allowance. For details of this audit, see Managing licenses in the Administrator's Guide. To change the time the audit runs, use the SET_AUDIT_TIME function.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_AUDIT_TIME()

Privileges

None

Examples

=> SELECT get_audit_time();
get_audit_time
-----------------------------------------------------
 The audit is scheduled to run at 11:59 PM each day.
(1 row)

9.7 - GET_COMPLIANCE_STATUS

Displays whether your database is in compliance with your Vertica license agreement.

Displays whether your database is in compliance with your Vertica license agreement. This information includes the results of Vertica's most recent audit of the database size (if your license has a data allowance as part of its terms), the license term (if your license has an end date), and the number of nodes (if your license has a node limit).

GET_COMPLIANCE_STATUS measures data allowance by TBs (where a TB equals 1024⁴ bytes).

The information displayed by GET_COMPLIANCE_STATUS includes:

The estimated size of the database (see Auditing database size for an explanation of the size estimate).
The raw data size allowed by your Vertica license.
The percentage of your allowance that your database is currently using.
The number of nodes and license limit.
The date and time of the last audit.
Whether your database complies with the data allowance terms of your license agreement.
The end date of your license.
How many days remain until your license expires.

Note

If your license does not have a data allowance, end date, or node limit, some of the values might not appear in the output for GET_COMPLIANCE_STATUS.

If the audit shows your license is not in compliance with your data allowance, you should either delete data to bring the size of the database under the licensed amount, or upgrade your license. If your license term has expired, you should contact Vertica immediately to renew your license. See Managing licenses for further details.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_COMPLIANCE_STATUS()

Privileges

None

Examples

=> SELECT GET_COMPLIANCE_STATUS();
 get_compliance_status
--------------------
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization  : 0%
Audit Time   : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.

License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
(1 row)

The following example shows output for a Vertica for SQL on Apache Hadoop cluster.

=> SELECT GET_COMPLIANCE_STATUS();
 get_compliance_status
--------------------
Node count : 4
License Node limit : 5
No size-compliance concerns for an Unlimited license

No expiration date for a Perpetual license
(1 row)

9.8 - SET_AUDIT_TIME

Sets the time that Vertica performs automatic database size audit to determine if the size of the database is compliant with the raw data allowance in your Vertica license.

Sets the time that Vertica performs automatic database size audit to determine if the size of the database is compliant with the raw data allowance in your Vertica license. Use this function if the audits are currently scheduled to occur during your database's peak activity time. This is normally not a concern, since the automatic audit has little impact on database performance.

Audits are scheduled by the preceding audit, so changing the audit time does not affect the next scheduled audit. For example, if your next audit is scheduled to take place at 11:59PM and you use SET_AUDIT_TIME to change the audit schedule 3AM, the previously scheduled 11:59PM audit still runs. As that audit finishes, it schedules the next audit to occur at 3AM.

Vertica always performs the next scheduled audit even where you have changed the audit time using SET_AUDIT_TIME and then triggered an automatic audit by issuing the statement, SELECT AUDIT_LICENSE_SIZE. Only after the next scheduled audit does Vertica begin auditing at the new time you set using SET_AUDIT_TIME. Thereafter, Vertica audits at the new time.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_AUDIT_TIME(time)

time: A string containing the time in 'HH:MM AM/PM' format (for example, '1:00 AM') when the audit should run daily.

Privileges

Superuser

Examples

=> SELECT SET_AUDIT_TIME('3:00 AM');
                            SET_AUDIT_TIME
-----------------------------------------------------------------------
 The scheduled audit time will be set to 3:00 AM after the next audit.
(1 row)

10 - Notifier functions

This section contains functions for using and managing the notifier.

10.1 - GET_DATA_COLLECTOR_NOTIFY_POLICY

Lists any notification policies set on a component.

Lists any notification policies set on a Data collector component.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_DATA_COLLECTOR_NOTIFY_POLICY('component')

component

Name of the Data Collector component to check for notification policies.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

Examples

=> SELECT GET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures');
                   GET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------------------------------------------
 Notifiable;  Notifier: vertica_stats; Channel: vertica_notifications
(1 row)

The following example shows the output from the function when there is no notification policy for the component:


=> SELECT GET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures');
 GET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
 Not notifiable;
(1 row)

10.2 - NOTIFY

Sends a specified message to a NOTIFIER.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

NOTIFY ( 'message', 'notifier', 'target-topic' )

Parameters

message

The message to send to the endpoint.

notifier

The name of the NOTIFIER.

target-topic

String that specifies one of the following based on the notifier type:

Kafka: The name of an existing destination Kafka topic for the message.

Note
If the topic doesn't already exist, you can configure your Kafka broker to automatically create the specified topic. For more information, see the Kafka documentataion.
Syslog: The ProblemDescription subject and channel value.
SNS: The topic ARN.

Privileges

Superuser

Examples

Send a message to confirm that an ETL job is complete:

=> SELECT NOTIFY('ETL Done!', 'my_notifier', 'DB_activity_topic');

10.3 - SET_DATA_COLLECTOR_NOTIFY_POLICY

Creates/enables notification policies for a component.

Creates/enables notification policies for a Data collector component. Notification policies automatically send messages to the specified NOTIFIER when certain events occur.

To view existing notification policies on a Data Collector component, see GET_DATA_COLLECTOR_NOTIFY_POLICY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_DATA_COLLECTOR_NOTIFY_POLICY('component','notifier', 'topic', enabled)

component

Name of the component whose change will be reported via the notifier.

Query system table DATA_COLLECTOR for component names. For example:

=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component;
   component    |          description
----------------+-------------------------------
 DepotEvictions | Files evicted from the Depot
 DepotFetches   | Files fetched to the Depot
 DepotUploads   | Files Uploaded from the Depot
(3 rows)

notifier

Name of the notifier that will send the message.

topic

One of the following:

Kafka: The name of the Kafka topic that will receive the notification message.

Note
If the topic doesn't already exist, you can configure your Kafka broker to automatically create the specified topic. For more information, see the Kafka documentataion.
Syslog: The subject of the field ProblemDescription.
SNS: The topic ARN.

enabled

Boolean value that specifies whether this policy is enabled. Set to TRUE to enable reporting component changes. Set to FALSE to disable the notifier.

Examples

The following example creates an SNS topic, subscribes to it with an SQS queue, and then configures an SNS notifier for the DC component LoginFailures:

Create an SNS topic.
Create an SQS queue.
Subscribe the SQS queue to the SNS topic.

Set SNSAuth with your AWS credentials:

=> ALTER DATABASE DEFAULT SET SNSAuth='VNDDNVOPIUQF917O5PDB:+mcnVONVIbjOnf1ekNis7nm3mE83u9fjdwmlq36Z';

Set SNSRegion:

=> ALTER DATABASE DEFAULT SET SNSRegion='us-east-1'

Enable HTTPS:

=> ALTER DATABASE DEFAULT SET SNSEnableHttps=1;

Create an SNS notifier:

=> CREATE NOTIFIER v_sns_notifier ACTION 'sns' MAXPAYLOAD '256K' MAXMEMORYSIZE '10M' CHECK COMMITTED;

Verify that the SNS notifier, SNS topic, and SQS queue are properly configured:
1. Manually send a message from the notifier to the SNS topic with NOTIFY:
```
=> SELECT NOTIFY('test message', 'v_sns_notifier', 'arn:aws:sns:us-east-1:123456789012:MyTopic')
```
2. Poll the SQS queue for your message.

Attach the SNS notifier to the LoginFailures component with SET_DATA_COLLECTOR_NOTIFY_POLICY:

=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures', 'v_sns_notifier', 'Login failed!', true)

Kafka notifier

To be notified of failed login attempts, you can create a notifier that sends a notification when the DC component LoginFailures updates. The TLSMODE 'verify-ca' verifies that the server's certificate is signed by a trusted CA.

=> CREATE NOTIFIER vertica_stats ACTION 'kafka://kafka01.example.com:9092' MAXMEMORYSIZE '10M' TLSMODE 'verify-ca';
CREATE NOTIFIER
=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','vertica_stats', 'vertica_notifications', true);
SET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
 SET
(1 row)

The following example shows how to disable the policy created in the previous example:

=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','vertica_stats', 'vertica_notifications', false);
 SET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
 SET
(1 row)


=> SELECT GET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures');
 GET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
 Not notifiable;
(1 row)

Syslog notifier

The following example creates a notifier that writes a message to syslog when the Data collector (DC) component LoginFailures updates:

Enable syslog notifiers for the current database:

=> ALTER DATABASE DEFAULT SET SyslogEnabled = 1;

Create and enable a syslog notifier v_syslog_notifier:

=> CREATE NOTIFIER v_syslog_notifier ACTION 'syslog'
    ENABLE
    MAXMEMORYSIZE '10M'
    IDENTIFIED BY 'f8b0278a-3282-4e1a-9c86-e0f3f042a971'
    PARAMETERS 'eventSeverity = 5';

Configure the syslog notifier v_syslog_notifier for updates to the LoginFailures DC component with SET_DATA_COLLECTOR_NOTIFY_POLICY:

=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','v_syslog_notifier', 'Login failed!', true);

This notifier writes the following message to syslog (default location: /var/log/messages) when a user fails to authenticate as the user Bob:

Apr 25 16:04:58
vertica_host_01
vertica:
    Event Posted:
        Event Code:21
        Event Id:0
        Event Severity: Notice [5]
        PostedTimestamp: 2022-04-25 16:04:58.083063
        ExpirationTimestamp: 2022-04-25 16:04:58.083063
        EventCodeDescription: Notifier
        ProblemDescription: (Login failed!)
    {
       "_db":"VMart",
       "_schema":"v_internal",
       "_table":"dc_login_failures",
       "_uuid":"f8b0278a-3282-4e1a-9c86-e0f3f042a971",
       "authentication_method":"Reject",
       "client_authentication_name":"default: Reject",
       "client_hostname":"::1",
       "client_label":"",
       "client_os_user_name":"dbadmin",
       "client_pid":523418,
       "client_version":"",
       "database_name":"dbadmin",
       "effective_protocol":"3.8",
       "node_name":"v_vmart_node0001",
       "reason":"REJECT",
       "requested_protocol":"3.8",
       "ssl_client_fingerprint":"",
       "ssl_client_subject":"",
       "time":"2022-04-25 16:04:58.082568-05",
       "user_name":"Bob"
    }#012
    DatabaseName: VMart
    Hostname: vertica_host_01

11 - Partition functions

This section contains partition management functions specific to Vertica.

11.1 - CALENDAR_HIERARCHY_DAY

Specifies to group DATE partition keys into a hierarchy of years, months, and days.

Specifies to group DATE partition keys into a hierarchy of years, months, and days. The Vertica Tuple Mover regularly evaluates partition keys against the current date, and merges partitions as needed into the appropriate year and month partition groups.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CALENDAR_HIERARCHY_DAY( partition-expression[, active-months[, active-years] ] )

Parameters

partition-expression

The DATE expression on which to group partition keys, which must be identical to the table's PARTITION BY expression.

active-months

An integer ≥ 0 that specifies how many months preceding MONTH(CURRENT_DATE) to store unique partition keys in separate partitions.

If you specify 1, only partition keys of the current month are stored in separate partitions.

If you specify 0, all partition keys of the current month are merged into a partition group for that month.

For details, see Hierarchical partitioning.

Default: 2

active-years

An integer ≥ 0, specifies how many years preceding YEAR(CURRENT_DATE) to partition group keys by month in separate partitions.

If you specify 1, only partition keys of the current year are stored in month partition groups.

If you specify 0, all partition keys of the current and previous years are merged into year partition groups.

For details, see Hierarchical partitioning.

Default: 2

Important

The CALENDAR_HIERARCHY_DAY algorithm assumes that most table activity is focused on recent dates. Setting active-years and active-months to a low number ≥ 2 serves to isolate most merge activity to date-specific containers, and incurs minimal overhead. Vertica recommends that you use the default setting of 2 for active-years and active-months. For most users, these settings achieve an optimal balance between ROS storage and performance.

As a best practice, never set active-years and active-months to 0.

Usage

Specify this function in a table partition clause, as its GROUP BY expression:

PARTITION BY partition-expression
  GROUP BY CALENDAR_HIERARCHY_DAY(
     group-expression
      [, active-months[, active-years] ] )

For example:

=> CREATE TABLE public.store_orders
(
    order_no int,
    order_date timestamp NOT NULL,
    shipper varchar(20),
    ship_date date
);
...
=> ALTER TABLE public.store_orders
      PARTITION BY order_date::DATE
      GROUP BY CALENDAR_HIERARCHY_DAY(order_date::DATE, 3, 2) REORGANIZE;

Examples

See Hierarchical partitioning.

11.2 - COPY_PARTITIONS_TO_TABLE

Copies partitions from one table to another.

Copies partitions from one table to another. This lightweight partition copy increases performance by initially sharing the same storage between two tables. After the copy operation is complete, the tables are independent of each other. Users can perform operations on one table without impacting the other. These operations can increase the overall storage required for both tables.

Note

Although they share storage space, Vertica considers the partitions as discrete objects for license capacity purposes. For example, copying a one TB partition would only consume one TB of space. Your Vertica license, however, considers them as separate objects consuming two TB of space.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

COPY_PARTITIONS_TO_TABLE (
    '[[database.]schema.]source-table',
    'min-range-value',
    'max-range-value',
    '[[database.]schema.]target-table'
     [, 'force-split']
)

Arguments

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

source-table

The source table of the partitions to copy.

min-range-value, max-range-value

The minimum and maximum value of partition keys to copy, where min‑range‑value must be ≤ max‑range‑value. To specify a single partition key, min‑range‑value and max‑range‑value must be equal.

target-table

The target table of the partitions to copy. If the table does not exist, Vertica creates a table from the source table's definition, by calling CREATE TABLE with LIKE and INCLUDING PROJECTIONS clause. The new table inherits ownership from the source table. For details, see Replicating a table.

force-split

Optional Boolean argument, specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:

true: Split ROS containers as needed.
false (default): Return with an error if ROS containers must be split to implement this operation.

Privileges

Non-superuser, one of the following:

Owner of source and target tables
TRUNCATE (if force-split is true) and SELECT on the source table, INSERT on the target table

If the target table does not exist, you must also have CREATE privileges on the target schema to enable table creation.

Table attribute requirements

The following attributes of both tables must be identical:

Column definitions, including NULL/NOT NULL constraints
Segmentation
Partition clause
Number of projections
Projection sort order
Primary and unique key constraints. However, the key constraints do not have to be identically enabled. For more information on constraints, see Constraints.

Note
If the target table has primary or unique key constraints enabled and copying or moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation.
Check constraints. For MOVE_PARTITIONS_TO_TABLE and COPY_PARTITIONS_TO_TABLE, Vertica enforces enabled check constraints on the target table only. For SWAP_PARTITIONS_BETWEEN_TABLES, Vertica enforces enabled check constraints on both tables. If there is a violation of an enabled check constraint, Vertica rolls back the operation.
Number and definitions of text indices.

Additionally, If access policies exist on the source table, the following must be true:

Access policies on both tables must be identical.
One of the following must be true:
- The executing user owns the source table.
- AccessPolicyManagementSuperuserOnly is set to true. See Managing access policies for details.

Table restrictions

The following restrictions apply to the source and target tables:

If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
The target table cannot be immutable.
The following tables cannot be used as sources or targets:
- Temporary tables
- Virtual tables
- System tables
- External tables

Examples

If you call COPY_PARTITIONS_TO_TABLE and the target table does not exist, the function creates the table automatically. In the following example, the target table partn_backup.tradfes_200801 does not exist. COPY_PARTITIONS_TO_TABLE creates the table and replicates the partition. Vertica also copies all the constraints associated with the source table except foreign key constraints.

=> SELECT COPY_PARTITIONS_TO_TABLE (
          'prod_trades',
          '200801',
          '200801',
          'partn_backup.trades_200801');
COPY_PARTITIONS_TO_TABLE
-------------------------------------------------
 1 distinct partition values copied at epoch 15.
(1 row)

11.3 - DROP_PARTITIONS

Drops the specified table partition keys.

Note

This function supersedes meta-function DROP_PARTITION, which was deprecated in Vertica 9.0.

Drops the specified table partition keys.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DROP_PARTITIONS (
    '[[database.]schema.]table-name',
    'min-range-value',
    'max-range-value'
    [, 'force-split']
)

Arguments

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

table-name

The target table. The table cannot be used as a dimension table in a pre-join projection and cannot have out-of-date (unrefreshed) projections.

min-range-value, max-range-value

The minimum and maximum value of partition keys to drop, where min‑range‑value must be ≤ max‑range‑value. To specify a single partition key, min‑range‑value and max‑range‑value must be equal.

force-split

Optional Boolean argument, specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:

true: Split ROS containers as needed.
false (default): Return with an error if ROS containers must be split to implement this operation.

Note

In rare cases, DROP_PARTITIONS executes at the same time as a mergeout operation on the same ROS container. As a result, the function cannot split the container as specified and returns with an error. When this happens, call DROP_PARTITIONS again.

Privileges

One of the following:

DBADMIN
Table owner
USAGE privileges on the table schema and TRUNCATE privileges on the table

Examples

See Dropping partitions.

11.4 - DUMP_PROJECTION_PARTITION_KEYS

Dumps the partition keys of the specified projection.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DUMP_PROJECTION_PARTITION_KEYS( '[[database.]schema.]projection-name')

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection-name: Projection name

Privileges

Non-superuser: TRUNCATE on anchor table

Examples

The following statements create the table and projection online_sales.online_sales_fact and online_sales.online_sales_fact_rep, respectively, and partitions table data by the column call_center_key:

=> CREATE TABLE online_sales.online_sales_fact
(
    sale_date_key int NOT NULL,
    ship_date_key int NOT NULL,
    product_key int NOT NULL,
    product_version int NOT NULL,
    customer_key int NOT NULL,
    call_center_key int NOT NULL,
    online_page_key int NOT NULL,
    shipping_key int NOT NULL,
    warehouse_key int NOT NULL,
    promotion_key int NOT NULL,
    pos_transaction_number int NOT NULL,
    sales_quantity int,
    sales_dollar_amount float,
    ship_dollar_amount float,
    net_dollar_amount float,
    cost_dollar_amount float,
    gross_profit_dollar_amount float,
    transaction_type varchar(16)
)
PARTITION BY (online_sales_fact.call_center_key);

=> CREATE PROJECTION online_sales.online_sales_fact_rep AS SELECT * from online_sales.online_sales_fact unsegmented all nodes;

The following DUMP_PROJECTION_PARTITION_KEYS statement dumps the partition key from the projection online_sales.online_sales_fact_rep:

=> SELECT DUMP_PROJECTION_PARTITION_KEYS('online_sales.online_sales_fact_rep');

Partition keys on node v_vmart_node0001
  Projection 'online_sales_fact_rep'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 200
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 199
   ...
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 2
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 1

 Partition keys on node v_vmart_node0002
  Projection 'online_sales_fact_rep'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 200
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: 199
...
(1 row)

11.5 - DUMP_TABLE_PARTITION_KEYS

Dumps the partition keys of all projections for the specified table.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DUMP_TABLE_PARTITION_KEYS ( '[[database.]schema.]table-name' )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: Name of the table

Privileges

Non-superuser: TRUNCATE on table

Examples

The following example creates a simple table called states and partitions the data by state:

=> CREATE TABLE states (year INTEGER NOT NULL,
       state VARCHAR NOT NULL)
       PARTITION BY state;
=> CREATE PROJECTION states_p (state, year) AS
       SELECT * FROM states
       ORDER BY state, year UNSEGMENTED ALL NODES;

Now dump the partition keys of all projections anchored on table states:

=> SELECT DUMP_TABLE_PARTITION_KEYS( 'states' );
      DUMP_TABLE_PARTITION_KEYS                                                               --------------------------------------------------------------------------------------------
 Partition keys on node v_vmart_node0001
  Projection 'states_p'
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: VT
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: PA
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: NY
   Storage [ROS container]
     No of partition keys: 1
     Partition keys: MA

 Partition keys on node v_vmart_node0002
...
(1 row)

11.6 - MOVE_PARTITIONS_TO_TABLE

Moves partitions from one table to another.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

MOVE_PARTITIONS_TO_TABLE (
    '[[database.]schema.]source-table',
    'min-range-value',
    'max-range-value',
    '[[database.]schema.]target-table'
     [, force-split]
)

Arguments

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

source-table

The source table of the partitions to move.

min-range-value, max-range-value

The minimum and maximum value of partition keys to move, where min‑range‑value must be ≤ max‑range‑value. To specify a single partition key, min‑range‑value and max‑range‑value must be equal.

target-table

The target table of the partitions to move. If the table does not exist, Vertica creates a table from the source table's definition, by calling CREATE TABLE with LIKE and INCLUDING PROJECTIONS clause. The new table inherits ownership from the source table. For details, see Replicating a table.

force-split

Optional Boolean argument, specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:

true: Split ROS containers as needed.
false (default): Return with an error if ROS containers must be split to implement this operation.

Privileges

Non-superuser, one of the following:

Owner of source and target tables
SELECT, TRUNCATE on the source table, INSERT on the target table

If the target table does not exist, you must also have CREATE privileges on the target schema to enable table creation.

Table attribute requirements

The following attributes of both tables must be identical:

Column definitions, including NULL/NOT NULL constraints
Segmentation
Partition clause
Number of projections
Projection sort order
Primary and unique key constraints. However, the key constraints do not have to be identically enabled. For more information on constraints, see Constraints.

Note
If the target table has primary or unique key constraints enabled and copying or moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation.
Check constraints. For MOVE_PARTITIONS_TO_TABLE and COPY_PARTITIONS_TO_TABLE, Vertica enforces enabled check constraints on the target table only. For SWAP_PARTITIONS_BETWEEN_TABLES, Vertica enforces enabled check constraints on both tables. If there is a violation of an enabled check constraint, Vertica rolls back the operation.
Number and definitions of text indices.

Additionally, If access policies exist on the source table, the following must be true:

Access policies on both tables must be identical.
One of the following must be true:
- The executing user owns the source table.
- AccessPolicyManagementSuperuserOnly is set to true. See Managing access policies for details.

Table restrictions

The following restrictions apply to the source and target tables:

If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
The target table cannot be immutable.
The following tables cannot be used as sources or targets:
- Temporary tables
- Virtual tables
- System tables
- External tables

Examples

See Archiving partitions.

11.7 - PARTITION_PROJECTION

Splits containers for a specified projection.

Splits ROS containers for a specified projection. PARTITION_PROJECTION also purges data while partitioning ROS containers if deletes were applied before the AHM epoch.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

PARTITION_PROJECTION ( '[[database.]schema.]projection')

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection``: The projection to partition.

Privileges

Table owner
USAGE privilege on schema

Examples

In this example, PARTITION_PROJECTION forces a split of ROS containers on the states_p projection:

=> SELECT PARTITION_PROJECTION ('states_p');
  PARTITION_PROJECTION
------------------------
 Projection partitioned
(1 row)

11.8 - PARTITION_TABLE

Invokes the to reorganize ROS storage containers as needed to conform with the current partitioning policy.

Invokes the Tuple Mover to reorganize ROS storage containers as needed to conform with the current partitioning policy.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

PARTITION_TABLE ( '[schema.]table-name')

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: The table to partition.

Privileges

Table owner
USAGE privilege on schema

Restrictions

You cannot run PARTITION_TABLE on a table that is an anchor table for a live aggregate projection or a Top-K projection.
To reorganize storage to conform to a new policy, run PARTITION_TABLE after changing the partition GROUP BY expression.

11.9 - PURGE_PARTITION

Purges a table partition of deleted rows.

Purges a table partition of deleted rows. Similar to PURGE and PURGE_PROJECTION, this function removes deleted data from physical storage so you can reuse the disk space. PURGE_PARTITION removes data only from the AHM epoch and earlier.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

PURGE_PARTITION ( '[[database.]schema.]table', partition-key )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: The partitioned table to purge.
partition-key: The key of the partition to purge.

Privileges

Table owner
USAGE privilege on schema

Examples

The following example lists the count of deleted rows for each partition in a table, then calls PURGE_PARTITION() to purge the deleted rows from the data.

=> SELECT partition_key,table_schema,projection_name,sum(deleted_row_count)
   AS deleted_row_count FROM partitions
   GROUP BY partition_key,table_schema,projection_name
   ORDER BY partition_key;

 partition_key | table_schema | projection_name | deleted_row_count
---------------+--------------+-----------------+-------------------
 0             | public       | t_super         |                 2
 1             | public       | t_super         |                 2
 2             | public       | t_super         |                 2
 3             | public       | t_super         |                 2
 4             | public       | t_super         |                 2
 5             | public       | t_super         |                 2
 6             | public       | t_super         |                 2
 7             | public       | t_super         |                 2
 8             | public       | t_super         |                 2
 9             | public       | t_super         |                 1
(10 rows)
=> SELECT PURGE_PARTITION('t',5); -- Purge partition with key 5.
                            purge_partition
------------------------------------------------------------------------
 Task: merge partitions
(Table: public.t) (Projection: public.t_super)
(1 row)

=> SELECT partition_key,table_schema,projection_name,sum(deleted_row_count)
   AS deleted_row_count FROM partitions
   GROUP BY partition_key,table_schema,projection_name
   ORDER BY partition_key;


 partition_key | table_schema | projection_name | deleted_row_count
---------------+--------------+-----------------+-------------------
 0             | public       | t_super         |                 2
 1             | public       | t_super         |                 2
 2             | public       | t_super         |                 2
 3             | public       | t_super         |                 2
 4             | public       | t_super         |                 2
 5             | public       | t_super         |                 0
 6             | public       | t_super         |                 2
 7             | public       | t_super         |                 2
 8             | public       | t_super         |                 2
 9             | public       | t_super         |                 1
(10 rows)

11.10 - SWAP_PARTITIONS_BETWEEN_TABLES

Swaps partitions between two tables.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SWAP_PARTITIONS_BETWEEN_TABLES (
    '[[database.]schema.]staging-table',
    'min-range-value',
    'max-range-value',
    '[[database.]schema.]target-table'
     [, force-split]
)

Arguments

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

staging-table

The staging table from which to swap partitions.

min-range-value, max-range-value

The minimum and maximum value of partition keys to swap, where min‑range‑value must be ≤ max‑range‑value. To specify a single partition key, min‑range‑value and max‑range‑value must be equal.

target-table

The table to which the partitions are to be swapped. The target table cannot be the same as the staging table.

force-split

Optional Boolean argument, specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:

true: Split ROS containers as needed.
false (default): Return with an error if ROS containers must be split to implement this operation.

Privileges

Non-superuser, one of the following:

Owner of source and target tables
Target and source tables: TRUNCATE, INSERT, SELECT

Requirements

The following attributes of both tables must be identical:

Column definitions, including NULL/NOT NULL constraints
Segmentation
Partition clause
Number of projections
Projection sort order
Primary and unique key constraints. However, the key constraints do not have to be identically enabled. For more information on constraints, see Constraints.

Note
If the target table has primary or unique key constraints enabled and copying or moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation.
Check constraints. For MOVE_PARTITIONS_TO_TABLE and COPY_PARTITIONS_TO_TABLE, Vertica enforces enabled check constraints on the target table only. For SWAP_PARTITIONS_BETWEEN_TABLES, Vertica enforces enabled check constraints on both tables. If there is a violation of an enabled check constraint, Vertica rolls back the operation.
Number and definitions of text indices.

Additionally, If access policies exist on the source table, the following must be true:

Access policies on both tables must be identical.
One of the following must be true:
- The executing user owns the target table.
- AccessPolicyManagementSuperuserOnly is set to true.

Restrictions

The following restrictions apply to the source and target tables:

If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
The target table cannot be immutable.
The following tables cannot be used as sources or targets:
- Temporary tables
- Virtual tables
- System tables
- External tables

Examples

See Swapping partitions.

12 - Privileges and access functions

This section contains functions for managing user and role privileges, and access policies.

12.1 - ENABLED_ROLE

Checks whether a Vertica user role is enabled, and returns true or false.

Checks whether a Vertica user role is enabled, and returns true or false. This function is typically used when you create access policies on database roles.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ENABLED_ROLE ( 'role' )

Parameters

role: The role to evaluate.

Privileges

None

Examples

See:

12.2 - GET_PRIVILEGES_DESCRIPTION

Returns the effective privileges the current user has on an object, including explicit, implicit, inherited, and role-based privileges.

Because this meta-function only returns effective privileges, GET_PRIVILEGES_DESCRIPTION only returns privileges with fully-satisfied prerequisites. For a list of prerequisites for common operations, see Privileges required for common database operations.

For example, a user must have the following privileges to query a table:

Schema: USAGE
Table: SELECT

If user Brooke has SELECT privileges on table s1.t1 but lacks USAGE privileges on schema s1, Brooke cannot query the table, and GET_PRIVILEGES_DESCRIPTION does not return SELECT as a privilege for the table.

Note

Inherited privileges are not displayed if privilege inheritance is disabled at the database level with DisableInheritedPrivileges.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_PRIVILEGES_DESCRIPTION( 'type', '[[database.]schema.]name' );

Parameters

type

Specifies an object type, one of the following:

database
table
schema
view
sequence
model
library
resource pool

[database.]schema

Specifies a database and schema, by default the current database and public, respectively.

name

Name of the target object

Privileges

None

Examples

In the following example, user Glenn has set the REPORTER role and wants to check his effective privileges on schema s1 and table s1.articles.

Table s1.articles inherits privileges from its schema (s1).
The REPORTER role has the following privileges:
- SELECT on schema s1
- INSERT WITH GRANT OPTION on table s1.articles
User Glenn has the following privileges:
- UPDATE and USAGE on schema s1.
- DELETE on table s1.articles.

GET_PRIVILEGES_DESCRIPTION returns the following effective privileges for Glenn on schema s1:

=> SELECT GET_PRIVILEGES_DESCRIPTION('schema', 's1');
   GET_PRIVILEGES_DESCRIPTION
--------------------------------
 SELECT, UPDATE, USAGE
(1 row)

GET_PRIVILEGES_DESCRIPTION returns the following effective privileges for Glenn on table s1.articles:


=> SELECT GET_PRIVILEGES_DESCRIPTION('table', 's1.articles');
   GET_PRIVILEGES_DESCRIPTION
--------------------------------
 INSERT*, SELECT, UPDATE, DELETE
(1 row)

12.3 - HAS_ROLE

Checks whether a Vertica user role is granted to the specified user or role, and returns true or false.

You can also query system tables ROLES, GRANTS, and USERS to obtain information on users and their role assignments. For details, see Viewing user roles.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Stable

Syntax

HAS_ROLE( [ 'grantee' ,] 'verify-role' );

Parameters

grantee: Valid only for superusers, specifies the name of a user or role to look up. If this argument is omitted, the function uses the current user name ( CURRENT_USER). If you specify a role, Vertica checks whether this role is granted to the role specified in verify-role.

Important
If a non-superuser supplies this argument, Vertica returns an error.
verify-role: Name of the role to verify for grantee.

Privileges

None

Examples

In the following example, a dbadmin user checks whether user MikeL is assigned the admnistrator role:

=> \c
You are now connected as user "dbadmin".
=> SELECT HAS_ROLE('MikeL', 'administrator');
 HAS_ROLE
----------
 t
(1 row)

User MikeL checks whether he has the regional_manager role:

=> \c - MikeL
You are now connected as user "MikeL".
=> SELECT HAS_ROLE('regional_manager');
 HAS_ROLE
----------
 f
(1 row)

The dbadmin grants the regional_manager role to the administrator role. On checking again, MikeL verifies that he now has the regional_manager role:

dbadmin=> \c
You are now connected as user "dbadmin".
dbadmin=> GRANT regional_manager to administrator;
GRANT ROLE
dbadmin=> \c - MikeL
You are now connected as user "MikeL".
dbadmin=> SELECT HAS_ROLE('regional_manager');
 HAS_ROLE
----------
 t
(1 row)

12.4 - RELEASE_SYSTEM_TABLES_ACCESS

Enables non-superuser access to all system tables.

Allows non-superusers to access all non-SUPERUSER_ONLY system tables. After you call this function, Vertica ignores the IS_ACCESSIBLE_DURING_LOCKDOWN setting in table SYSTEM_TABLES. To restrict non-superusers access to system tables, call RESTRICT_SYSTEM_TABLES_ACCESS.

By default, the database behaves as though RELEASE_SYSTEM_TABLES_ACCESS() was called. That is, non-superusers have access to all non-SUPERUSER_ONLY system tables.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RELEASE_SYSTEM_TABLES_ACCESS()

Privileges

Superuser

Examples

By default, non-superuser Alice has access to client_auth and disk_storage. She also has access to replication_status because she was granted the privilege by the dbadmin:

=> SELECT table_name, is_superuser_only, is_accessible_during_lockdown FROM system_tables WHERE table_name='disk_storage' OR table_name='database_backups' OR table_name='replication_status' OR table_name='client_auth';
     table_name     | is_superuser_only | is_accessible_during_lockdown
--------------------+-------------------+-------------------------------
 client_auth        | f                 | t
 disk_storage       | f                 | f
 database_backups   | t                 | f
 replication_status | t                 | t
(4 rows)

The dbadmin calls RESTRICT_SYSTEM_TABLES_ACCESS:

=> SELECT RESTRICT_SYSTEM_TABLES_ACCESS();
                       RESTRICT_SYSTEM_TABLES_ACCESS
----------------------------------------------------------------------------
 Dropped grants to public on non-accessible during lockdown system tables.

(1 row)

Alice loses access to disk_storage, but she retains access to client_auth and replication_status because their IS_ACCESSIBLE_DURING_LOCKDOWN fields are true:

=> SELECT storage_status FROM disk_storage;
ERROR 4367:  Permission denied for relation disk_storage

The dbadmin calls RELEASE_SYSTEM_TABLES_ACCESS(), restoring Alice's access to disk_storage:

=> SELECT RELEASE_SYSTEM_TABLES_ACCESS();
              RELEASE_SYSTEM_TABLES_ACCESS
--------------------------------------------------------
 Granted SELECT privileges on system tables to public.

(1 row)

12.5 - RESTRICT_SYSTEM_TABLES_ACCESS

Checks system table SYSTEM_TABLES to determine which system tables non-superusers can access.

Prevents non-superusers from accessing tables that have the IS_ACCESSIBLE_DURING_LOCKDOWN flag set to false.

To enable non-superuser access to system tables restricted by this function, call RELEASE_SYSTEM_TABLES_ACCESS.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RESTRICT_SYSTEM_TABLES_ACCESS()

Privileges

Superuser

Examples

By default, client_auth and disk_storage tables are accessible to all users, but only the former is accessible after RESTRICT_SYSTEM_TABLES_ACCESS() is called. Non-superusers never have access to database_backups and replication_status unless explicitly granted the privilege by the dbadmin:

=> SELECT table_name, is_superuser_only, is_accessible_during_lockdown FROM system_tables WHERE table_name='disk_storage' OR table_name='database_backups' OR table_name='replication_status' OR table_name='client_auth';
     table_name     | is_superuser_only | is_accessible_during_lockdown
--------------------+-------------------+-------------------------------
 client_auth        | f                 | t
 disk_storage       | f                 | f
 database_backups   | t                 | f
 replication_status | t                 | t
(4 rows)

The dbadmin then calls RESTRICT_SYSTEM_TABLES_ACCESS():

=> SELECT RESTRICT_SYSTEM_TABLES_ACCESS();
                       RESTRICT_SYSTEM_TABLES_ACCESS
----------------------------------------------------------------------------
 Dropped grants to public on non-accessible during lockdown system tables.

(1 row)

Bob loses access to disk_storage, but retains access to client_auth because its IS_ACCESSIBLE_DURING_LOCKDOWN field is true:

=> SELECT storage_status FROM disk_storage;
ERROR 4367:  Permission denied for relation disk_storage

=> SELECT auth_oid FROM client_auth;
     auth_oid
-------------------
 45035996273705106
 45035996273705110
 45035996273705114
(3 rows)

13 - Projection functions

This section contains projection management functions specific to Vertica.

13.1 - CLEAR_PROJECTION_REFRESHES

Clears information projection refresh history from system table PROJECTION_REFRESHES.

Clears projection refresh history from the PROJECTION_REFRESHES system table. PROJECTION_REFRESHES records information about successful and unsuccessful refresh operations.

CLEAR_PROJECTION_REFRESHES removes information only for refresh operations that are complete, as indicated by the IS_EXECUTING column in PROJECTION_REFRESHES.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_PROJECTION_REFRESHES()

Privileges

Superuser

Examples

=> SELECT CLEAR_PROJECTION_REFRESHES();
 CLEAR_PROJECTION_REFRESHES
----------------------------
 CLEAR
(1 row)

13.2 - EVALUATE_DELETE_PERFORMANCE

Evaluates projections for potential DELETE and UPDATE performance issues.

Evaluates projections for potential DELETE and UPDATE performance issues. If Vertica finds any issues, it issues a warning message. When evaluating multiple projections, EVALUATE_DELETE_PERFORMANCE returns up to ten projections with issues, and the name of a table that lists all issues that it found.

Note

EVALUATE_DELETE_PERFORMANCE returns messages that specifically reference delete performance. Keep in mind, however, that delete and update operations benefit equally from the same optimizations.

For information on resolving delete and update performance issues, see Optimizing DELETE and UPDATE.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

EVALUATE_DELETE_PERFORMANCE ( ['[[database.]schema.]scope'] )

Parameters

`[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

scope

Specifies the projections to evaluate, one of the following:

[table.]projection
Evaluate projection. For example:

SELECT EVALUATE_DELETE_PERFORMANCE('store.store_orders_fact.store_orders_fact_b1');

table
Specifies to evaluate all projections of table. For example:

SELECT EVALUATE_DELETE_PERFORMANCE('store.store_orders_fact');

If you supply no arguments, EVALUATE_DELETE_PERFORMANCE evaluates all projections that you can access. Depending on the size of your database, this can incur considerable overhead.

Privileges

Non-superuser: SELECT privilege on the anchor table

Examples

EVALUATE_DELETE_PERFORMANCE evaluates all projections of table example for potential DELETE and UPDATE performance issues.

=> create table example (A int, B int,C int);
CREATE TABLE
=> create projection one_sort (A,B,C) as (select A,B,C from example) order by A;
CREATE PROJECTION
=> create projection two_sort (A,B,C) as (select A,B,C from example) order by A,B;
CREATE PROJECTION
=> select evaluate_delete_performance('example');
            evaluate_delete_performance
---------------------------------------------------
 No projection delete performance concerns found.
(1 row)

The previous example show that the two projections one_sort and two_sort have no inherent structural issues that might cause poor DELETE performance. However, the data contained within the projection can create potential delete issues if the sorted columns do not uniquely identify a row or small number of rows.

In the following example, Perl is used to populate the table with data using a nested series of loops:

The inner loop populates column C.
The middle loop populates column B.
The outer loop populates column A.

The result is column A contains only three distinct values (0, 1, and 2), while column B slowly varies between 20 and 0 and column C changes in each row:

=> \! perl -e 'for ($i=0; $i<3; $i++) { for ($j=0; $j<21; $j++) { for ($k=0; $k<19; $k++) { printf "%d,%d,%d\n", $i,$j,$k;}}}' | /opt/vertica/bin/vsql -c "copy example from stdin delimiter ',' direct;"
Password:
=> select * from example;
 A | B  | C
---+----+----
 0 | 20 | 18
 0 | 20 | 17
 0 | 20 | 16
 0 | 20 | 15
 0 | 20 | 14
 0 | 20 | 13
 0 | 20 | 12
 0 | 20 | 11
 0 | 20 | 10
 0 | 20 |  9
 0 | 20 |  8
 0 | 20 |  7
 0 | 20 |  6
 0 | 20 |  5
 0 | 20 |  4
 0 | 20 |  3
 0 | 20 |  2
 0 | 20 |  1
 0 | 20 |  0
 0 | 19 | 18
 ...
 2 |  1 |  0
 2 |  0 | 18
 2 |  0 | 17
 2 |  0 | 16
 2 |  0 | 15
 2 |  0 | 14
 2 |  0 | 13
 2 |  0 | 12
 2 |  0 | 11
 2 |  0 | 10
 2 |  0 |  9
 2 |  0 |  8
 2 |  0 |  7
 2 |  0 |  6
 2 |  0 |  5
 2 |  0 |  4
 2 |  0 |  3
 2 |  0 |  2
 2 |  0 |  1
 2 |  0 |  0
=> SELECT COUNT (*) FROM example;
 COUNT
-------
  1197
(1 row)
=> SELECT COUNT (DISTINCT A) FROM example;
 COUNT
-------
     3
(1 row)

EVALUATE_DELETE_PERFORMANCE is run against the projections again to determine whether the data within the projections causes any potential DELETE performance issues. Projection one_sort has potential delete issues as it only sorts on column A which has few distinct values. Each value in the sort column corresponds to many rows in the projection, which can adversely impact DELETE performance. In contrast, projection two_sort is sorted on columns A and B, where each combination of values in the two sort columns identifies just a few rows, so deletes can be performed faster:


=> select evaluate_delete_performance('example');
            evaluate_delete_performance
---------------------------------------------------
 The following projections exhibit delete performance concerns:
        "public"."one_sort_b1"
        "public"."one_sort_b0"
See v_catalog.projection_delete_concerns for more details.

=> \x
Expanded display is on.
dbadmin=> select * from projection_delete_concerns;
-[ RECORD 1 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------
projection_id      | 45035996273878562
projection_schema  | public
projection_name    | one_sort_b1
creation_time      | 2019-06-17 13:59:03.777085-04
last_modified_time | 2019-06-17 14:00:27.702223-04
comment            | The squared number of rows matching each sort key is about 159201 on average.
-[ RECORD 2 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------
projection_id      | 45035996273878548
projection_schema  | public
projection_name    | one_sort_b0
creation_time      | 2019-06-17 13:59:03.777279-04
last_modified_time | 2019-06-17 13:59:03.777279-04
comment            | The squared number of rows matching each sort key is about 159201 on average.

If you omit supplying an argument to EVALUATE_DELETE_PERFORMANCE, it evaluates all projections that you can access:

=> select evaluate_delete_performance();
                          evaluate_delete_performance
---------------------------------------------------------------------------
 The following projections exhibit delete performance concerns:
        "public"."one_sort_b0"
        "public"."one_sort_b1"
See v_catalog.projection_delete_concerns for more details.
(1 row)

13.3 - GET_PROJECTION_SORT_ORDER

Returns the order of columns in a projection's ORDER BY clause.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_PROJECTION_SORT_ORDER( '[[database.]schema.]projection' );

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: The target projection.

Privileges

Non-superuser: SELECT privilege on the anchor table

Examples

=> SELECT get_projection_sort_order ('store_orders_super');
                                 get_projection_sort_order
--------------------------------------------------------------------------------------------
 public.store_orders_super [Sort Cols: "order_no", "order_date", "shipper", "ship_date"]

(1 row)

13.4 - GET_PROJECTION_STATUS

Returns information relevant to the status of a :.

Returns information relevant to the status of a projection:

The current K-safety status of the database
The number of nodes in the database
Whether the projection is segmented
The number and names of buddy projections
Whether the projection is safe
Whether the projection is up to date
Whether statistics have been computed for the projection

Use GET_PROJECTION_STATUS to monitor the progress of a projection data refresh.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_PROJECTION_STATUS ( '[[database.]schema.]projection' );

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: The projection for which to display status.

Examples

=> SELECT GET_PROJECTION_STATUS('public.customer_dimension_site01');
                                     GET_PROJECTION_STATUS
-----------------------------------------------------------------------------------------------
 Current system K is 1.
# of Nodes: 4.
public.customer_dimension_site01 [Segmented: No] [Seg Cols: ] [K: 3] [public.customer_dimension_site04, public.customer_dimension_site03,
public.customer_dimension_site02]
[Safe: Yes] [UptoDate: Yes][Stats: Yes]

13.5 - GET_PROJECTIONS

Returns contextual and projection information about projections of the specified anchor table.

Contextual information

Database K-safety
Number of database nodes
Number of projections for this table

Projection data

For each projection, specifies:

All buddy projections
Whether it is segmented
Whether it is safe
Whether it is up-to-date.

You can also use GET_PROJECTIONS to monitor the progress of a projection data refresh.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_PROJECTIONS ( '[[database.]schema-name.]table' )

Parameters

`[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Anchor table of the projections to list.

Privileges

None

Examples

The following example gets information about projections for VMart table store.store_dimension:

=> SELECT GET_PROJECTIONS('store.store_dimension');
-[ RECORD 1 ]---+
GET_PROJECTIONS | Current system K is 1.
# of Nodes: 3.
Table store.store_dimension has 2 projections.

Projection Name: [Segmented] [Seg Cols] [# of Buddies] [Buddy Projections] [Safe] [UptoDate] [Stats]
----------------------------------------------------------------------------------------------------
store.store_dimension_b1 [Segmented: Yes] [Seg Cols: "store.store_dimension.store_key"] [K: 1] [store.store_dimension_b0] [Safe: Yes] [UptoDate: Yes] [Stats: RowCounts]
store.store_dimension_b0 [Segmented: Yes] [Seg Cols: "store.store_dimension.store_key"] [K: 1] [store.store_dimension_b1] [Safe: Yes] [UptoDate: Yes] [Stats: RowCounts]

13.6 - PURGE_PROJECTION

PURGE_PROJECTION can use significant disk space while purging the data.

Permanently removes deleted data from physical storage so disk space can be reused. You can purge historical data up to and including the Ancient History Mark epoch.

Caution

PURGE_PROJECTION can use significant disk space while purging the data.

See PURGE for details about purge operations.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

PURGE_PROJECTION ( '[[database.]schema.]projection' )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection: The projection to purge.

Privileges

Table owner
USAGE privilege on schema

Examples

The following example purges all historical data in projection tbl_p that precedes the Ancient History Mark epoch.

=> CREATE TABLE tbl (x int, y int);
CREATE TABLE
=> INSERT INTO tbl VALUES(1,2);
 OUTPUT
--------
      1
(1 row)

=> INSERT INTO tbl VALUES(3,4);
 OUTPUT
--------
      1
(1 row)

dbadmin=> COMMIT;
COMMIT
=> CREATE PROJECTION tbl_p AS SELECT x FROM tbl UNSEGMENTED ALL NODES;
WARNING 4468: Projection <public.tbl_p> is not available for query processing.
Execute the select start_refresh() function to copy data into this projection.
The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh
CREATE PROJECTION
=> SELECT START_REFRESH();
             START_REFRESH
----------------------------------------
 Starting refresh background process.
=> DELETE FROM tbl WHERE x=1;
 OUTPUT
--------
      1
(1 row)

=> COMMIT;
COMMIT
=> SELECT MAKE_AHM_NOW();
         MAKE_AHM_NOW
-------------------------------
 AHM set (New AHM Epoch: 9066)
(1 row)

=> SELECT PURGE_PROJECTION ('tbl_p');
 PURGE_PROJECTION
-------------------
 Projection purged
(1 row)

13.7 - REFRESH

Synchronously refreshes one or more table projections in the foreground, and updates the PROJECTION_REFRESHES system table.

Synchronously refreshes one or more table projections in the foreground, and updates the PROJECTION_REFRESHES system table. If you run REFRESH with no arguments, it refreshes all projections that contain stale data.

To understand projection refreshing in detail, see Refreshing projections.

If a refresh would violate a table or schema disk quota, the operation fails. For more information, see Disk quotas.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

REFRESH ( [ '[[database.]schema.]table[,...]' ] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: The anchor table of the projections to refresh. If you specify multiple tables, REFRESH attempts to refresh them in parallel. Such calls are part of the Database Designer deployment (and deployment script).

Returns

Note

If REFRESH does not refresh any projections, it returns a header string with no results.

Column	Returns
`Projection Name`	The projection targeted for refresh.
`Anchor Table`	The projection's associated anchor table.
`Status`	Projections' refresh status: `queued`: Queued for refresh. `refreshing`: Refresh is in process. `refreshed`: Refresh successfully completed. `failed`: Refresh did not successfully complete.
`Refresh Method`	Method used to refresh the projection.
`Error Count`	Number of times a refresh failed for the projection.
`Duration (sec)`	How long (in seconds) the projection refresh ran.

Privileges

Superuser
Owner of the specified tables

Refresh methods

Vertica can refresh a projection from one of its buddies, if one is available. In this case, the target projection gets the source buddy's historical data. Otherwise, the projection is refreshed from scratch with data of the latest epoch at the time of the refresh operation. In this case, the projection cannot participate in historical queries on any epoch that precedes the refresh operation.

Vertica can perform incremental refreshes when the following conditions are met:

The table being refreshed is partitioned.
The table does not contain any unpartitioned data.
The operation is a full projection refresh (not a partition range projection refresh).

In an incremental refresh, the refresh operation first loads data from the partition with the highest range of keys. After refreshing this partition, Vertica begins to refresh the partition with next highest partition range. This process continues until all projection partitions are refreshed. While the refresh operation is in progress, projection partitions that have completed the refresh process become available to process query requests.

The method used to refresh a given projection is recorded in the REFRESH_METHOD column of the PROJECTION_REFRESHES system table.

Examples

The following example refreshes the projections in two tables:

=> SELECT REFRESH('t1, t2');
                                             REFRESH
----------------------------------------------------------------------------------------
Refresh completed with the following outcomes:

Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
----------------------------------------------------------------------------------------

"public"."t1_p": [t1] [refreshed] [scratch] [0] [0]"public"."t2_p": [t2] [refreshed] [scratch] [0] [0]

In the following example, only the projection on one table was refreshed:

=> SELECT REFRESH('allow, public.deny, t');
                                               REFRESH
----------------------------------------------------------------------------------------

Refresh completed with the following outcomes:

Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
----------------------------------------------------------------------------------------
"n/a"."n/a": [n/a] [failed: insufficient permissions on table "allow"] [] [1] [0]
"n/a"."n/a": [n/a] [failed: insufficient permissions on table "public.deny"] [] [1] [0]
"public"."t_p1": [t] [refreshed] [scratch] [0] [0]

13.8 - REFRESH_COLUMNS

Refreshes table columns that are defined with the constraint SET USING or DEFAULT USING.

Refreshes table columns that are defined with the constraint SET USING or DEFAULT USING. All refresh operations associated with a call to REFRESH_COLUMNS belong to the same transaction. Thus, all tables and columns specified by REFRESH_COLUMNS must be refreshed; otherwise, the entire operation is rolled back.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

REFRESH_COLUMNS ( 'table-list', '[column-list]'
   [, '[refresh-mode]' [, min-partition-key, max-partition-key [, force-split] ]
)

Arguments

table-list

A comma-delimited list of the tables to refresh: [[database.]schema.]table[,...]

If you specify multiple tables, refresh-mode must be set to REBUILD.

column-list

A comma-delimited list of columns to refresh: [[[database.]schema.]table.]column[,...] or [[database.]schema.]table.*, where asterisk (*) means to refresh all SET USING/DEFAULT USING columns in the table. For example:

SELECT REFRESH_COLUMNS ('t1, t2', 't1.*, t2.b', 'REBUILD');

If column-list is set to an empty string (''), REFRESH_COLUMNS refreshes all SET USING/DEFAULT USING columns in the specified tables.

The following requirements apply:

All specified columns must have a SET USING or DEFAULT USING constraint.
If REFRESH_COLUMNS specifies multiple tables, all column names must be qualified by their table names. If the target tables span multiple schemas, all column names must be fully qualified by their schema and table names. For example:
```
SELECT REFRESH_COLUMNS ('t1, t2', 't1.a, t2.b', 'REBUILD');
```

If you specify a database, it must be the current database.

refresh-mode

Specifies how to refresh SET USING columns:

UPDATE (default): Marks original rows as deleted and replaces them with new rows. In order to save these updates, you must issue a COMMIT statement.
REBUILD: Replaces all data in the specified columns. The rebuild operation is auto-committed.

If you specify multiple tables, you must explicitly specify REBUILD mode.

In both cases, REFRESH_COLUMNS returns an error if any SET USING column is defined as a primary or unique key in a table that enforces those constraints.

See REBUILD Mode Restrictions for limitations on using the REBUILD option.

min-partition-key, max-partition-key

Qualifies REBUILD mode, limiting the rebuild operation to one or more partitions. To specify a range of partitions, max-partition-key must be greater than min-partition-key. To update one partition, the two arguments must be equal.

The following requirements apply:

The function can specify only one table to refresh.
The table must be partitioned on the specified keys.

You can use these arguments to refresh columns with recently loaded data—that is, data in the latest partitions. Using this option regularly can significantly minimize the overhead otherwise incurred by rebuilding entire columns in a large table.

See Partition-based REBUILD below for details.

force-split

Boolean, whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:

true (default): Split ROS containers as needed.
false: Return with an error if ROS containers must be split to implement this operation.

Privileges

Schemas of queried and flattened tables: USAGE
Queried table: SELECT
Flattened table: SELECT, UPDATE

UPDATE versus REBUILD modes

In general, UPDATE mode is a better choice when changes to SET USING column data are confined to a relatively small number of rows. Use REBUILD mode when a significant amount of SET USING column data is stale and must be updated. It is generally good practice to call REFRESH_COLUMNS with REBUILD on any new SET USING column—for example, to populate a SET USING column after adding it with ALTER TABLE...ADD COLUMN.

REBUILD mode restrictions

If you call REFRESH_COLUMNS on a SET USING column and specify the refresh mode as REBUILD, Vertica returns an error if the column is specified in any of the following:

Table's partition key
Unsegmented projection
Projection with expressions, or any live aggregate projection that invokes a user-defined transform function (UDTF)
Sort order or segmentation of any projection
Any projection that omits an anchor table column that is referenced in the column's SET USING expression
GROUPED clause of any projection

Partition-based REBUILD operations

If a flattened table is partitioned, you can reduce the overhead of calling REFRESH_COLUMNS in REBUILD mode, by specifying one or more partition keys. Doing so limits the rebuild operation to the specified partitions. For example, table public.orderFact is defined with SET USING column cust_name. This table is partitioned on column order_date, where the partition clause invokes Vertica function CALENDAR_HIERARCHY_DAY. Thus, you can call REFRESH_COLUMNS on specific time-delimited partitions of this table—in this case, on orders over the last two months:

=> SELECT REFRESH_COLUMNS ('public.orderFact',
                        'cust_name',
                        'REBUILD',
                        TO_CHAR(ADD_MONTHS(current_date, -2),'YYYY-MM')||'-01',
                        TO_CHAR(LAST_DAY(ADD_MONTHS(current_date, -1))));
      REFRESH_COLUMNS
---------------------------
 refresh_columns completed
(1 row)

Rewriting SET USING queries

When you call REFRESH_COLUMNS on a flattened table's SET USING (or DEFAULT USING) column, it executes the SET USING query by joining the target and source tables. By default, the source table is always the inner table of the join. In most cases, cardinality of the source table is less than the target table, so REFRESH_COLUMNS executes the join efficiently.

Occasionally—notably, when you call REFRESH_COLUMNS on a partitioned table—the source table can be larger than the target table. In this case, performance of the join operation can be suboptimal.

You can address this issue by enabling configuration parameter RewriteQueryForLargeDim. When enabled (1), Vertica rewrites the query, by reversing the inner and outer join between the target and source tables.

Important

Enable this parameter only if the SET USING source data is in a table that is larger than the target table. If the source data is in a table smaller than the target table, then enabling RewriteQueryForLargeDim can adversely affect refresh performance.

Examples

See Flattened table example and DEFAULT versus SET USING.

13.9 - START_REFRESH

Refreshes projections in the current schema with the latest data of their respective.

Refreshes projections in the current schema with the latest data of their respective anchor tables. START_REFRESH runs asynchronously in the background, and updates the PROJECTION_REFRESHES system table. This function has no effect if a refresh is already running.

To refresh only projections of a specific table, use REFRESH. When you deploy a design through Database Designer, it automatically refreshes its projections.

If a refresh would violate a table or schema disk quota, the operation fails. For more information, see Disk quotas.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

START_REFRESH()

Privileges

None

Requirements

All nodes must be up.

Refresh methods

Vertica can perform incremental refreshes when the following conditions are met:

The table being refreshed is partitioned.
The table does not contain any unpartitioned data.
The operation is a full projection refresh (not a partition range projection refresh).

The method used to refresh a given projection is recorded in the REFRESH_METHOD column of the PROJECTION_REFRESHES system table.

Examples

=> SELECT START_REFRESH();
             START_REFRESH
----------------------------------------
 Starting refresh background process.
(1 row)

14 - Session functions

This section contains session management functions specific to Vertica.

See also the SQL system table V_MONITOR.SESSIONS.

14.1 - CANCEL_REFRESH

Cancels refresh-related internal operations initiated by START_REFRESH and REFRESH.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CANCEL_REFRESH()

Privileges

None

Notes

Refresh tasks run in a background thread in an internal session, so you cannot use INTERRUPT_STATEMENT to cancel those statements. Instead, use CANCEL_REFRESH to cancel statements that are run by refresh-related internal sessions.
Run CANCEL_REFRESH() on the same node on which START_REFRESH() was initiated.
CANCEL_REFRESH() cancels the refresh operation running on a node, waits for the cancelation to complete, and returns SUCCESS.
Only one set of refresh operations runs on a node at any time.

Examples

Cancel a refresh operation executing in the background.

=> SELECT START_REFRESH();
             START_REFRESH
----------------------------------------
Starting refresh background process.
(1 row)
=> SELECT CANCEL_REFRESH();
              CANCEL_REFRESH
----------------------------------------
Stopping background refresh process.
(1 row)

14.2 - CLOSE_ALL_SESSIONS

Closes all external sessions except the one that issues this function.

Closes all external sessions except the one that issues this function. Call this function before shutting down the Vertica database.

Vertica closes sessions asynchronously, so another session can open before this function returns. In this case, reissue this function. To view the status of all open sessions, query system table SESSIONS.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLOSE_ALL_SESSIONS()

Privileges

Non-superuser: None to close your own session

Examples

Two user sessions are open on separate nodes:

=> SELECT * FROM sessions;
-[ RECORD 1 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (select * from sessions;)
statement_start            | 2011-01-03 15:36:13.896288
statement_id               | 10
last_statement_duration_us | 14978
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 2 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0002
user_name                  | dbadmin
client_hostname            | 127.0.0.1:57174
client_pid                 | 30117
login_timestamp            | 2011-01-03 15:33:00.842021-05
session_id                 | stress05-27944:0xc1a
client_label               |
transaction_start          | 2011-01-03 15:34:46.538102
transaction_id             | -1
transaction_description    | user dbadmin (COPY Mart_Fact FROM '/data/mart_Fact.tbl'
                             DELIMITER '|' NULL '\\n';)
statement_start            | 2011-01-03 15:34:46.538862
statement_id               |
last_statement_duration_us | 26250
current_statement          | COPY Mart_Fact FROM '/data/Mart_Fact.tbl' DELIMITER '|'
                             NULL '\\n';
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 3 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0003
user_name                  | dbadmin
client_hostname            | 127.0.0.1:56367
client_pid                 | 1191
login_timestamp            | 2011-01-03 15:31:44.939302-05
session_id                 | stress06-25663:0xbec
client_label               |
transaction_start          | 2011-01-03 15:34:51.05939
transaction_id             | 54043195528458775
transaction_description    | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
                             DELIMITER '|' NULL '\\n' DIRECT;)
statement_start            | 2011-01-03 15:35:46.436748
statement_id               |
last_statement_duration_us | 1591403
current_statement          | COPY Mart_Fact FROM '/data/Mart_Fact.tbl' DELIMITER '|'
                             NULL '\\n' DIRECT;
ssl_state                  | None
authentication_method      | Trust

Close all sessions:

=> \x
Expanded display is off.
=> SELECT CLOSE_ALL_SESSIONS();
                           CLOSE_ALL_SESSIONS
-------------------------------------------------------------------------
 Close all sessions command sent. Check v_monitor.sessions for progress.
(1 row)

Session contents after issuing CLOSE_ALL_SESSIONS:

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+----------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (SELECT * FROM sessions;)
statement_start            | 2011-01-03 16:19:56.720071
statement_id               | 25
last_statement_duration_us | 15605
current_statement          | SELECT * FROM SESSIONS;
ssl_state                  | None
authentication_method      | Trust

14.3 - CLOSE_SESSION

Interrupts the specified external session, rolls back the current transaction if any, and closes the socket.

Interrupts the specified external session, rolls back the current transaction if any, and closes the socket. You can only close your own session.

It might take some time before a session is closed. To view the status of all open sessions, query the system table SESSIONS.

For detailed information about session management options, see Managing sessions.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLOSE_SESSION ( 'sessionid')

Parameters

sessionid: A string that specifies the session to close. This identifier is unique within the cluster at any point in time but can be reused when the session closes.

Privileges

None

Examples

User session opened. Record 2 shows the user session running a COPY DIRECT statement.

=> SELECT * FROM sessions;
-[ RECORD 1 ]--------------+-----------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (SELECT * FROM sessions;)
statement_start            | 2011-01-03 15:36:13.896288
statement_id               | 10
last_statement_duration_us | 14978
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 2 ]--------------+-----------------------------------------------
node_name                  | v_vmartdb_node0002
user_name                  | dbadmin
client_hostname            | 127.0.0.1:57174
client_pid                 | 30117
login_timestamp            | 2011-01-03 15:33:00.842021-05
session_id                 | stress05-27944:0xc1a
client_label               |
transaction_start          | 2011-01-03 15:34:46.538102
transaction_id             | -1
transaction_description    | user dbadmin (COPY ClickStream_Fact FROM
                             '/data/clickstream/1g/ClickStream_Fact.tbl'
                             DELIMITER '|' NULL '\\n' DIRECT;)
statement_start            | 2011-01-03 15:34:46.538862
statement_id               |
last_statement_duration_us | 26250
current_statement          | COPY ClickStream_Fact FROM '/data/clickstream
                             /1g/ClickStream_Fact.tbl' DELIMITER '|' NULL
                             '\\n' DIRECT;
ssl_state                  | None
authentication_method      | Trust

Close user session stress05-27944:0xc1a

=> \x
Expanded display is off.
=> SELECT CLOSE_SESSION('stress05-27944:0xc1a');
                           CLOSE_SESSION
--------------------------------------------------------------------
 Session close command sent. Check v_monitor.sessions for progress.
(1 row)

Query the sessions table again for current status, and you can see that the second session has been closed:

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+--------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (select * from SESSIONS;)
statement_start            | 2011-01-03 16:12:07.841298
statement_id               | 20
last_statement_duration_us | 2099
current_statement          | SELECT * FROM SESSIONS;
ssl_state                  | None
authentication_method      | Trust

14.4 - CLOSE_USER_SESSIONS

Stops the session for a user, rolls back any transaction currently running, and closes the connection.

Stops the session for a user, rolls back any transaction currently running, and closes the connection. To determine the status of the sessions to close, query the SESSIONS table.

Note

Running this function on your own sessions leaves one session running.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLOSE_USER_SESSIONS ( 'user-name' )

Parameters

user-name: Specifies the user whose sessions are to be closed. If you specify your own user name, Vertica closes all sessions except the one in which you issue this function.

Privileges

DBADMIN

Examples

This example closes all active session for user u1:

=> SELECT close_user_sessions('u1');

14.5 - GET_NUM_ACCEPTED_ROWS

Returns the number of rows loaded into the database for the last completed load for the current session.

Returns the number of rows loaded into the database for the last completed load for the current session. GET_NUM_ACCEPTED_ROWS is a meta-function. Do not use it as a value in an INSERT query.

The number of accepted rows is not available for a load that is currently in process. Check the LOAD_STREAMS system table for its status.

This meta-function supports loads from STDIN, COPY LOCAL from a Vertica client, or a single file on the initiator. You cannot use GET_NUM_ACCEPTED_ROWS for multi-node loads.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_NUM_ACCEPTED_ROWS();

Privileges

None

Note

The data regarding accepted rows from the last load during the current session does not persist, and is lost when you initiate a new load.

Examples

This examples shows the number of accepted rows from the vmart_load_data.sql meta-command.

=> \i vmart_load_data.sql;
=> SELECT GET_NUM_ACCEPTED_ROWS ();
GET_NUM_ACCEPTED_ROWS
-----------------------
300000
(1 row)

14.6 - GET_NUM_REJECTED_ROWS

Returns the number of rows that were rejected during the last completed load for the current session.

Returns the number of rows that were rejected during the last completed load for the current session. GET_NUM_REJECTED_ROWS is a meta-function. Do not use it as a value in an INSERT query.

Rejected row information is unavailable for a load that is currently running. The number of rejected rows is not available for a load that is currently in process. Check the LOAD_STREAMS system table for its status.

This meta-function supports loads from STDIN, COPY LOCAL from a Vertica client, or a single file on the initiator. You cannot use GET_NUM_REJECTED_ROWS for multi-node loads.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

GET_NUM_REJECTED_ROWS();

Privileges

None

Note

The data regarding rejected rows from the last load during the current session does not persist, and is dropped when you initiate a new load.

Examples

This example shows the number of rejected rows from the vmart_load_data.sql meta-command.

=>  \i vmart_load_data.sql
=> SELECT GET_NUM_REJECTED_ROWS ();
GET_NUM_REJECTED_ROWS
-----------------------
0
(1 row)

14.7 - INTERRUPT_STATEMENT

Interrupts the specified statement in a user session, rolls back the current transaction, and writes a success or failure message to the log file.

Sessions can be interrupted during statement execution. Only statements run by user sessions can be interrupted.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

INTERRUPT_STATEMENT( 'session-id', statement-id)

Parameters

session-id: Identifies the session to interrupt. This identifier is unique within the cluster at any point in time.
statement-id: Identifies the statement to interrupt. If the *statement-id* is valid, the statement can be interrupted and INTERRUPT_STATEMENT returns a success message. Otherwise the system returns an error.

Privileges

Superuser

Messages

The following list describes messages you might encounter:

Message	Meaning
`Statement interrupt sent. Check SESSIONS for progress.`	This message indicates success.
`Session <id> could not be successfully interrupted: session not found.`	The session ID argument to the interrupt command does not match a running session.
`Session <id> could not be successfully interrupted: statement not found.`	The statement ID does not match (or no longer matches) the ID of a running statement (if any).
`No interruptible statement running`	The statement is DDL or otherwise non-interruptible.
`Internal (system) sessions cannot be interrupted.`	The session is internal, and only statements run by external sessions can be interrupted.

Examples

Two user sessions are open. RECORD 1 shows user session running SELECT FROM SESSION, and RECORD 2 shows user session running COPY DIRECT:

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (select * from sessions;)
statement_start            | 2011-01-03 15:36:13.896288
statement_id               | 10
last_statement_duration_us | 14978
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 2 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0003
user_name                  | dbadmin
client_hostname            | 127.0.0.1:56367
client_pid                 | 1191
login_timestamp            | 2011-01-03 15:31:44.939302-05
session_id                 | stress06-25663:0xbec
client_label               |
transaction_start          | 2011-01-03 15:34:51.05939
transaction_id             | 54043195528458775
transaction_description    | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
                             DELIMITER '|' NULL '\\n' DIRECT;)
statement_start            | 2011-01-03 15:35:46.436748
statement_id               | 5
last_statement_duration_us | 1591403
current_statement          | COPY Mart_Fact FROM '/data/Mart_Fact.tbl' DELIMITER '|'
                             NULL '\\n' DIRECT;
ssl_state                  | None
authentication_method      | Trust

Interrupt the COPY DIRECT statement running in session stress06-25663:0xbec:

=> \x
Expanded display is off.
=> SELECT INTERRUPT_STATEMENT('stress06-25663:0x1537', 5);
                       interrupt_statement
------------------------------------------------------------------
 Statement interrupt sent. Check v_monitor.sessions for progress.
(1 row)

Verify that the interrupted statement is no longer active by looking at the current_statement column in the SESSIONS system table. This column becomes blank when the statement is interrupted:

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0001
user_name                  | dbadmin
client_hostname            | 127.0.0.1:52110
client_pid                 | 4554
login_timestamp            | 2011-01-03 14:05:40.252625-05
session_id                 | stress04-4325:0x14
client_label               |
transaction_start          | 2011-01-03 14:05:44.325781
transaction_id             | 45035996273728326
transaction_description    | user dbadmin (select * from sessions;)
statement_start            | 2011-01-03 15:36:13.896288
statement_id               | 10
last_statement_duration_us | 14978
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 2 ]--------------+----------------------------------------------------
node_name                  | v_vmartdb_node0003
user_name                  | dbadmin
client_hostname            | 127.0.0.1:56367
client_pid                 | 1191
login_timestamp            | 2011-01-03 15:31:44.939302-05
session_id                 | stress06-25663:0xbec
client_label               |
transaction_start          | 2011-01-03 15:34:51.05939
transaction_id             | 54043195528458775
transaction_description    | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
                             DELIMITER '|' NULL '\\n' DIRECT;)
statement_start            | 2011-01-03 15:35:46.436748
statement_id               | 5
last_statement_duration_us | 1591403
current_statement          |
ssl_state                  | None
authentication_method      | Trust

14.8 - RELEASE_ALL_JVM_MEMORY

Forces all sessions to release the memory consumed by their Java Virtual Machines (JVM).

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RELEASE_ALL_JVM_MEMORY();

Privileges

Must be a superuser.

Examples

The following example demonstrates viewing the JVM memory use in all open sessions, then calling RELEASE_ALL_JVM_MEMORY() to release the memory:

=> select user_name,external_memory_kb FROM V_MONITOR.SESSIONS;
 user_name | external_memory_kb
-----------+---------------
 dbadmin   |         79705
(1 row)

=> SELECT RELEASE_ALL_JVM_MEMORY();
                           RELEASE_ALL_JVM_MEMORY
-----------------------------------------------------------------------------
 Close all JVM sessions command sent. Check v_monitor.sessions for progress.
(1 row)

=> SELECT user_name,external_memory_kb FROM V_MONITOR.SESSIONS;
 user_name | external_memory_kb
-----------+---------------
 dbadmin   |             0
(1 row)

14.9 - RELEASE_JVM_MEMORY

Terminates a Java Virtual Machine (JVM), making available the memory the JVM was using.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RELEASE_JVM_MEMORY();

Privileges

None.

Examples

User session opened. RECORD 2 shows the user session running COPY DIRECT statement.

=> SELECT RELEASE_JVM_MEMORY();
           release_jvm_memory
-----------------------------------------
Java process killed and memory released
(1 row)

14.10 - RESERVE_SESSION_RESOURCE

Reserves memory resources from the general resource pool for the exclusive use of the Vertica backup and restore process.

Reserves memory resources from the general resource pool for the exclusive use of the Vertica backup and restore process. No other Vertica process can access reserved resources. If insufficient resources are available, Vertica queues the reservation request.

This meta-function is a session level reservation. When a session ends Vertica automatically releases any resources reserved in that session. Because the meta-function operates at the session level, the resource name does not need to be unique across multiple sessions.

You can view reserved resources by querying the SESSIONS table.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RESERVE_SESSION_RESOURCE ( 'name', memory)

Parameters

name: The name of the resource to reserve.
memory: The amount of memory in kilobytes to allocate to the resource.

Privileges

None

Examples

Reserve 1024 kilobytes of memory for the backup and restore process:

=> SELECT reserve_session_resource('VBR_RESERVE',1024);
   -[ RECORD 1 ]------------+----------------
   reserve_session_resource | Grant succeed

14.11 - RESET_SESSION

Applies your default connection string configuration settings to your current session.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RESET_SESSION()

Examples

The following example shows how you use RESET_SESSION.

Resets the current client connection string to the default connection string settings:

=> SELECT RESET_SESSION();
    RESET_SESSION
----------------------
 Reset session: done.
(1 row)

15 - Storage functions

This section contains storage management functions specific to Vertica.

15.1 - ALTER_LOCATION_LABEL

Adds a label to a storage location, or changes or removes an existing label.

Adds a label to a storage location, or changes or removes an existing label. You can change a location label if it is not specified by any storage policy.

Caution

If you label a storage location that contains data, Vertica moves the data to an unlabeled location, if one exists. To prevent data movement between storage locations, labels should be applied either to all storage locations or none.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ALTER_LOCATION_LABEL ( 'path' , '[node]' , '[location-label]' )

Parameters

path

The storage location path.

node

The node where the label change is applied. If you supply an empty string, Vertica applies the change across all cluster nodes.

location-label

The label to assign to the specified storage location.

If you supply an empty string, Vertica removes that storage location's label.

You can remove a location label only if the following conditions are both true:

No database object has a storage policy that specifies this label.
The labeled location is not the last available storage for the objects associated with it.

Privileges

Superuser

Examples

The following ALTER_LOCATION_LABEL statement applies across all cluster nodes the label SSD to the storage location /home/dbadmin/SSD/tables:

=> SELECT ALTER_LOCATION_LABEL('/home/dbadmin/SSD/tables','', 'SSD');
          ALTER_LOCATION_LABEL
---------------------------------------
 /home/dbadmin/SSD/tables label changed.
(1 row)

15.2 - ALTER_LOCATION_USE

Alters the type of data that a storage location holds.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ALTER_LOCATION_USE ( 'path' , '[node]' , 'usage' )

Arguments

path

Where the storage location is mounted.

node

The Vertica node on which to alter the storage location. To alter the location on all cluster nodes in a single transaction, use an empty string (''). If the usage is SHARED TEMP or SHARED USER, you must alter it on all nodes.

usage

One of the following:

DATA: The storage location stores only data files.
TEMP: The location stores only temporary files that are created during loads or queries.
DATA,TEMP: The location can store both types of files.

Privileges

Superuser

Restrictions

You cannot change a storage location from a USER usage type if you created the location that way, or to a USER type if you did not. You can change a USER storage location to specify DATA (storing TEMP files is not supported). However, doing so does not affect the primary objective of a USER storage location, to be accessible by non-dbadmin users with assigned privileges.

You cannot change a storage location from SHARED TEMP or SHARED USER to SHARED DATA or the reverse.

Monitoring storage locations

For information about the disk storage used on each node, query the DISK_STORAGE system table.

Examples

The following example alters a storage location across all cluster nodes to store only data:

=> SELECT ALTER_LOCATION_USE ('/thirdSL/' , '' , 'DATA');

15.3 - CLEAR_CACHES

Clears the Vertica internal cache files.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_CACHES ( )

Privileges

Superuser

Notes

If you want to run benchmark tests for your queries, in addition to clearing the internal Vertica cache files, clear the Linux file system cache. The kernel uses unallocated memory as a cache to hold clean disk blocks. If you are running version 2.6.16 or later of Linux and you have root access, you can clear the kernel file system cache as follows:

Make sure that all data in the cache is written to disk:
```
# sync
```
Writing to the drop_caches file causes the kernel to drop clean caches, entries, and inodes from memory, causing that memory to become free, as follows:
- To clear the page cache:
```
# echo 1 > /proc/sys/vm/drop_caches
```
- To clear the entries and inodes:
```
# echo 2 > /proc/sys/vm/drop_caches
```
- To clear the page cache, entries, and inodes:
```
# echo 3 > /proc/sys/vm/drop_caches
```

Examples

The following example clears the Vertica internal cache files:

=> SELECT CLEAR_CACHES();
 CLEAR_CACHES
--------------
 Cleared
(1 row)

15.4 - CLEAR_OBJECT_STORAGE_POLICY

Removes a user-defined storage policy from the specified database, schema or table.

Removes a user-defined storage policy from the specified database, schema or table. Storage containers at the previous policy's labeled location are moved to the default location. By default, this move occurs after all pending mergeout tasks return.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

CLEAR_OBJECT_STORAGE_POLICY ( 'object-name' [,'key-min', 'key-max'] [, 'enforce-storage-move' ] )

Parameters

object-name

The object to clear, one of the following:

database: Clears database of its storage policy.
[database.]schema: Clears schema of its storage policy.
[[database.]schema.]table: Clears table of its storage policy. If table is in any schema other than public, you must supply the schema name.

In all cases, database must be the name of the current database.

key-min key-max

Valid only if object-name is a table, specifies the range of table partition key values stored at the labeled location.

enforce-storage-move

Specifies when the Tuple Mover moves all existing storage containers for the specified object to its default storage location:

false (default): Move storage containers only after all pending mergeout tasks return.
true: Immediately move all storage containers to the new location.

Tip

You can also enforce all storage policies immediately by calling Vertica meta-function ENFORCE_OBJECT_STORAGE_POLICY.

Privileges

Superuser

Examples

This following statement clears the storage policy for table store.store_orders_fact. The true argument specifies to implement the move immediately:

=> SELECT CLEAR_OBJECT_STORAGE_POLICY ('store.store_orders_fact', 'true');
                         CLEAR_OBJECT_STORAGE_POLICY
-----------------------------------------------------------------------------
 Object storage policy cleared.
Task: moving storages
(Table: store.store_orders_fact) (Projection: store.store_orders_fact_b0)
(Table: store.store_orders_fact) (Projection: store.store_orders_fact_b1)

(1 row)

15.5 - DO_TM_TASK

Runs a (TM) operation and commits current transactions.

Runs a Tuple Mover (TM) operation and commits current transactions. You can limit this operation to a specific table or projection. When started using this function, the TM uses the GENERAL resource pool instead of the TM resource pool.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DO_TM_TASK('task'[, '[[database.]schema.]{ table | projection}]' )

Parameters

task

Specifies one of the following tuple mover operations:

mergeout: Consolidates ROS containers and purges deleted records. For details, seeMergeout.
reshardmergeout: Realigns storage containers to the shard definitions created by a RESHARD_DATABASE call. Specify a table or projection and a range of partition values to limit the scope of the reshardmergeout operations.
analyze_row_count: Collects a minimal set of statistics and aggregate row counts for the specified projections, and saves it in the database catalog. Collects the number of rows in the specified projection. If you specify a table name, DO_TM_TASK returns the row counts for all projections of that table. For details, see Analyzing row counts.
update_storage_catalog (recommended only for Eon Mode): Updates the catalog with metadata on bundled table data. For details, see Writing bundle metadata to the catalog.

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

table|projection

Applies task to the specified table or projection. If you specify a projection and it is not found, DO_TM_TASK looks for a table with that name and, if found, applies the task to it and all projections associated with it.

If you specify no table or projection, the task is applied to all database tables and their projections.

Privileges

Schema: USAGE
Table: One of INSERT, UPDATE, or DELETE

Examples

The following example performs a mergeout on all projections in a table:

=> SELECT DO_TM_TASK('mergeout', 't1');

You can perform a re-shard mergeout task on a range of partitions of a table:

=> SELECT DO_TM_TASK('reshardmergeout', 'store_orders', '2001', '2005');

15.6 - DROP_LOCATION

Permanently removes a retired storage location.

Permanently removes a retired storage location. This operation cannot be undone. You must first retire a storage location with RETIRE_LOCATION before dropping it; you cannot drop a storage location that is in use.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DROP_LOCATION ( 'path', 'node' )

Arguments

path: Where the storage location to drop is mounted.
node: The Vertica node on which to drop the location. To perform this operation on all nodes, use an empty string (''). If the storage location is SHARED, you must perform this operation on all nodes.

Privileges

Superuser

Storage locations with temp and data files

If you use a storage location to store data and then alter it to store only temp files, the location can still contain data files. Vertica does not let you drop a storage location containing data files. You can use the MOVE_RETIRED_LOCATION_DATA function to manually merge out the data files from the storage location, or you can drop partitions. Deleting data files does not work.

Examples

The following example shows how to drop a previously retired storage location on v_vmart_node0003:

=> SELECT DROP_LOCATION('/data', 'v_vmart_node0003');

15.7 - ENFORCE_OBJECT_STORAGE_POLICY

Applies storage policies of the specified object immediately.

Enterprise Mode only

Applies storage policies of the specified object immediately. By default, the Tuple Mover enforces object storage policies after all pending mergeout operations are complete. Calling this function is equivalent to setting the enforce argument when using RETIRE_LOCATION. You typically use this function as the last step before dropping a storage location.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ENFORCE_OBJECT_STORAGE_POLICY ( 'object-name' [,'key-min', 'key-max'] )

Arguments

object-name

The database object whose storage policies are to be applied, one of the following:

database: Applies database storage policies.
[database.]schema: Applies schema storage policies.
[[database.]schema.]table: Applies table storage policies. If table is in any schema other than public, you must supply the schema name.

In all cases, database must be the name of the current database.

key-min, key-max

Valid only if object-name is a table, specifies the range of table partition key values on which to perform the move.

Privileges

One of the following:

Superuser
Object owner and access to its storage location.

Examples

Apply storage policy updates to the test table:

=> SELECT ENFORCE_OBJECT_STORAGE_POLICY ('test');

15.8 - MEASURE_LOCATION_PERFORMANCE

Measures a storage location's disk performance.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

MEASURE_LOCATION_PERFORMANCE ( 'path', 'node' )

Parameters

path: Specifies where the storage location to measure is mounted.
node: The Vertica node where the location to be measured is available. To obtain a list of all node names on the cluster, query system table DISK_STORAGE.

Privileges

Superuser

Notes

If you intend to create a tiered disk architecture in which projections, columns, and partitions are stored on different disks based on predicted or measured access patterns, you need to measure storage location performance for each location in which data is stored. You do not need to measure storage location performance for temp data storage locations because temporary files are stored based on available space.
The method of measuring storage location performance applies only to configured clusters. If you want to measure a disk before configuring a cluster see Measuring storage performance.
Storage location performance equates to the amount of time it takes to read and write 1MB of data from the disk. This time equates to:
```
IO-time = (time-to-read-write-1MB + time-to-seek) = (1/throughput + 1/latency)
```
Throughput is the average throughput of sequential reads/writes (units in MB per second).

Latency is for random reads only in seeks (units in seeks per second)

Note
The IO time of a faster storage location is less than a slower storage location.

Examples

The following example measures the performance of a storage location on v_vmartdb_node0004:

=> SELECT MEASURE_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/' , 'v_vmartdb_node0004');
WARNING:  measure_location_performance can take a long time. Please check logs for progress
           measure_location_performance
--------------------------------------------------
 Throughput : 122 MB/sec. Latency : 140 seeks/sec

15.9 - MOVE_RETIRED_LOCATION_DATA

Moves all data from the specified retired storage location or from all retired storage locations in the database.

Moves all data from the specified retired storage location or from all retired storage locations in the database. MOVE_RETIRED_LOCATION_DATA migrates the data to non-retired storage locations according to the storage policies of the objects whose data is stored in the location. This function returns only after it completes migration of all affected storage location data.

Note

The Tuple Mover migrates data of retired storage locations when it consolidates data into larger ROS containers.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

MOVE_RETIRED_LOCATION_DATA( ['location-path'] [, 'node'] )

Arguments

location-path: The path of the storage location as specified in the LOCATION_PATH column of system table STORAGE_LOCATIONS. This storage location must be marked as retired.
If you omit this argument, MOVE_RETIRED_LOCATION_DATA moves data from all retired storage locations.
node: The node on which to move data of the retired storage location. If location-path is undefined on node, this function returns an error.
If you omit this argument, MOVE_RETIRED_LOCATION_DATA moves data from*location-path* on all nodes.

Privileges

Superuser

Examples

Query system table STORAGE_LOCATIONS to show which storage locations are retired:

=> SELECT node_name, location_path, location_label, is_retired FROM STORAGE_LOCATIONS
   WHERE is_retired = 't';
    node_name     |    location_path     | location_label | is_retired
------------------+----------------------+----------------+------------
 v_vmart_node0001 | /home/dbadmin/SSDLoc | ssd            | t
 v_vmart_node0002 | /home/dbadmin/SSDLoc | ssd            | t
 v_vmart_node0003 | /home/dbadmin/SSDLoc | ssd            | t
(3 rows)

Query system table STORAGE_LOCATIONS for the location of the messages table, which is currently stored in retired storage location ssd:

=> SELECT node_name, total_row_count, location_label FROM STORAGE_CONTAINERS
   WHERE projection_name ILIKE 'messages%';
    node_name     | total_row_count | location_label
------------------+-----------------+----------------
 v_vmart_node0001 |          333514 | ssd
 v_vmart_node0001 |          333255 | ssd
 v_vmart_node0002 |          333255 | ssd
 v_vmart_node0002 |          333231 | ssd
 v_vmart_node0003 |          333231 | ssd
 v_vmart_node0003 |          333514 | ssd
(6 rows)

Call MOVE_RETIRED_LOCATION_DATA to move the data off the ssd storage location.

=> SELECT MOVE_RETIRED_LOCATION_DATA('/home/dbadmin/SSDLoc');
          MOVE_RETIRED_LOCATION_DATA
-----------------------------------------------
 Move data off retired storage locations done

(1 row)

Repeat the previous query to verify the storage location of the messages table:


=> SELECT node_name, total_row_count, storage_type, location_label FROM storage_containers
   WHERE projection_name ILIKE 'messages%';
    node_name     | total_row_count | location_label
------------------+-----------------+----------------
 v_vmart_node0001 |          333255 | base
 v_vmart_node0001 |          333514 | base
 v_vmart_node0003 |          333514 | base
 v_vmart_node0003 |          333231 | base
 v_vmart_node0002 |          333231 | base
 v_vmart_node0002 |          333255 | base
(6 rows)

15.10 - RESTORE_LOCATION

Restores a storage location that was previously retired with RETIRE_LOCATION.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RESTORE_LOCATION ( 'path', 'node' )

Arguments

path: Where to mount the retired storage location.
node: The Vertica node on which to restore the location. To perform this operation on all nodes, use an empty string (''). If the storage location is SHARED, you must perform this operation on all nodes.
The operation fails if you dropped any locations.

Privileges

Superuser

Effects of restoring a previously retired location

After restoring a storage location, Vertica re-ranks all of the cluster storage locations. It uses the newly restored location to process queries as determined by its rank.

Monitoring storage locations

For information about the disk storage used on each node, query the DISK_STORAGE system table.

Examples

Restore a retired storage location on node4:

=> SELECT RESTORE_LOCATION ('/thirdSL/' , 'v_vmartdb_node0004');

15.11 - RETIRE_LOCATION

Deactivates the specified storage location.

Deactivates the specified storage location. To obtain a list of all existing storage locations, query the STORAGE_LOCATIONS system table.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

RETIRE_LOCATION ( 'path', 'node' [, enforce ] )

Arguments

path: Where the storage location to retire is mounted.
node: The Vertica node on which to retire the location. To perform this operation on all nodes, use an empty string (''). If the storage location is SHARED, you must perform this operation on all nodes.
enforce: If true, the location label is set to an empty string and the data is moved elsewhere. The location can then be dropped without errors or warnings. Use this argument to expedite dropping a location.

Privileges

Superuser

Effects of retiring a storage location

RETIRE_LOCATION checks that the location is not the only storage for data and temp files. At least one location must exist on each node to store data and temp files. However, you can store both sorts of files in either the same location or separate locations.

If a location is the last available storage for its associated objects, you can retire it only if you set enforce to true.

When you retire a storage location:

No new data is stored at the retired location, unless you first restore it using RESTORE_LOCATION.
By default, if the storage location being retired contains stored data, the data is not moved. Thus, you cannot drop the storage location. Instead, Vertica removes the stored data through one or more mergeouts. To drop the location immediately after retiring it, set enforce to true.
If the storage location being retired is used only for temp files or you use enforce, you can drop the location. See Dropping storage locations and DROP_LOCATION.

Monitoring storage locations

For information about the disk storage used on each node, query the DISK_STORAGE system table.

Examples

The following examples show two approaches to retiring a storage location.

You can retire a storage location and its data will be moved out automatically at a future time:

=> SELECT RETIRE_LOCATION ('/data' , 'v_vmartdb_node0004');

You can specify that data in the storage location be moved immediately, so that you can then drop the location without waiting:

=> SELECT RETIRE_LOCATION ('/data' , 'v_vmartdb_node0004', true);

15.12 - SET_LOCATION_PERFORMANCE

Sets disk performance for a storage location.

Note

Before calling this function, call MEASURE_LOCATION_PERFORMANCE to obtain the location's throughput and average latency .

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_LOCATION_PERFORMANCE ( 'path', 'node' , 'throughput', 'average-latency')

Parameters

path: Specifies where the storage location to set is mounted.
node: Specifies the Vertica node where the location to set is available.
throughput: Specifies the throughput for the location, set to a value ≥1.
average-latency: Specifies the average latency for the location, set to a value ≥1.

Privileges

Superuser

Examples

The following example sets the performance of a storage location on node2 to a throughput of 122 megabytes per second and a latency of 140 seeks per second.

=> SELECT SET_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/','node2','122','140');

15.13 - SET_OBJECT_STORAGE_POLICY

Creates or changes the storage policy of a database object by assigning it a labeled storage location.

Creates or changes the storage policy of a database object by assigning it a labeled storage location. The Tuple Mover uses this location to store new and existing data for this object. If the object already has an active storage policy, calling SET_OBJECT_STORAGE_POLICY sets this object's default storage to the new labeled location. Existing data for the object is moved to the new location.

Note

You cannot create a storage policy on a USER type storage location.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SET_OBJECT_STORAGE_POLICY (
  '[[database.]schema.]object-name', 'location-label'
   [,'key-min', 'key-max'] [, 'enforce-storage-move' ] )

Parameters

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

object-name

Identifies the database object assigned to a labeled storage location. The object-name can resolve to a database, schema, or table.

location-label

The label of object-name's storage location.

key-min key-max

Valid only if object-name is a table, specifies the range of table partition key values to store at the labeled location.

enforce-storage-move

Specifies when the Tuple Mover moves all existing storage containers for object-name to the labeled storage location:

false (default): Move storage containers only after all pending mergeout tasks return.
true: Immediately move all storage containers to the new location.

Tip

You can also enforce all storage policies immediately by calling Vertica meta-function ENFORCE_OBJECT_STORAGE_POLICY

Privileges

One of the following:

Superuser
Object owner and access to its storage location.

Examples

See Clearing storage policies

16 - Table functions

This section contains functions for managing tables and constraints.

See also the V_CATALOG.TABLE_CONSTRAINTS system table.

16.1 - ANALYZE_CONSTRAINTS

Analyzes and reports on constraint violations within the specified scope.

Analyzes and reports on constraint violations within the specified scope

You can enable automatic enforcement of primary key, unique key, and check constraints when INSERT, UPDATE, MERGE, or COPY statements execute. Alternatively, you can use ANALYZE_CONSTRAINTS to validate constraints after issuing these statements. Refer to Constraint enforcement for more information.

ANALYZE_CONSTRAINTS performs a lock in the same way that SELECT * FROM t1 holds a lock on table t1. See LOCKS for additional information.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ANALYZE_CONSTRAINTS ('[[[database.]schema.]table ]' [, 'column[,...]'] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Identifies the table to analyze. If you omit specifying a schema, Vertica uses the current schema search path. If set to an empty string, Vertica analyzes all tables in the current schema.
column: The column in table to analyze. You can specify multiple comma-delimited columns. Vertica narrows the scope of the analysis to the specified columns. If you omit specifying a column, Vertica analyzes all columns in table.

Privileges

Schema: USAGE
Table: SELECT

Detecting constraint violations during a load process

Vertica checks for constraint violations when queries are run, not when data is loaded. To detect constraint violations as part of the load process, use a COPY statement with the NO COMMIT option. By loading data without committing it, you can run a post-load check of your data using the ANALYZE_CONSTRAINTS function. If the function finds constraint violations, you can roll back the load because you have not committed it.

If ANALYZE_CONSTRAINTS finds violations, such as when you insert a duplicate value into a primary key, you can correct errors using the following functions. Effects last until the end of the session only:

Important

If a check constraint SQL expression evaluates to an unknown for a given row because a column within the expression contains a null, the row passes the constraint condition.

Return values

ANALYZE_CONSTRAINTS returns results in a structured set (see table below) that lists the schema name, table name, column name, constraint name, constraint type, and the column values that caused the violation.

If the result set is empty, then no constraint violations exist; for example:

> SELECT ANALYZE_CONSTRAINTS ('public.product_dimension', 'product_key');
Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
(0 rows)

The following result set shows a primary key violation, along with the value that caused the violation ('10'):

=> SELECT ANALYZE_CONSTRAINTS ('');
Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
store         t1           c1             pk_t1             PRIMARY           ('10')
(1 row)

The result set columns are described in further detail in the following table:

Column Name	Data Type	Description
`Schema Name`	VARCHAR	The name of the schema.
`Table Name`	VARCHAR	The name of the table, if specified.
`Column Names`	VARCHAR	A list of comma-delimited columns that contain constraints.
`Constraint Name`	VARCHAR	The given name of the primary key, foreign key, unique, check, or not null constraint, if specified.
`Constraint Type`	VARCHAR	Identified by one of the following strings: `PRIMARY KEY` `FOREIGN KEY` `UNIQUE` `CHECK` `NOT NULL`
`Column Values`	VARCHAR	Value of the constraint column, in the same order in which `Column Names` contains the value of that column in the violating row. When interpreted as SQL, the value of this column forms a list of values of the same type as the columns in `Column Names`; for example: `('1'),` `('1', 'z')`

Examples

See Detecting constraint violations.

16.2 - ANALYZE_CORRELATIONS

This function is deprecated and will be removed in a future release.

Deprecated

This function is deprecated and will be removed in a future release.

Analyzes the specified tables for pairs of columns that are strongly correlated. ANALYZE_CORRELATIONS stores the 20 pairs with the strongest correlation. ANALYZE_CORRELATIONS also analyzes statistics.

ANALYZE_CORRELATIONS analyzes only pairwise single-column correlations.

For example, state name and country name columns are strongly correlated because the city name usually, but perhaps not always, identifies the state name. The city of Conshohoken is uniquely associated with Pennsylvania, while the city of Boston exists in Georgia, Indiana, Kentucky, New York, Virginia, and Massachusetts. In this case, city name is strongly correlated with state name.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Stable

Syntax

ANALYZE_CORRELATIONS ('[[[database.]schema.]table ]' [, 'recalculate'] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: Identifies the table to analyze. If you omit specifying a schema, Vertica uses the current schema search path. If set to an empty string, Vertica analyzes all tables in the current schema.
recalculate: Boolean that specifies whether to analyze correlated columns that were previously analyzed.

Note
Column correlation analysis typically needs to be done only once.

Default:false

Privileges

One of the following:

Superuser
User with USAGE privilege on the design schema

Examples

In the following example, ANALYZE_CORRELATIONS analyzes column correlations for all tables in the public schema, even if they currently exist:

=> SELECT ANALYZE_CORRELATIONS ('public.*', 'true');
 ANALYZE_CORRELATIONS
----------------------
                    0
(1 row)

16.3 - COPY_TABLE

Copies one table to another.

Copies one table to another. This lightweight, in-memory function copies the DDL and all user-created projections from the source table. Projection statistics for the source table are also copied. Thus, the source and target tables initially have identical definitions and share the same storage.

Note

Although they share storage space, Vertica regards the tables as discrete objects for license capacity purposes. For example, a single-terabyte table and its copy initially consume only one TB of space. However, your Vertica license regards them as separate objects that consume two TB of space.

After the copy operation is complete, the source and copy tables are independent of each other, so you can perform DML operations on one table without impacting the other. These operations can increase the overall storage required for both tables.

Caution

If you create multiple copies of the same table concurrently, one or more of the copy operations is liable to fail. Instead, copy tables sequentially.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

COPY_TABLE (
    '[[database.]schema.]source-table',
    '[[database.]schema.]target-table'
)

Parameters

`[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
source-table: The source table to copy. Vertica copies all data from this table to the target table.
target-table: The target table of the source table. If the target table already exists, Vertica appends the source to the existing table.
If the table does not exist, Vertica creates a table from the source table's definition, by calling CREATE TABLE with LIKE and INCLUDING PROJECTIONS clause. The new table inherits ownership from the source table. For details, see Replicating a table.

Privileges

Non-superuser:

Source table: SELECT
Target schema/table (new): CREATE
Target table (existing): INSERT

Table attribute requirements

The following attributes of both tables must be identical:

Column definitions, including NULL/NOT NULL constraints
Segmentation
Partitioning expression
Number of projections
Projection sort order
Primary and unique key constraints. However, the key constraints do not have to be identically enabled.

Note
If the target table has primary or unique key constraints enabled and moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation. Enforcing constraints requires disk reads and can slow the copy process.
Number and definitions of text indices.
If the destination table already exists, the source and destination tables must have identical access policies.

Additionally, If access policies exist on the source table, the following must be true:

Access policies on both tables must be identical.
One of the following must be true:
- The executing user owns the source table.
- AccessPolicyManagementSuperuserOnly is set to true. See Managing access policies for details.

Table restrictions

The following restrictions apply to the source and target tables:

If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
If the source table contains a sequence, Vertica converts the sequence to an integer before copying it to the target table. If the target table contains IDENTITY or named sequence columns, Vertica cancels the copy and displays an error message.
The following tables cannot be used as sources or targets:
- Temporary tables
- Virtual tables
- System tables
- External tables

Examples

If you call COPY_TABLE and the target table does not exist, the function creates the table automatically. In the following example, COPY_TABLE creates the target table public.newtable. Vertica also copies all the constraints associated with the source table public.product_dimension except foreign key constraints:

=> SELECT COPY_TABLE ( 'public.product_dimension', 'public.newtable');
-[ RECORD 1 ]--------------------------------------------------
copy_table | Created table public.newtable.
Copied table public.product_dimension to public.newtable

16.4 - DISABLE_DUPLICATE_KEY_ERROR

Disables error messaging when Vertica finds duplicate primary or unique key values at run time (for use with key constraints that are not automatically enabled).

Disables error messaging when Vertica finds duplicate primary or unique key values at run time (for use with key constraints that are not automatically enabled). Queries execute as though no constraints are defined on the schema. Effects are session scoped.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DISABLE_DUPLICATE_KEY_ERROR();

Privileges

Superuser

Examples

When you call DISABLE_DUPLICATE_KEY_ERROR, Vertica issues warnings letting you know that duplicate values will be ignored, and incorrect results are possible. DISABLE_DUPLICATE_KEY_ERROR is for use only for key constraints that are not automatically enabled.

=> select DISABLE_DUPLICATE_KEY_ERROR();
WARNING 3152:  Duplicate values in columns marked as UNIQUE will now be ignored for the remainder of your session or until reenable_duplicate_key_error() is called
WARNING 3539:  Incorrect results are possible. Please contact Vertica Support if unsure
 disable_duplicate_key_error
------------------------------
 Duplicate key error disabled
(1 row)

16.5 - INFER_EXTERNAL_TABLE_DDL

This function is deprecated and will be removed in a future release.

Deprecated

This function is deprecated and will be removed in a future release. Instead, use INFER_TABLE_DDL.

Inspects a file in Parquet, ORC, or Avro format and returns a CREATE EXTERNAL TABLE AS COPY statement that can be used to read the file. This statement might be incomplete. It could also contain more columns or columns with longer names than what Vertica supports; this function does not enforce Vertica system limits. Always inspect the output and address any issues before using it to create a table.

This function supports partition columns for the Parquet, ORC, and Avro formats, inferred from the input path. Because partitioning is done through the directory structure, there might not be enough information to infer the type of partition columns. In this case, this function shows these columns with a data type of UNKNOWN and emits a warning.

The function handles most data types, including complex types. If an input type is not supported in Vertica, the function emits a warning.

By default, the function uses strong typing for complex types. You can instead treat the column as a flexible complex type by setting the vertica_type_for_complex_type parameter to LONG VARBINARY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

INFER_EXTERNAL_TABLE_DDL( path USING PARAMETERS param=value[,...] )

Arguments

path: Path to a file or directory. Any path that is valid for COPY and uses a file format supported by this function is valid.

Parameters

format: Input format (string), one of 'Parquet', 'ORC', or 'Avro'. This parameter is required.
table_name: The name of the external table to create. This parameter is required.
Do not include a schema name as part of the table name; use the table_schema parameter.
table_schema: The schema in which to create the external table. If omitted, the function does not include a schema in the output.
vertica_type_for_complex_type: Type used to represent all columns of complex types, if you do not want to expand them fully. The only supported value is LONG VARBINARY. For more information, see Flexible complex types.

Privileges

Non-superuser: READ privileges on the USER-accessible storage location.

Examples

In the following example, the input file contains data for a table with two integer columns. The table definition can be fully inferred, and you can use the returned SQL statement as-is.

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/orders/*.orc'
        USING PARAMETERS format = 'orc', table_name = 'orders');

                INFER_EXTERNAL_TABLE_DDL
--------------------------------------------------------------------------------------------------
create external table "orders" (
  "id" int,
  "quantity" int
) as copy from '/data/orders/*.orc' orc;
(1 row)

To create a table in a schema, use the table_schema parameter. Do not add it to the table name; the function treats it as a name with a period in it, not a schema.

The following example shows output with complex types. You can use the definition as-is or modify the VARCHAR sizes:

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/people/*.parquet'
        USING PARAMETERS format = 'parquet', table_name = 'employees');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                    INFER_EXTERNAL_TABLE_DDL
-------------------------------------------------------------------------
 create external table "employees"(
  "employeeID" int,
  "personal" Row(
    "name" varchar,
    "address" Row(
      "street" varchar,
      "city" varchar,
      "zipcode" int
    ),
    "taxID" int
  ),
  "department" varchar
 ) as copy from '/data/people/*.parquet' parquet;
(1 row)

In the following example, the input file contains a map in the "prods" column. You can read a map as an array of rows:

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/orders.parquet'
    USING PARAMETERS format='parquet', table_name='orders');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                INFER_EXTERNAL_TABLE_DDL
------------------------------------------------------------------------
 create external table "orders"(
  "orderkey" int,
  "custkey" int,
  "prods" Array[Row(
    "key" varchar,
    "value" numeric(12,2)
  )],
  "orderdate" date
 ) as copy from '/data/orders.parquet' parquet;
(1 row)

In the following example, the data is partitioned by region. The function was not able to infer the data type and reports UNKNOWN:

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/sales/*/*
        USING PARAMETERS format = 'parquet', table_name = 'sales');
WARNING 9262: This generated statement is incomplete because of one or more unknown column types.
Fix these data types before creating the table
                INFER_EXTERNAL_TABLE_DDL
------------------------------------------------------------------------
 create external table "sales"(
  "tx_id" int,
  "date" date,
  "region" UNKNOWN
) as copy from '/data/sales/*/*' PARTITION COLUMNS region parquet;
(1 row)

For VARCHAR and VARBINARY columns, this function does not specify a length. The Vertica default length for these types is 80 bytes. If the data values are longer, using this table definition unmodified could cause data to be truncated. Always review VARCHAR and VARBINARY columns to determine if you need to specify a length. This function emits a warning if the input file contains columns of these types:

WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80

16.6 - INFER_TABLE_DDL

Inspects a file in Parquet, ORC, JSON, or Avro format and returns a CREATE TABLE or CREATE EXTERNAL TABLE statement based on its contents.

The returned statement might be incomplete if the input data contains ambiguous or unknown data types. It could also contain more columns or columns with longer names than what Vertica supports; this function does not enforce Vertica system limits. Always inspect the output and address any issues before using it to create a table.

This function supports partition columns, inferred from the input path. Because partitioning is done through the directory structure, there might not be enough information to infer the type of partition columns. In this case, this function shows these columns with a data type of UNKNOWN and emits a warning.

The function handles most data types, including complex types. If an input type is not supported in Vertica, the function emits a warning.

For VARCHAR and VARBINARY columns, this function does not specify a length. The Vertica default length for these types is 80 bytes. If the data values are longer, using the returned table definition unmodified could cause data to be truncated. Always review VARCHAR and VARBINARY columns to determine if you need to specify a length. This function emits a warning if the input file contains columns of these types:

WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

INFER_TABLE_DDL( path USING PARAMETERS param=value[,...] )

Arguments

path: Path to a file or glob. Any path that is valid for COPY and uses a file format supported by this function is valid. For all formats except JSON, if a glob specifies more than one file, this function reads a single, arbitrarily-chosen file. For JSON, the function might read more than one file. See JSON.

Parameters

format: Input format (string), one of 'Parquet', 'ORC', 'Avro', or 'JSON'. This parameter is required.
table_name: The name of the table to create. This parameter is required.
Do not include a schema name as part of the table name; use the table_schema parameter.
table_schema: The schema in which to create the table. If omitted, the function does not include a schema in the output.
table_type: The type of table to create, either 'native' or 'external'.
Default: 'native'
with_copy_statement: For native tables, whether to include a COPY statement in addition to the CREATE TABLE statement.
Default: false
one_line_result: Whether to return the DDL as a single line instead of pretty-printing. The single-line format might be easier to copy into SQL scripts.
Default: false (pretty-print)
max_files: (JSON only.) Maximum number of files in path to inspect, if path is a glob. Use this parameter to increase the amount of data the function considers, for example if you suspect variation among files. Files are chosen arbitrarily from the glob. For details, see JSON.
Default: 1
max_candidates: (JSON only.) Number of candidate table definitions to show. The function generates only one candidate per file, so if you increase max_candidates, also increase max_files. For details, see JSON.
Default: 1

Privileges

Non-superuser: READ privileges on the USER-accessible storage location.

JSON

JSON, unlike the other supported formats, does not embed a schema in data files. This function infers JSON table DDL by instead inspecting the raw data. Because raw data can be ambiguous or inconsistent, the function takes a different approach for this format.

For each input file, the function iterates through records to develop a candidate table definition. A top-level field that appears in any record is included as a column, even if not all records use it. If the same field appears in the file with different types, the function chooses a type that is consistent with all observed occurrences.

Consider a file with data about restaurants:

{
    "name" : "Pizza House",
    "cuisine" : "Italian",
    "location_city" : [],
    "chain" : true,
    "hours" : [],
    "menu" : [{"item" : "cheese pizza", "price" : 7.99},
              {"item" : "spinach pizza", "price" : 8.99},
              {"item" : "garlic bread", "price" : 4.99}]
}
{
    "name" : "Sushi World",
    "cuisine" : "Asian",
    "location_city" : ["Pittsburgh"],
    "chain" : false,
    "menu" : [{"item" : "maki platter", "price" : "21.95"},
              {"item" : "tuna roll", "price" : "4.95"}]
}

The first record contains two empty arrays, so there is not enough information to determine the element types. The second record has a string value for one of them, so the function can infer a type of VARCHAR for it. The other array element type remains unknown.

In the first record menu prices are numbers, but in the second they are strings. Both FLOAT and the string can be coerced to NUMERIC, so the function returns NUMERIC:

=> SELECT INFER_TABLE_DDL ('/data/restaurants.json'
    USING PARAMETERS table_name='restaurants', format='json');
WARNING 0:  This generated statement contains one or more varchar/varbinary types which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 Candidate matched 1 out of 1 total files:
  create table "restaurants"(
    "chain" bool,
    "cuisine" varchar,
    "hours" Array[UNKNWON],
    "location_city" Array[varchar],
    "menu" Array[Row(
      "item" varchar,
      "price" numeric
    )],
    "name" varchar
  );

(1 row)

All scalar types can be coerced to VARCHAR, so if a conflict cannot be resolved more specifically (as in the NUMERIC example), the function can still return a type. Complex types, however, cannot always be resolved in this way. In the following example, records in a file have conflicting definitions of the hours field:

{
    "name" : "Sushi World",
    "cuisine" : "Asian",
    "location_city" : ["Pittsburgh"],
    "chain" : false,
    "hours" : {"open" : "11:00", "close" : "22:00" }
}
{
    "name" : "Greasy Spoon",
    "cuisine" : "American",
    "location_city" : [],
    "chain" : "false",
    "hours" : {"open" : ["11:00","12:00"], "close" : ["21:00","22:00"] },
}

In the first record the value is a ROW with two TIME fields. In the second record the value is a ROW with two ARRAY[TIME] fields (representing weekday and weekend hours). These types are incompatible, so the function suggests a flexible complex type by using LONG VARBINARY:

=> SELECT INFER_TABLE_DDL ('/data/restaurants.json'
    USING PARAMETERS table_name='restaurants', format='json');
WARNING 0:  This generated statement contains one or more varchar/varbinary types which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 Candidate matched 1 out of 1 total files:
  create table "restaurants"(
    "chain" bool,
    "cuisine" varchar,
    "hours" long varbinary,
    "location_city" Array[varchar],
    "name" varchar
  );

(1 row)

If you call the function with a glob, by default it reads one file. Set max_files to a higher number to inspect more data. The function calculates one candidate table definition per file and returns the definition that covers the largest number of files.

Increasing the number of files does not, by itself, increase the number of candidates the function returns. With more files the function can consider more candidates, but by default it returns the single candidate that represents the largest number of files. To see more than one possible table definition, also set max_candidates. There is no benefit to setting max_candidates to a larger number than max_files.

In the following example, the glob contains two files that differ in the structure of the menu column. In the first file, the menu field has two fields:

{
    "name" : "Bob's pizzeria",
    "cuisine" : "Italian",
    "location_city" : ["Cambridge", "Pittsburgh"],
    "menu" : [{"item" : "cheese pizza", "price" : 8.25},
              {"item" : "spinach pizza", "price" : 10.50}]
}

In the second file, the menu has different offerings at different times of day:

{
    "name" : "Greasy Spoon",
    "cuisine" : "American",
    "location_city" : [],
    "menu" : [{"time" : "breakfast",
           "items" :
           [{"item" : "scrambled eggs", "price" : "3.99"}]
          },
          {"time" : "lunch",
           "items" :
           [{"item" : "grilled cheese", "price" : "3.95"},
        {"item" : "tuna melt", "price" : "5.95"},
        {"item" : "french fries", "price" : "1.99"}]}]
}

To see both candidates, raise both max_files and max_candidates:

=> SELECT INFER_TABLE_DDL ('/data/*.json'
    USING PARAMETERS table_name='restaurants', format='json',
max_files=3, max_candidates=3);
WARNING 0:  This generated statement contains one or more float types which might lose precision
WARNING 0:  This generated statement contains one or more varchar/varbinary types which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 Candidate matched 1 out of 2 total files:
  create table "restaurants"(
    "cuisine" varchar,
    "location_city" Array[varchar],
    "menu" Array[Row(
      "item" varchar,
      "price" float
    )],
    "name" varchar
  );
Candidate matched 1 out of 2 total files:
  create table "restaurants"(
    "cuisine" varchar,
    "location_city" Array[varchar],
    "menu" Array[Row(
      "items" Array[Row(
        "item" varchar,
        "price" numeric
      )],
      "time" varchar
    )],
    "name" varchar
  );

(1 row)

Examples

In the following example, the input path contains data for a table with two integer columns. The external table definition can be fully inferred, and you can use the returned SQL statement as-is. The function reads one file from the input path:

=> SELECT INFER_TABLE_DDL('/data/orders/*.orc'
    USING PARAMETERS format = 'orc', table_name = 'orders', table_type = 'external');

                INFER_TABLE_DDL
------------------------------------------------------------------------
create external table "orders" (
  "id" int,
  "quantity" int
) as copy from '/data/orders/*.orc' orc;
(1 row)

To create a table in a schema, use the table_schema parameter. Do not add it to the table name; the function treats it as a name with a period in it, not a schema.

The following example shows output with complex types. You can use the definition as-is or modify the VARCHAR sizes:

=> SELECT INFER_TABLE_DDL('/data/people/*.parquet'
    USING PARAMETERS format = 'parquet', table_name = 'employees');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 create table "employees"(
  "employeeID" int,
  "personal" Row(
    "name" varchar,
    "address" Row(
      "street" varchar,
      "city" varchar,
      "zipcode" int
    ),
    "taxID" int
  ),
  "department" varchar
 );
(1 row)

In the following example, the input file contains a map in the "prods" column. You can read a map as an array of rows:

=> SELECT INFER_TABLE_DDL('/data/orders.parquet'
    USING PARAMETERS format='parquet', table_name='orders');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                INFER_TABLE_DDL
------------------------------------------------------------------------
 create table "orders"(
  "orderkey" int,
  "custkey" int,
  "prods" Array[Row(
    "key" varchar,
    "value" numeric(12,2)
  )],
  "orderdate" date
 );
(1 row)

The following example returns the definition of a native table and the COPY statement, putting the table definition on a single line to simplify cutting and pasting into a script:

=> SELECT INFER_TABLE_DDL('/data/orders/*.orc'
    USING PARAMETERS format = 'orc', table_name = 'orders',
                         table_type = 'native', with_copy_statement = true, one_line_result=true);

                INFER_TABLE_DDL
-----------------------------------------------------------------------
create table "orders" ("id" int, "quantity" int);
copy "orders" from '/data/orders/*.orc' orc;
(1 row)

In the following example, the data is partitioned by region. The function was not able to infer the data type and reports UNKNOWN:

=> SELECT INFER_TABLE_DDL('/data/sales/*/*
    USING PARAMETERS format = 'orc', table_name = 'sales', table_type = 'external');
WARNING 9262:  This generated statement is incomplete because of one or more unknown column types. Fix these data types before creating the table
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 create external table "sales"(
  "orderkey" int,
  "custkey" int,
  "prodkey" Array[varchar],
  "orderprices" Array[numeric(12,2)],
  "orderdate" date,
  "region" UNKNOWN
 ) as copy from '/data/sales/*/*' PARTITION COLUMNS region orc;
(1 row)

In the following example, the function reads multiple JSON files and they differ in how they represent the menu column:

=> SELECT INFER_TABLE_DDL ('/data/*.json'
    USING PARAMETERS table_name='restaurants', format='json',
max_files=3, max_candidates=3);
WARNING 0:  This generated statement contains one or more float types which might lose precision
WARNING 0:  This generated statement contains one or more varchar/varbinary types which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 Candidate matched 1 out of 2 total files:
  create table "restaurants"(
    "cuisine" varchar,
    "location_city" Array[varchar],
    "menu" Array[Row(
      "item" varchar,
      "price" float
    )],
    "name" varchar
  );
Candidate matched 1 out of 2 total files:
  create table "restaurants"(
    "cuisine" varchar,
    "location_city" Array[varchar],
    "menu" Array[Row(
      "items" Array[Row(
        "item" varchar,
        "price" numeric
      )],
      "time" varchar
    )],
    "name" varchar
  );

(1 row)

16.7 - LAST_INSERT_ID

Returns the last value of an IDENTITY column.

Returns the last value of an IDENTITY column. If multiple sessions concurrently load the same table with an IDENTITY column, the function returns the last value generated for that column.

Note

This function works only with IDENTITY columns. It does not work with named sequences.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

LAST_INSERT_ID()

Privileges

Table owner
USAGE privileges on the table schema

Examples

See IDENTITY sequences.

16.8 - PURGE_TABLE

This function was formerly named PURGE_TABLE_PROJECTIONS().

Note

This function was formerly named PURGE_TABLE_PROJECTIONS(). Vertica still supports the former function name.

Permanently removes deleted data from physical storage so disk space can be reused. You can purge historical data up to and including the Ancient History Mark epoch.

Purges all projections of the specified table. You cannot use this function to purge temporary tables.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

PURGE_TABLE ( '[[database.]schema.]table' )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: The table to purge.

Privileges

Table owner
USAGE privilege on schema

Caution

PURGE_TABLE could temporarily take up significant disk space while the data is being purged.

Examples

The following example purges all projections for the store sales fact table located in the Vmart schema:

=> SELECT PURGE_TABLE('store.store_sales_fact');

16.9 - REBALANCE_TABLE

Synchronously rebalances data in the specified table.

A rebalance operation performs the following tasks:

Distributes data based on:
- User-defined fault groups, if specified
- Large cluster automatic fault groups
Redistributes database projection data across all nodes.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

REBALANCE_TABLE('[[database.]schema.]table-name')

Parameters

schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: The table to rebalance.

Privileges

Superuser

When to rebalance

Rebalancing is useful or even necessary after you perform the following tasks:

Mark one or more nodes as ephemeral in preparation of removing them from the cluster.
Add one or more nodes to the cluster so that Vertica can populate the empty nodes with data.
Change the scaling factor of an elastic cluster, which determines the number of storage containers used to store a projection across the database.
Set the control node size or realign control nodes on a large cluster layout
Add nodes to or remove nodes from a fault group.

Tip

Examples

The following command shows how to rebalance data on the specified table.

=> SELECT REBALANCE_TABLE('online_sales.online_sales_fact');
REBALANCE_TABLE
-------------------
 REBALANCED
(1 row)

16.10 - REENABLE_DUPLICATE_KEY_ERROR

Restores the default behavior of error reporting by reversing the effects of DISABLE_DUPLICATE_KEY_ERROR.

Restores the default behavior of error reporting by reversing the effects of DISABLE_DUPLICATE_KEY_ERROR. Effects are session-scoped.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

REENABLE_DUPLICATE_KEY_ERROR();

Privileges

Superuser

Examples

=> SELECT REENABLE_DUPLICATE_KEY_ERROR();
 REENABLE_DUPLICATE_KEY_ERROR
------------------------------
 Duplicate key error enabled
(1 row)