Vertica has functions to manage various aspects of database operation, such as sessions, privileges, projections, and the catalog.
This is the multi-page printable view of this section. Click here to print.
Management functions
- 1: Catalog functions
- 1.1: DROP_LICENSE
- 1.2: DUMP_CATALOG
- 1.3: EXPORT_CATALOG
- 1.4: EXPORT_OBJECTS
- 1.5: EXPORT_TABLES
- 1.6: INSTALL_LICENSE
- 1.7: MARK_DESIGN_KSAFE
- 1.8: RELOAD_ADMINTOOLS_CONF
- 2: Cloud functions
- 3: Cluster functions
- 3.1: CANCEL_REBALANCE_CLUSTER
- 3.2: DISABLE_LOCAL_SEGMENTS
- 3.3: ENABLE_ELASTIC_CLUSTER
- 3.4: ENABLE_LOCAL_SEGMENTS
- 3.5: REALIGN_CONTROL_NODES
- 3.6: REBALANCE_CLUSTER
- 3.7: RELOAD_SPREAD
- 3.8: SET_CONTROL_SET_SIZE
- 3.9: SET_SCALING_FACTOR
- 3.10: START_REBALANCE_CLUSTER
- 4: Data collector functions
- 4.1: CLEAR_DATA_COLLECTOR
- 4.2: DATA_COLLECTOR_HELP
- 4.3: FLUSH_DATA_COLLECTOR
- 4.4: GET_DATA_COLLECTOR_POLICY
- 4.5: SET_DATA_COLLECTOR_POLICY
- 4.6: SET_DATA_COLLECTOR_TIME_POLICY
- 5: Database functions
- 5.1: CLEAR_RESOURCE_REJECTIONS
- 5.2: COMPACT_STORAGE
- 5.3: DUMP_LOCKTABLE
- 5.4: DUMP_PARTITION_KEYS
- 5.5: GET_CONFIG_PARAMETER
- 5.6: KERBEROS_CONFIG_CHECK
- 5.7: MEMORY_TRIM
- 5.8: PURGE
- 5.9: RUN_INDEX_TOOL
- 5.10: SECURITY_CONFIG_CHECK
- 5.11: SET_CONFIG_PARAMETER
- 5.12: SET_SPREAD_OPTION
- 5.13: SHUTDOWN
- 6: Eon Mode functions
- 6.1: ALTER_LOCATION_SIZE
- 6.2: BACKGROUND_DEPOT_WARMING
- 6.3: CANCEL_DEPOT_WARMING
- 6.4: CANCEL_DRAIN_SUBCLUSTER
- 6.5: CLEAN_COMMUNAL_STORAGE
- 6.6: CLEAR_DATA_DEPOT
- 6.7: CLEAR_DEPOT_ANTI_PIN_POLICY_PARTITION
- 6.8: CLEAR_DEPOT_ANTI_PIN_POLICY_PROJECTION
- 6.9: CLEAR_DEPOT_ANTI_PIN_POLICY_TABLE
- 6.10: CLEAR_DEPOT_PIN_POLICY_PARTITION
- 6.11: CLEAR_DEPOT_PIN_POLICY_PROJECTION
- 6.12: CLEAR_DEPOT_PIN_POLICY_TABLE
- 6.13: CLEAR_FETCH_QUEUE
- 6.14: DEMOTE_SUBCLUSTER_TO_SECONDARY
- 6.15: FINISH_FETCHING_FILES
- 6.16: FLUSH_REAPER_QUEUE
- 6.17: MIGRATE_ENTERPRISE_TO_EON
- 6.18: PROMOTE_SUBCLUSTER_TO_PRIMARY
- 6.19: REBALANCE_SHARDS
- 6.20: RESHARD_DATABASE
- 6.21: SANDBOX_SUBCLUSTER
- 6.22: SET_DEPOT_ANTI_PIN_POLICY_PARTITION
- 6.23: SET_DEPOT_ANTI_PIN_POLICY_PROJECTION
- 6.24: SET_DEPOT_ANTI_PIN_POLICY_TABLE
- 6.25: SET_DEPOT_PIN_POLICY_PARTITION
- 6.26: SET_DEPOT_PIN_POLICY_PROJECTION
- 6.27: SET_DEPOT_PIN_POLICY_TABLE
- 6.28: SHUTDOWN_SUBCLUSTER
- 6.29: SHUTDOWN_WITH_DRAIN
- 6.30: START_DRAIN_SUBCLUSTER
- 6.31: START_REAPING_FILES
- 6.32: SYNC_CATALOG
- 6.33: UNSANDBOX_SUBCLUSTER
- 7: Epoch functions
- 7.1: ADVANCE_EPOCH
- 7.2: GET_AHM_EPOCH
- 7.3: GET_AHM_TIME
- 7.4: GET_CURRENT_EPOCH
- 7.5: GET_LAST_GOOD_EPOCH
- 7.6: MAKE_AHM_NOW
- 7.7: SET_AHM_EPOCH
- 7.8: SET_AHM_TIME
- 8: LDAP link functions
- 8.1: LDAP_LINK_DRYRUN_CONNECT
- 8.2: LDAP_LINK_DRYRUN_SEARCH
- 8.3: LDAP_LINK_DRYRUN_SYNC
- 8.4: LDAP_LINK_SYNC_CANCEL
- 8.5: LDAP_LINK_SYNC_START
- 9: License functions
- 9.1: AUDIT
- 9.2: AUDIT_FLEX
- 9.3: AUDIT_LICENSE_SIZE
- 9.4: AUDIT_LICENSE_TERM
- 9.5: DISPLAY_LICENSE
- 9.6: GET_AUDIT_TIME
- 9.7: GET_COMPLIANCE_STATUS
- 9.8: SET_AUDIT_TIME
- 10: Notifier functions
- 10.1: GET_DATA_COLLECTOR_NOTIFY_POLICY
- 10.2: NOTIFY
- 10.3: SET_DATA_COLLECTOR_NOTIFY_POLICY
- 11: Partition functions
- 11.1: CALENDAR_HIERARCHY_DAY
- 11.2: COPY_PARTITIONS_TO_TABLE
- 11.3: DROP_PARTITIONS
- 11.4: DUMP_PROJECTION_PARTITION_KEYS
- 11.5: DUMP_TABLE_PARTITION_KEYS
- 11.6: MOVE_PARTITIONS_TO_TABLE
- 11.7: PARTITION_PROJECTION
- 11.8: PARTITION_TABLE
- 11.9: PURGE_PARTITION
- 11.10: SWAP_PARTITIONS_BETWEEN_TABLES
- 12: Privileges and access functions
- 12.1: ENABLED_ROLE
- 12.2: GET_PRIVILEGES_DESCRIPTION
- 12.3: HAS_ROLE
- 12.4: RELEASE_SYSTEM_TABLES_ACCESS
- 12.5: RESTRICT_SYSTEM_TABLES_ACCESS
- 13: Projection functions
- 13.1: CLEAR_PROJECTION_REFRESHES
- 13.2: EVALUATE_DELETE_PERFORMANCE
- 13.3: GET_PROJECTION_SORT_ORDER
- 13.4: GET_PROJECTION_STATUS
- 13.5: GET_PROJECTIONS
- 13.6: PURGE_PROJECTION
- 13.7: REFRESH
- 13.8: REFRESH_COLUMNS
- 13.9: START_REFRESH
- 14: Session functions
- 14.1: CANCEL_REFRESH
- 14.2: CLOSE_ALL_SESSIONS
- 14.3: CLOSE_SESSION
- 14.4: CLOSE_USER_SESSIONS
- 14.5: GET_NUM_ACCEPTED_ROWS
- 14.6: GET_NUM_REJECTED_ROWS
- 14.7: INTERRUPT_STATEMENT
- 14.8: RELEASE_ALL_JVM_MEMORY
- 14.9: RELEASE_JVM_MEMORY
- 14.10: RESERVE_SESSION_RESOURCE
- 14.11: RESET_SESSION
- 15: Storage functions
- 15.1: ALTER_LOCATION_LABEL
- 15.2: ALTER_LOCATION_USE
- 15.3: CLEAR_CACHES
- 15.4: CLEAR_OBJECT_STORAGE_POLICY
- 15.5: DO_TM_TASK
- 15.6: DROP_LOCATION
- 15.7: ENFORCE_OBJECT_STORAGE_POLICY
- 15.8: MEASURE_LOCATION_PERFORMANCE
- 15.9: MOVE_RETIRED_LOCATION_DATA
- 15.10: RESTORE_LOCATION
- 15.11: RETIRE_LOCATION
- 15.12: SET_LOCATION_PERFORMANCE
- 15.13: SET_OBJECT_STORAGE_POLICY
- 16: Table functions
- 16.1: ANALYZE_CONSTRAINTS
- 16.2: ANALYZE_CORRELATIONS
- 16.3: COPY_TABLE
- 16.4: DISABLE_DUPLICATE_KEY_ERROR
- 16.5: INFER_EXTERNAL_TABLE_DDL
- 16.6: INFER_TABLE_DDL
- 16.7: LAST_INSERT_ID
- 16.8: PURGE_TABLE
- 16.9: REBALANCE_TABLE
- 16.10: REENABLE_DUPLICATE_KEY_ERROR
1 - Catalog functions
This section contains catalog management functions specific to Vertica.
1.1 - DROP_LICENSE
Drops a license key from the global catalog. Dropping expired keys is optional. Vertica automatically ignores expired license keys if a valid, alternative license key is installed.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DROP_LICENSE( 'license-name' )
Parameters
license-name
- The name of the license to drop. Use the name (or long license key) in the
NAME
column of system tableLICENSES
.
Privileges
Superuser
Examples
=> SELECT DROP_LICENSE('9b2d81e2-aab1-4cfb-bc07-fa9a696e8f5e');
See also
Managing licenses1.2 - DUMP_CATALOG
Returns an internal representation of the Vertica catalog. This function is used for diagnostic purposes.
DUMP_CATALOG
returns only the objects that are visible to the user.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DUMP_CATALOG()
Privileges
None
Examples
The following query obtains an internal representation of the Vertica catalog:
=> SELECT DUMP_CATALOG();
The output is written to the specified file:
\o /tmp/catalog.txt
SELECT DUMP_CATALOG();
\o
1.3 - EXPORT_CATALOG
Note
This function and EXPORT_OBJECTS return equivalent output.Generates a SQL script for recreating a physical schema design on another cluster.
The SQL script conforms to the following requirements:
-
Only includes objects to which the user has access.
-
Orders CREATE statements according to object dependencies so they can be recreated in the correct sequence. For example, if a table is in a non-PUBLIC schema, the required CREATE SCHEMA statement precedes the CREATE TABLE statement. Similarly, a table's CREATE ACCESS POLICY statement follows the table's CREATE TABLE statement.
-
If possible, creates projections with their KSAFE clause, if any, otherwise with their OFFSET clause.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
EXPORT_CATALOG ( [ '[destination]' [, 'scope'] ] )
Arguments
Note
If you omit all arguments, this function exports to standard output all objects to which you have access.destination
- Specifies where to send output, one of the following:
-
Empty string, writes the script to standard output.
-
Path and name of an SQL output file, valid only for superusers. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.
-
scope
- Determines what to export. Within the specified scope, EXPORT_CATALOG exports all the objects to which you have access:
-
DESIGN: Exports all catalog objects, including schemas, tables, constraints, views, access policies, projections, SQL macros, and stored procedures.
-
DESIGN_ALL: Deprecated.
-
TABLES: Exports all tables and their access policies. See also EXPORT_TABLES.
-
DIRECTED_QUERIES: Exports all directed queries that are stored in the database. For details, see Managing directed queries.
Default: DESIGN
-
Privileges
None
Examples
See also
1.4 - EXPORT_OBJECTS
Note
This function and EXPORT_CATALOG return equivalent output.Generates a SQL script you can use to recreate non-virtual catalog objects on another cluster.
The SQL script conforms to the following requirements:
-
Only includes objects to which the user has access.
-
Orders CREATE statements according to object dependencies so they can be recreated in the correct sequence. For example, if a table is in a non-PUBLIC schema, the required CREATE SCHEMA statement precedes the CREATE TABLE statement. Similarly, a table's CREATE ACCESS POLICY statement follows the table's CREATE TABLE statement.
-
If possible, creates projections with their KSAFE clause, if any, otherwise with their OFFSET clause.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
EXPORT_OBJECTS( ['[destination]' [, '[scope]'] [, 'mark-ksafe']] )
Parameters
Note
If you omit all parameters, this function exports to standard output all objects to which you have access.destination
- Specifies where to send output, one of the following:
-
An empty string (
''
) writes the script to standard output. -
The path and name of a SQL output file. This option is valid only for superusers. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.
-
scope
- Specifies one or more objects to export as a comma-delimited list:
{ [database.]schema[.object] | [[database.]schema.]object }[,...]
-
If set to an empty string, Vertica exports all objects to which the user has access.
-
If you specify a schema only, Vertica exports all objects in that schema.
-
If you specify a database, it must be the current database.
For stored procedures with the same name but different formal parameters, you can export all implementations by exporting its parent schema:
mydb.myschema
Specifying the types or both the names and types of a particular implementation's formal parameters exports that implementation:
mydb.myschema.my_procedure() -- no formal parameters mydb.myschema.my_procedure(int, int) -- formal parameter types (parameter names are optional)
-
mark-ksafe
- Boolean argument, specifies whether the generated script calls the Vertica function MARK_DESIGN_KSAFE . If set to true (default), MARK_DESIGN_KSAFE uses the correct K-safe argument for the current database.
Privileges
None
Examples
See Exporting objects.
See also
1.5 - EXPORT_TABLES
Generates a SQL script that can be used to recreate a logical schema—schemas, tables, constraints, and views—on another cluster. EXPORT_TABLES only exports objects to which the user has access.
The SQL script conforms to the following requirements:
-
Only includes objects to which the user has access.
-
Orders CREATE statements according to object dependencies so they can be recreated in the correct sequence. For example, if a table references a named sequence, a CREATE SEQUENCE statement precedes the CREATE TABLE statement. Similarly, a table's CREATE ACCESS POLICY statement follows the table's CREATE TABLE statement.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
EXPORT_TABLES( ['[destination]' [, '[scope]']] )
Note
If you omit all parameters, EXPORT_CATALOG exports to standard output all tables to which you have access.Parameters
destination
- Specifies where to send output, one of the following:
-
An empty string (
''
) writes the script to standard output. -
The path and name of a SQL output file. This option is valid only for superusers. If you specify a file that does not exist, the function creates one. If you specify only a file name, Vertica creates it in the catalog directory. If the file already exists, the function silently overwrites its contents.
-
scope
- Specifies one or more tables to export, as follows:
[database.]schema[.table][,...]
- If set to an empty string, Vertica exports all non-virtual table objects to which you have access, including table schemas, sequences, and constraints.
- If you specify a schema, Vertica exports all non-virtual table objects in that schema.
- If you specify a database, it must be the current database.
Privileges
None
Examples
See Exporting tables.
See also
1.6 - INSTALL_LICENSE
Installs the license key in the global catalog.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
INSTALL_LICENSE( 'filename' )
Parameters
filename
- The absolute path name of a valid license file.
Privileges
Superuser
Examples
=> SELECT INSTALL_LICENSE('/tmp/vlicense.dat');
See also
Managing licenses1.7 - MARK_DESIGN_KSAFE
Enables or disables high availability in your environment, in case of a failure. Before enabling recovery, MARK_DESIGN_KSAFE
queries the catalog to determine whether a cluster's physical schema design meets the following requirements:
-
Small, unsegmented tables are replicated on all nodes.
-
Large table superprojections are segmented with each segment on a different node.
-
Each large table projection has at least one buddy projection for K-safety=1 (or two buddy projections for K-safety=2).
Buddy projections are also segmented across database nodes, but the distribution is modified so segments that contain the same data are distributed to different nodes. See High availability with projections.
MARK_DESIGN_KSAFE
does not change the physical schema.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
MARK_DESIGN_KSAFE ( k )
Parameters
k
- Specifies the level of K-safety, one of the following:
-
2: Enables high availability if the schema design meets requirements for K-safety=2
-
1: Enables high availability if the schema design meets requirements for K-safety=1
-
0: Disables high availability
-
Privileges
Superuser
Return messages
If you specify a k
value of 1 or 2, Vertica returns one of the following messages.
Success:
Marked design n-safe
Failure:
The schema does not meet requirements for K=n.
Fact table projection projection-name
has insufficient "buddy" projections.
where n
is a K-safety setting.
Notes
-
The database's internal recovery state persists across database restarts but it is not checked at startup time.
-
When one node fails on a system marked K-safe=1, the remaining nodes are available for DML operations.
Examples
=> SELECT MARK_DESIGN_KSAFE(1);
mark_design_ksafe
----------------------
Marked design 1-safe
(1 row)
If the physical schema design is not K-safe, messages indicate which projections do not have a buddy:
=> SELECT MARK_DESIGN_KSAFE(1);
The given K value is not correct;
the schema is 0-safe
Projection pp1 has 0 buddies,
which is smaller that the given K of 1
Projection pp2 has 0 buddies,
which is smaller that the given K of 1
.
.
.
(1 row)
See also
1.8 - RELOAD_ADMINTOOLS_CONF
Updates the admintools.conf on each UP node in the cluster. Updates include:
-
IP addresses and catalog paths
-
Node names for all nodes in the current database
This function provides a manual method to instruct the server to update admintools.conf on all UP nodes. For example, if you restart a node, call this function to confirm its admintools.conf file is accurate.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RELOAD_ADMINTOOLS_CONF()
Privileges
SuperuserExamples
Update admintools.conf on each UP node in the cluster:
=> SELECT RELOAD_ADMINTOOLS_CONF();
RELOAD_ADMINTOOLS_CONF
--------------------------
admintools.conf reloaded
(1 row)
2 - Cloud functions
This section contains functions for managing cloud integrations. See also Hadoop functions for HDFS.
2.1 - AZURE_TOKEN_CACHE_CLEAR
Clears the cached access token for Azure. Call this function after changing the configuration of Azure managed identities.
An Azure object store can support and manage multiple identities. If multiple identities are in use, Vertica looks for an Azure tag with a key of VerticaManagedIdentityClientId, the value of which must be the client_id attribute of the managed identity to be used. If the Azure configuration changes, use this function to clear the cache.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
AZURE_TOKEN_CACHE_CLEAR ( )
Privileges
Superuser
3 - Cluster functions
This section contains functions that manage spread deployment on large, distributed database clusters and functions that control how the cluster organizes data for rebalancing.
3.1 - CANCEL_REBALANCE_CLUSTER
Stops any rebalance task that is currently in progress or is waiting to execute.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CANCEL_REBALANCE_CLUSTER()
Privileges
Superuser
Examples
=> SELECT CANCEL_REBALANCE_CLUSTER();
CANCEL_REBALANCE_CLUSTER
--------------------------
CANCELED
(1 row)
See also
3.2 - DISABLE_LOCAL_SEGMENTS
Disables local data segmentation, which breaks projections segments on nodes into containers that can be easily moved to other nodes. See Local data segmentation for details.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DISABLE_LOCAL_SEGMENTS()
Privileges
Superuser
Examples
=> SELECT DISABLE_LOCAL_SEGMENTS();
DISABLE_LOCAL_SEGMENTS
------------------------
DISABLED
(1 row)
3.3 - ENABLE_ELASTIC_CLUSTER
Enables elastic cluster scaling, which makes enlarging or reducing the size of your database cluster more efficient by segmenting a node's data into chunks that can be easily moved to other hosts.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ENABLE_ELASTIC_CLUSTER()
Privileges
Superuser
Examples
=> SELECT ENABLE_ELASTIC_CLUSTER();
ENABLE_ELASTIC_CLUSTER
------------------------
ENABLED
(1 row)
3.4 - ENABLE_LOCAL_SEGMENTS
Enables local storage segmentation, which breaks projections segments on nodes into containers that can be easily moved to other nodes. See Local data segmentation for more information.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ENABLE_LOCAL_SEGMENTS()
Privileges
Superuser
Examples
=> SELECT ENABLE_LOCAL_SEGMENTS();
ENABLE_LOCAL_SEGMENTS
-----------------------
ENABLED
(1 row)
3.5 - REALIGN_CONTROL_NODES
Causes Vertica to re-evaluate which nodes in the cluster or subcluster are control nodes and which nodes are assigned to them as dependents when large cluster is enabled. Call this function after altering fault groups in an Enterprise Mode database, or changing the number of control nodes in either database mode. After calling this function, query the
V_CATALOG.CLUSTER_LAYOUT
system table to see the proposed new layout for nodes in the cluster. You must also take additional steps before the new control node assignments take effect. See Changing the number of control nodes and realigning for details.
Note
In Vertica versions prior to 10.0.1, control node assignments weren't restricted to be within the same Eon Mode subcluster. If you attempt to realign control nodes in a subcluster whose control nodes have dependents in other subclusters, this function returns an error. In this case, you must realign the control nodes in those other subclusters first. Realigning the other subclusters fixes the cross-subcluster dependencies, allowing you to realign the control nodes in the original subcluster you attempted to realign.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
In Enterprise Mode:
REALIGN_CONTROL_NODES()
In Eon Mode:
REALIGN_CONTROL_NODES('subcluster_name')
Parameters
subcluster_name
- The name of the subcluster where you want to realign control nodes. Only the nodes in this subcluster are affected. Other subclusters are unaffected. Only allowed when the database is running in Eon Mode.
Privileges
SuperuserExamples
In an Enterprise Mode database, choose control nodes from all nodes and assign the remaining nodes to a control node:
=> SELECT REALIGN_CONTROL_NODES();
In an Eon Mode database, re-evaluate the control node assignments in the subcluster named analytics:
=> SELECT REALIGN_CONTROL_NODES('analytics');
See also
3.6 - REBALANCE_CLUSTER
Rebalances the database cluster synchronously as a session foreground task. REBALANCE_CLUSTER returns only after the rebalance operation is complete. If the current session ends, the operation immediately aborts. To rebalance the cluster as a background task, call START_REBALANCE_CLUSTER.
On large cluster arrangements, you typically call REBALANCE_CLUSTER in a flow (see Changing the number of control nodes and realigning). After you change the number and distribution of control nodes (spread hosts), run REBALANCE_CLUSTER to achieve fault tolerance.
For detailed information about rebalancing tasks, see Rebalancing data across nodes.
Tip
By default, before performing a rebalance, Vertica queries system tables to compute the size of all projections involved in the rebalance task. This query can add significant overhead to the rebalance operation. To disable this query, set projection configuration parameter RebalanceQueryStorageContainers to 0.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
REBALANCE_CLUSTER()
Privileges
Superuser
Examples
=> SELECT REBALANCE_CLUSTER();
REBALANCE_CLUSTER
-------------------
REBALANCED
(1 row)
3.7 - RELOAD_SPREAD
Updates cluster changes to the catalog's Spread configuration file. These changes include:
-
New or realigned control nodes
-
New Spread hosts or fault group
-
New or dropped cluster nodes
This function is often used in a multi-step process for large and elastic cluster arrangements. Calling it might require you to restart the database. You must then rebalance the cluster to realize fault tolerance. For details, see Defining and Realigning Control Nodes.
Caution
In an Eon Mode database, using this function could result in the database becoming read-only. Nodes may become disconnected after you call this function. If the database no longer has primary shard coverage without these nodes, it goes into read-only mode to maintain data integrity. Once the nodes rejoin the cluster, the database will resume normal operation. See Maintaining Shard Coverage.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RELOAD_SPREAD( true )
Parameters
true
- Updates cluster changes related to control message responsibilities to the Spread configuration file.
Privileges
SuperuserExamples
Update the cluster with changes to control messaging:
=> SELECT reload_spread(true);
reload_spread
---------------
reloaded
(1 row)
See also
REBALANCE_CLUSTER
3.8 - SET_CONTROL_SET_SIZE
Sets the number of control nodes that participate in the spread service when large cluster is enabled. If the database is running in Enterprise Mode, this function sets the number of control nodes for the entire database cluster. If the database is running in Eon Mode, this function sets the number of control nodes in the subcluster you specify. See Large cluster for more information.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
In Enterprise Mode:
SET_CONTROL_SET_SIZE( control_nodes )
In Eon Mode:
SET_CONTROL_SET_SIZE('subcluster_name', control_nodes )
Parameters
subcluster_name
- The name of the subcluster where you want to set the number of control nodes. Only allowed when the database is running in Eon Mode.
control_nodes
- The number of control nodes to assign to the cluster (when in Enterprise Mode) or subcluster (when in Eon Mode). Value can be one of the following:
-
Positive integer value: Vertica assigns the number of control nodes you specify to the cluster or subcluster. This value can be larger than the current node count. This value cannot be larger than 120 (the maximum number of control nodes for a database). In Eon Mode, the total of this value plus the number of control nodes set for all other subclusters cannot be more than 120.
-
-1
: Makes every node in the cluster or subcluster into control nodes. This value effectively disables large cluster for the cluster or subcluster.
-
Privileges
SuperuserExamples
In an Enterprise Mode database, set the number of control nodes for the entire cluster to 5:
=> SELECT set_control_set_size(5);
SET_CONTROL_SET_SIZE
----------------------
Control size set
(1 row)
See also
3.9 - SET_SCALING_FACTOR
Sets the scaling factor that determines the number of storage containers used when rebalancing the database and when using local data segmentation is enabled. See Cluster Scaling for details.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_SCALING_FACTOR( factor )
Parameters
factor
- An integer value between 1 and 32. Vertica uses this value to calculate the number of storage containers each projection is broken into when rebalancing or when local data segmentation is enabled.
Privileges
Superuser
Best practices
The scaling factor determines the number of storage containers that Vertica uses to store each projection across the database during rebalancing when local segmentation is enabled. When setting the scaling factor, follow these guidelines:
-
The number of storage containers should be greater than or equal to the number of partitions multiplied by the number of local segments:
num-storage-containers
>= (
num-partitions
*
num-local-segments
) -
Set the scaling factor high enough so rebalance can transfer local segments to satisfy the skew threshold, but small enough so the number of storage containers does not result in too many ROS containers, and cause ROS pushback. The maximum number of ROS containers (by default 1024) is set by configuration parameter ContainersPerProjectionLimit.
Examples
=> SELECT SET_SCALING_FACTOR(12);
SET_SCALING_FACTOR
--------------------
SET
(1 row)
3.10 - START_REBALANCE_CLUSTER
Asynchronously rebalances the database cluster as a background task. This function returns immediately after the rebalancing operation is complete. Rebalancing persists until the operation is complete, even if you close the current session or the database shuts down. In the case of shutdown, rebalancing resumes after the cluster restarts. To stop the rebalance operation, call
CANCEL_REBALANCE_CLUSTER
.
For detailed information about rebalancing tasks, see Rebalancing data across nodes.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
START_REBALANCE_CLUSTER()
Privileges
Superuser
Examples
=> SELECT START_REBALANCE_CLUSTER();
START_REBALANCE_CLUSTER
-------------------------
REBALANCING
(1 row)
See also
REBALANCE_CLUSTER
4 - Data collector functions
The Vertica Data Collector is a utility that extends system table functionality by providing a framework for recording events. It gathers and retains monitoring information about your database cluster and makes that information available in system tables, requiring few configuration parameter tweaks, and having negligible impact on performance.
Collected data is stored on disk in the DataCollector
directory under the Vertica /catalog path. You can use the information the Data Collector retains to query the past state of system tables and extract aggregate information, as well as do the following:
-
See what actions users have taken
-
Locate performance bottlenecks
-
Identify potential improvements to Vertica configuration
Data Collector works in conjunction with an advisor tool called Workload Analyzer, which intelligently monitors the performance of SQL queries and workloads and recommends tuning actions based on observations of the actual workload history.
By default, Data Collector is on and retains information for all sessions. If performance issues arise, a superuser can disable Data Collector by setting set configuration parameter EnableDataCollector to 0.
4.1 - CLEAR_DATA_COLLECTOR
Clears all memory and disk records from Data Collector tables and logs, and resets collection statistics in system table DATA_COLLECTOR.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_DATA_COLLECTOR( [ 'component' ] )
Parameters
component
- Clears memory and disk records for the specified component. If you provide no argument, the function clears memory and disk records for all components.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component; component | description ----------------+------------------------------- DepotEvictions | Files evicted from the Depot DepotFetches | Files fetched to the Depot DepotUploads | Files Uploaded from the Depot (3 rows)
Privileges
Superuser
Examples
The following command clears memory and disk records for the ResourceAcquisitions component:
=> SELECT clear_data_collector('ResourceAcquisitions');
clear_data_collector
----------------------
CLEAR
(1 row)
The following command clears data collection for all components:
=> SELECT clear_data_collector();
clear_data_collector
----------------------
CLEAR
(1 row)
See also
Data collector utility4.2 - DATA_COLLECTOR_HELP
Returns online usage instructions about the Data Collector, the
DATA_COLLECTOR
system table, and the Data Collector control functions.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DATA_COLLECTOR_HELP()
Privileges
None
Returns
The DATA_COLLECTOR_HELP()
function returns the following information:
=> SELECT DATA_COLLECTOR_HELP();
-----------------------------------------------------------------------------
Usage Data Collector
The data collector retains history of important system activities.
This data can be used as a reference of what actions have been taken
by users, but it can also be used to locate performance bottlenecks,
or identify potential improvements to the Vertica configuration.
This data is queryable via Vertica system tables.
Acccess a list of data collector components, and some statistics, by running:
SELECT * FROM v_monitor.data_collector;
The amount of data retained by size and time can be controlled with several
functions.
To just set the size amount:
set_data_collector_policy(<component>,
<memory retention (KB)>,
<disk retention (KB)>);
To set both the size and time amounts (the smaller one will dominate):
set_data_collector_policy(<component>,
<memory retention (KB)>,
<disk retention (KB)>,
<interval>);
To set just the time amount:
set_data_collector_time_policy(<component>,
<interval>);
To set the time amount for all tables:
set_data_collector_time_policy(<interval>);
The current retention policy for a component can be queried with:
get_data_collector_policy(<component>);
Data on disk is kept in the "DataCollector" directory under the Vertica
\catalog path. This directory also contains instructions on how to load
the monitoring data into another Vertica database.
To move the data collector logs and instructions to other storage locations,
create labeled storage locations using add_location and then use:
set_data_collector_storage_location(<storage_label>);
Additional commands can be used to configure the data collection logs.
The log can be cleared with:
clear_data_collector([<optional component>]);
The log can be synchronized with the disk storage using:
flush_data_collector([<optional component>]);
See also
4.3 - FLUSH_DATA_COLLECTOR
Waits until memory logs are moved to disk and then flushes the Data Collector, synchronizing the log with disk storage.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
FLUSH_DATA_COLLECTOR( [ 'component' ] )
Parameters
component
- Flushes data for the specified component. If you omit this argument, the function flushes data for all components.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component; component | description ----------------+------------------------------- DepotEvictions | Files evicted from the Depot DepotFetches | Files fetched to the Depot DepotUploads | Files Uploaded from the Depot (3 rows)
Privileges
Superuser
Examples
The following command flushes the Data Collector for the ResourceAcquisitions component:
=> SELECT flush_data_collector('ResourceAcquisitions');
flush_data_collector
----------------------
FLUSH
(1 row)
The following command flushes data collection for all components:
=> SELECT flush_data_collector();
flush_data_collector
----------------------
FLUSH
(1 row)
See also
Data collector utility4.4 - GET_DATA_COLLECTOR_POLICY
Retrieves a brief statement about the retention policy for the specified component.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_DATA_COLLECTOR_POLICY( 'component' )
Parameters
component
- Returns the retention policy of the specified component.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component; component | description ----------------+------------------------------- DepotEvictions | Files evicted from the Depot DepotFetches | Files fetched to the Depot DepotUploads | Files Uploaded from the Depot (3 rows)
Privileges
None
Examples
The following query returns the history of all resource acquisitions by specifying the ResourceAcquisitions
component:
=> SELECT get_data_collector_policy('ResourceAcquisitions');
get_data_collector_policy
----------------------------------------------
1000KB kept in memory, 10000KB kept on disk.
(1 row)
See also
4.5 - SET_DATA_COLLECTOR_POLICY
Updates the following retention policy properties for the specified component:
-
MEMORY_BUFFER_SIZE_KB
-
DISK_SIZE_KB
-
INTERVAL_TIME
Before you change a retention policy, you can view its current settings by querying system table DATA_COLLECTOR or by calling meta-function GET_DATA_COLLECTOR_POLICY.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_DATA_COLLECTOR_POLICY('component', 'memory-buffer-size', 'disk-size' [,'interval-time'] )
Parameters
component
- Specifies the retention policy to update.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component; component | description ----------------+------------------------------- DepotEvictions | Files evicted from the Depot DepotFetches | Files fetched to the Depot DepotUploads | Files Uploaded from the Depot (3 rows)
memory-buffer-size
- Specifies in kilobytes the maximum amount of data that is buffered in memory before moving it to disk. The policy retention policy property MEMORY_BUFFER_SIZE_KB is set from this value.
Caution
If you set this parameter to 0, the function returns with a warning that the Data Collector cannot retain any data for this component in memory or on disk.Consider setting this parameter to a high value in the following cases:
-
Unusually high levels of data collection. If
memory-buffer-size
is set too low, the Data Collector might be unable to flush buffered data to disk fast enough to keep up with the activity level, which can lead to loss of in-memory data. -
Very large data collector records—for example, records with very long query strings. The Data Collector uses double-buffering, so it cannot retain in memory records that are more than 50 percent larger than
memory-buffer-size
.
-
disk-size
- Specifies in kilobytes the maximum disk space allocated for this component's Data Collector table. The policy retention policy property DISK_SIZE_KB is set from this value. If set to 0, the Data Collector retains only as much component data as it can buffer in memory, as specified by
memory-buffer-size
. interval-time
INTERVAL data type that specifies how long data of a given component is retained in that component's Data Collector table. The retention policy property INTERVAL_TIME is set from this value. If you set this parameter to a positive value, it also changes the policy property INTERVAL_SET to t (true).
For example, if you specify component TupleMoverEvents and set interval-time to an interval of two days (
'2 days'::interval
), the Data Collector tabledc_tuple_mover_events
retains records of Tuple Mover activity over the last 48 hours. Older Tuple Mover data are automatically dropped from this table.Note
Setting a component's policy's INTERVAL_TIME property has no effect on how much data storage the Data Collector retains on disk for that component. Maximum disk storage capacity is determined by the DISK_SIZE_KB property. Setting the INTERVAL_TIME property only affects how long data is retained by the component's Data Collector table. For details, see Configuring data retention policies.To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:
-
INTERVAL_SET: f
-
INTERVAL_TIME: 0
With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.
-
Privileges
Superuser
Examples
4.6 - SET_DATA_COLLECTOR_TIME_POLICY
Updates the retention policy property INTERVAL_TIME for the specified component. Calling this function has no effect on other properties of the same component. You can use this function to update the INTERVAL_TIME property of all component retention policies.
To set other retention policy properties, call SET_DATA_COLLECTOR_POLICY.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_DATA_COLLECTOR_TIME_POLICY( ['component',] 'interval-time' )
Parameters
component
- Specifies the retention policy to update. If you omit this argument, Vertica updates the retention policy of all Data Collector components.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component; component | description ----------------+------------------------------- DepotEvictions | Files evicted from the Depot DepotFetches | Files fetched to the Depot DepotUploads | Files Uploaded from the Depot (3 rows)
interval-time
INTERVAL data type that specifies how long data of a given component is retained in that component's Data Collector table. The retention policy property INTERVAL_TIME is set from this value. If you set this parameter to a positive value, it also changes the policy property INTERVAL_SET to t (true).
For example, if you specify component TupleMoverEvents and set interval-time to an interval of two days (
'2 days'::interval
), the Data Collector tabledc_tuple_mover_events
retains records of Tuple Mover activity over the last 48 hours. Older Tuple Mover data are automatically dropped from this table.Note
Setting a component's policy's INTERVAL_TIME property has no effect on how much data storage the Data Collector retains on disk for that component. Maximum disk storage capacity is determined by the DISK_SIZE_KB property. Setting the INTERVAL_TIME property only affects how long data is retained by the component's Data Collector table. For details, see Configuring data retention policies.To disable the INTERVAL_TIME policy property, set this parameter to a negative integer. Doing so reverts two retention policy properties to their default settings:
-
INTERVAL_SET: f
-
INTERVAL_TIME: 0
With these two properties thus set, the component's Data Collector table retains data on all component events until it reaches its maximum limit, as set by retention policy property DISK_SIZE_KB.
-
Privileges
Superuser
Examples
5 - Database functions
This section contains the database management functions specific to Vertica.
5.1 - CLEAR_RESOURCE_REJECTIONS
Clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables. Normally, these tables are only cleared during a node restart. This function lets you clear the tables whenever you need. For example, you might want to clear the system tables after you resolved a disk space issue that was causing disk resource rejections.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
ImmutableSyntax
CLEAR_RESOURCE_REJECTIONS();
Privileges
Superuser
Examples
The following command clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables:
=> SELECT clear_resource_rejections();
clear_resource_rejections
---------------------------
OK
(1 row)
See also
5.2 - COMPACT_STORAGE
Bundles existing data (.fdb
) and index (.pidx
) files into the .gt
file format. The .gt
format is enabled by default for data files created version 7.2 or later. If you upgrade a database from an earlier version, use COMPACT_STORAGE
to bundle storage files into the .gt
format. Your database can continue to operate with a mix of file storage formats.
If the settings you specify for COMPACT_STORAGE
vary from the limit specified in configuration parameter MaxBundleableROSSizeKB
, Vertica does not change the size of the automatically created bundles.
Note
Run this function during periods of low demand.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SELECT COMPACT_STORAGE ('[[[database.]schema.]object-name]', min-ros-filesize-kb, 'small-or-all-files', 'simulate');
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.object-name
- Specifies the table or projection to bundle. If set to an empty string, COMPACT_STORAGE evaluates the data of all projections in the database for bundling.
min-ros-filesize-kb
- Integer ≥ 1, specifies in kilobytes the minimum size of an independent ROS file. COMPACT_STORAGE bundles storage container ROS files below this size into a single file.
small-or-all-files
- One of the following:
-
small
: Bundles only files smaller than the limit specified inmin-ros-filesize-kb
-
all
: Bundles files smaller than the limit specified inmin-ros-filesize-kb
and bundles the.fdb
and.pidx
files for larger storage containers.
-
simulate
- Specifies whether to simulate the storage settings and produce a report describing the impact of those settings.
-
true
: Produces a report on the impact of the specified bundle settings without actually bundling storage files. -
false
: Performs the bundling as specified.
-
Privileges
SuperuserStorage and performance impact
Bundling reduces the number of files in your file system by at least fifty percent and improves the performance of file-intensive operations. Improved operations include backups, restores, and mergeout.
Vertica creates small files for the following reasons:
-
Tables contain hundreds of columns.
-
Partition ranges are small (partition by minute).
-
Local segmentation is enabled and your factor is set to a high value.
Examples
The following example describes the impact of bundling the table EMPLOYEES
:
=> SELECT COMPACT_STORAGE('employees', 1024,'small','true');
Task: compact_storage
On node v_vmart_node0001:
Projection Name :public.employees_b0 | selected_storage_containers :0 |
selected_files_to_compact :0 | files_after_compact : 0 | modified_storage_KB :0
On node v_vmart_node0002:
Projection Name :public.employees_b0 | selected_storage_containers :1 |
selected_files_to_compact :6 | files_after_compact : 1 | modified_storage_KB :0
On node v_vmart_node0003:
Projection Name :public.employees_b0 | selected_storage_containers :2 |
selected_files_to_compact :12 | files_after_compact : 2 | modified_storage_KB :0
On node v_vmart_node0001:
Projection Name :public.employees_b1 | selected_storage_containers :2 |
selected_files_to_compact :12 | files_after_compact : 2 | modified_storage_KB :0
On node v_vmart_node0002:
Projection Name :public.employees_b1 | selected_storage_containers :0 |
selected_files_to_compact :0 | files_after_compact : 0 | modified_storage_KB :0
On node v_vmart_node0003:
Projection Name :public.employees_b1 | selected_storage_containers :1 |
selected_files_to_compact :6 | files_after_compact : 1 | modified_storage_KB :0
Success
(1 row)
5.3 - DUMP_LOCKTABLE
Returns information about deadlocked clients and the resources they are waiting for.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DUMP_LOCKTABLE()
Privileges
None
Notes
Use DUMP_LOCKTABLE if Vertica becomes unresponsive:
-
Open an additional vsql connection.
-
Execute the query:
=> SELECT DUMP_LOCKTABLE();
The output is written to vsql. See Monitoring the Log Files.
You can also see who is connected using the following command:
=> SELECT * FROM SESSIONS;
Close all sessions using the following command:
=> SELECT CLOSE_ALL_SESSIONS();
Close a single session using the following command:
=> SELECT CLOSE_SESSION('session_id');
You get the session_id value from the V_MONITOR.SESSIONS system table.
See also
5.4 - DUMP_PARTITION_KEYS
Dumps the partition keys of all projections in the system.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DUMP_PARTITION_KEYS( )
Note
The ROS objects of partitioned tables without partition keys are ignored by the tuple mover and are not merged during automatic tuple mover operations.Privileges
User must have select privileges on the table or usage privileges on the schema.
Examples
=> SELECT DUMP_PARTITION_KEYS( );
Partition keys on node v_vmart_node0001
Projection 'states_b0'
Storage [ROS container]
No of partition keys: 1
Partition keys: NH
Storage [ROS container]
No of partition keys: 1
Partition keys: MA
Projection 'states_b1'
Storage [ROS container]
No of partition keys: 1
Partition keys: VT
Storage [ROS container]
No of partition keys: 1
Partition keys: ME
Storage [ROS container]
No of partition keys: 1
Partition keys: CT
See also
- DUMP_PROJECTION_PARTITION_KEYS
- DUMP_TABLE_PARTITION_KEYS
- PARTITION_PROJECTION
- PARTITION_TABLE
- PARTITIONS
-
Partitioning tables in the Administrator's Guide
5.5 - GET_CONFIG_PARAMETER
Gets the value of a configuration parameter at the specified level. If no value is set at that level, the function returns an empty row.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_CONFIG_PARAMETER( 'parameter-name' [, 'level' | NULL] )
Parameters
parameter-name
- Name of the configuration parameter value to get.
level
- Level at which to get
parameter-name
's setting, one of the following string values:-
user
: Current user -
session
: Current session -
node-name
: Name of database node
If
level
is omitted or set to NULL, GET_CONFIG_PARAMETER returns the database setting. -
Privileges
None
Examples
Get the AnalyzeRowCountInterval parameter at the database level:
=> SELECT GET_CONFIG_PARAMETER ('AnalyzeRowCountInterval');
GET_CONFIG_PARAMETER
----------------------
3600
Get the MaxSessionUDParameterSize parameter at the session level:
=> SELECT GET_CONFIG_PARAMETER ('MaxSessionUDParameterSize','session');
GET_CONFIG_PARAMETER
----------------------
2000
(1 row)
Get the UseDepotForReads parameter at the user level:
=> SELECT GET_CONFIG_PARAMETER ('UseDepotForReads', 'user');
GET_CONFIG_PARAMETER
----------------------
1
(1 row)
See also
5.6 - KERBEROS_CONFIG_CHECK
Tests the Kerberos configuration of a Vertica cluster. The function succeeds if it can kinit with both the keytab file and the current user's credential, and reports errors otherwise.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
KERBEROS_CONFIG_CHECK( )
Parameters
This function has no parameters.
Privileges
This function does not require privileges.
Examples
The following example shows the results when the Kerberos configuration is valid.
=> SELECT KERBEROS_CONFIG_CHECK();
kerberos_config_check
-----------------------------------------------------------------------------
ok: krb5 exists at [/etc/krb5.conf]
ok: Vertica Keytab file is set to [/etc/vertica.keytab]
ok: Vertica Keytab file exists at [/etc/vertica.keytab]
[INFO] KerberosCredentialCache [/tmp/vertica_D4/vertica450676899262134963.cc]
Kerberos configuration parameters set in the database
KerberosServiceName : [vertica]
KerberosHostname : [data.hadoop.com]
KerberosRealm : [EXAMPLE.COM]
KerberosKeytabFile : [/etc/vertica.keytab]
Vertica Principal: [vertica/data.hadoop.com@EXAMPLE.COM]
[OK] Vertica can kinit using keytab file
[OK] User [bob] has valid client authentication for kerberos principal [bob@EXAMPLE.COM]]
(1 row)
5.7 - MEMORY_TRIM
Calls glibc function
malloc_trim()
to reclaim free memory from malloc and return it to the operating system. Details on the trim operation are written to system table
MEMORY_EVENTS
.
Unless you turn off memory polling, Vertica automatically detects when glibc accumulates an excessive amount of free memory in its allocation arena. When this occurs, Vertica consolidates much of this memory and returns it to the operating system. Call this function if you disable memory polling and wish to reduce glibc-allocated memory manually.
For more information, see Memory trimming.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
MEMORY_TRIM()
Privileges
Superuser
Examples
=> SELECT memory_trim();
memory_trim
-----------------------------------------------------------------
Pre-RSS: [378822656] Post-RSS: [372129792] Benefit: [0.0176675]
(1 row)
5.8 - PURGE
Permanently removes delete vectors from ROS storage containers so disk space can be reused. PURGE
removes all historical data up to and including the Ancient History Mark epoch.
PURGE
does not delete temporary tables.
Caution
PURGE
can temporarily use significant disk space.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SELECT PURGE()
Privileges
-
Table owner
-
USAGE privilege on schema
Examples
After you delete data from a Vertica table, that data is marked for deletion. To see the data that is marked for deletion, query system table
DELETE_VECTORS
.
Run PURGE
to remove the delete vectors from ROS containers.
=> SELECT * FROM test1;
number
--------
3
12
33
87
43
99
(6 rows)
=> DELETE FROM test1 WHERE number > 50;
OUTPUT
--------
2
(1 row)
=> SELECT * FROM test1;
number
--------
43
3
12
33
(4 rows)
=> SELECT node_name, projection_name, deleted_row_count FROM DELETE_VECTORS;
node_name | projection_name | deleted_row_count
------------------+-----------------+-------------------
v_vmart_node0002 | test1_b1 | 1
v_vmart_node0001 | test1_b1 | 1
v_vmart_node0001 | test1_b0 | 1
v_vmart_node0003 | test1_b0 | 1
(4 rows)
=> SELECT PURGE();
...
(Table: public.test1) (Projection: public.test1_b0)
(Table: public.test1) (Projection: public.test1_b1)
...
(4 rows)
After the ancient history mark (AHM) advances:
=> SELECT * FROM DELETE_VECTORS;
(No rows)
See also
5.9 - RUN_INDEX_TOOL
Runs the Index tool on a Vertica database to perform one of these tasks:
-
Run a per-block cyclic redundancy check (CRC) on data storage to verify data integrity.
-
Check that the sort order in ROS containers is correct.
The function writes summary information about its operation to standard output; detailed information on results is logged in vertica.log
on the current node. For more about evaluating tool output, see:
You can also run the Index tool on a database that is down, from the Linux command line. For details, see CRC and sort order check.
Caution
Use this function only under guidance from Vertica Support.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RUN_INDEX_TOOL ( 'taskType', global, '[projFilter]' [, numThreads ] );
Parameters
taskType
- Specifies the operation to run, one of the following:
-
checkcrc
: Run a cyclic redundancy check (CRC) on each block of existing data storage to check the data integrity of ROS data blocks. -
checksort
: Evaluate each ROS row to determine whether it is sorted correctly. If ROS data is not sorted correctly in the projection's order, query results that rely on sorted data will be incorrect.
-
global
- Boolean, specifies whether to run the specified task on all nodes (true), or the current one (false).
projFilter
- Specifies the scope of the operation:
-
Empty string (
''
): Run the check on all projections. -
A string that specifies one or more projections as follows:
-
projection-name
: Run the check on this projection -
projection-prefix
*
: Run the check on all projections that begin with the stringprojection-prefix
.
-
-
numThreads
- An unsigned (positive) or signed (negative) integer that specifies the number of threads used to run this operation:
-
n
: Number of threads, ≥ 1 -
-
n
: Negative integer, denotes a fraction of all CPU cores as follows:num-cores / n
Thus,
-1
specifies all cores,-2
, half the cores,-3
, a third of all cores, and so on.Default: 1
-
Privileges
Superuser
Optimizing performance
You can optimize meta-function performance by setting two parameters:
-
projFilter
: Narrows the scope of the operation to one or more projections. -
numThreads
: Specifies the number of threads used to execute the function.
5.10 - SECURITY_CONFIG_CHECK
Returns the status of various security-related parameters. Use this function to verify completeness of your TLS configuration.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SECURITY_CONFIG_CHECK( 'db-component' )
Parameters
db-component
- The component to check. Currently,
NETWORK
is the only supported component.NETWORK
: Returns the status and parameters for spread encryption, internode TLS, and client-server TLS.
Examples
In this example, SECURITY_CONFIG_CHECK shows that spread encryption and data channel TLS are disabled because EncryptSpreadComm is disabled and the data_channel TLS Configuration is not configured.
Similarly, client-server TLS is disabled because the TLS Configuration "server" has a server certificate, but its TLSMODE is disabled. Setting TLSMODE to 'Enable' enables server mode client-server TLS. See TLS protocol for details.
=> SELECT SECURITY_CONFIG_CHECK('NETWORK');
SECURITY_CONFIG_CHECK
----------------------------------------------------------------------------------------------------------------------
Spread security details:
* EncryptSpreadComm = []
Spread encryption is disabled
It is NOT safe to set/change other security config parameters while spread is not encrypted!
Please set EncryptSpreadComm to enable spread encryption first
Data Channel security details:
TLS Configuration 'data_channel' TLSMODE is DISABLE
TLS on the data channel is disabled
Please set EncryptSpreadComm and configure TLS Configuration 'data_channel' to enable TLS on the data channel
Client-Server network security details:
* TLS Configuration 'server' TLSMODE is DISABLE
* TLS Configuration 'server' has a certificate set
Client-Server TLS is disabled
To enable Client-Server TLS set a certificate on TLS Configuration 'server' and/or set the tlsmode to 'ENABLE' or higher
(1 row)
See also
5.11 - SET_CONFIG_PARAMETER
Sets or clears a configuration parameter at the specified level.
Important
You can only use this function to set configuration parameters with string or integer values. To set configuration parameters that accept other data types, use the appropriate ALTER statement.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_CONFIG_PARAMETER( 'param-name', { param-value | NULL}, ['level'| NULL])
Arguments
param-name
- Name of the configuration parameter to set.
param-value
- Value to set for
param-name
, either a string or integer. If a string, enclose in single quotes; if an integer, single quotes are optional.To clear
param-name
at the specified level, set to NULL. level
- Level at which to set
param-name
, one of the following string values:-
user
: Current user. -
session
: Current session, overrides the database setting. -
node-name
: Name of database node, overrides session and database settings.
If
level
is omitted or set to NULL,param-name
is set at the database level. -
Note
Some parameters require restart for the value to take effect.Privileges
SuperuserExamples
Set the AnalyzeRowCountInterval parameter to 3600 at the database level:
=> SELECT SET_CONFIG_PARAMETER('AnalyzeRowCountInterval',3600);
SET_CONFIG_PARAMETER
----------------------------
Parameter set successfully
(1 row)
Note
You can achieve the same result with ALTER DATABASE:
ALTER DATABASE DEFAULT SET PARAMETER AnalyzeRowCountInterval = 3600;
Set the MaxSessionUDParameterSize parameter to 2000 at the session level.
=> SELECT SET_CONFIG_PARAMETER('MaxSessionUDParameterSize',2000,'SESSION');
SET_CONFIG_PARAMETER
----------------------------
Parameter set successfully
(1 row)
See also
5.12 - SET_SPREAD_OPTION
Changes spread daemon settings. This function is mainly used to set the timeout before spread assumes a node has gone down.
Note
Changing Spread settings with SET_SPREAD_OPTION has minor impact on your cluster as it pauses while the new settings are propagated across the cluster. Because of this delay, changes to the Spread timeout are not immediately visible in system tableSPREAD_STATE
.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_SPREAD_OPTION( option-name, option-value )
Parameters
option-name
- String containing the spread daemon setting to change.
Currently, this function supports only one option:
TokenTimeout
. This setting controls how long spread waits for a node to respond to a message before assuming it is lost. See Adjusting Spread Daemon timeouts for virtual environments for more information. option-value
- The new setting for
option-name
.
Examples
=> SELECT SET_SPREAD_OPTION( 'TokenTimeout', '35000');
NOTICE 9003: Spread has been notified about the change
SET_SPREAD_OPTION
--------------------------------------------------------
Spread option 'TokenTimeout' has been set to '35000'.
(1 row)
=> SELECT * FROM V_MONITOR.SPREAD_STATE;
node_name | token_timeout
------------------+---------------
v_vmart_node0001 | 35000
v_vmart_node0002 | 35000
v_vmart_node0003 | 35000
(3 rows);
See also
5.13 - SHUTDOWN
Shuts down a Vertica database. By default, the shutdown fails if any users are connected. You can check the status of the shutdown operation in the
vertica.log
file.
In Eon Mode, you can call SHUTDOWN_WITH_DRAIN to perform a graceful shutdown that drains client connections and then shuts down the database.
Tip
Before calling SHUTDOWN, you can close all current user connections and prevent further connection attempts as follows:
-
Temporarily set configuration parameter MaxClientSessions to 0.
-
Call CLOSE_ALL_SESSIONS to close all non-dbamin connections.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SHUTDOWN ( [ 'false' | 'true' ] )
Parameters
false
- Default, returns a message if users are connected and aborts the shutdown.
true
- Forces the database to shut down, disallowing further connections.
Privileges
Superuser
Examples
The following command attempts to shut down the database. Because users are connected, the command fails:
=> SELECT SHUTDOWN('false');
NOTICE: Cannot shut down while users are connected
SHUTDOWN
-----------------------------
Shutdown: aborting shutdown
(1 row)
See also
SESSIONS6 - Eon Mode functions
The following functions are meant to be used in Eon Mode.
6.1 - ALTER_LOCATION_SIZE
Eon Mode only
Resizes the depot on one node, all nodes in a subcluster, or all nodes in the database.
Important
Reducing the size of the depot is liable to increase contention over depot usage and require frequent evictions. This behavior can increase the number of queries and load operations that are routed to communal storage for processing, which can incur slower performance and increased access charges.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
ImmutableSyntax
ALTER_LOCATION_SIZE( 'location', '[target]', 'size')
Parameters
location
- Specifies the location to resize, one of the following:
-
depot
: Resizes the node's current depot. -
The depot's absolute path in the Linux filesystem. If you change the depot size on multiple nodes and specify a path, the path must be identical on all affected nodes . By default, this is not the case, as the node's name is typically this path. For example, the default depot path for node 1 in the
verticadb
database is/vertica/data/verticadb/v_verticadb_node0001_depot
.
-
target
- The node or nodes on which to change the depot, one of the following:
-
Node name: Resize the specified node.
-
Subcluster name: Resize depots of all nodes in the specified subcluster.
-
Empty string: Resize all depots in the database.
-
size
Valid only if the storage location usage type is set to
DEPOT
, specifies the maximum amount of disk space that the depot can allocate from the storage location's file system.You can specify
size
in two ways:-
integer
%
: Percentage of storage location disk size. -
integer
{K|M|G|T}
: Amount of storage location disk size in kilobytes, megabytes, gigabytes, or terabytes.
Important
The depot size cannot exceed 80 percent of the file system disk space where the depot is stored. If you specify a value that is too large, Vertica issues a warning and automatically changes the value to 80 percent of the file system size.-
Privileges
SuperuserExamples
Increase depot size on all nodes to 80 percent of file system:
=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
node_name | location_label | location_path | max_size | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 36059377664 | 70%
v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
(3 rows)
=> SELECT alter_location_size('depot', '','80%');
alter_location_size
---------------------
depotSize changed.
(1 row)
=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
node_name | location_label | location_path | max_size | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 41210717184 | 80%
v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
(3 rows)
Change the depot size to 75% of the filesystem size for all nodes in the analytics subcluster:
=> SELECT subcluster_name, subclusters.node_name, storage_locations.max_size, storage_locations.disk_percent FROM subclusters INNER JOIN storage_locations ON subclusters.node_name = storage_locations.node_name WHERE storage_locations.location_usage='DEPOT';
subcluster_name | node_name | max_size | disk_percent
--------------------+----------------------+----------------------------
default_subcluster | v_verticadb_node0001 | 25264737485 | 60%
default_subcluster | v_verticadb_node0002 | 25264737485 | 60%
default_subcluster | v_verticadb_node0003 | 25264737485 | 60%
analytics | v_verticadb_node0004 | 25264737485 | 60%
analytics | v_verticadb_node0005 | 25264737485 | 60%
analytics | v_verticadb_node0006 | 25264737485 | 60%
analytics | v_verticadb_node0007 | 25264737485 | 60%
analytics | v_verticadb_node0008 | 25264737485 | 60%
analytics | v_verticadb_node0009 | 25264737485 | 60%
(9 rows)
=> SELECT ALTER_LOCATION_SIZE('depot','analytics','75%');
ALTER_LOCATION_SIZE
---------------------
depotSize changed.
(1 row)
=> SELECT subcluster_name, subclusters.node_name, storage_locations.max_size, storage_locations.disk_percent FROM subclusters INNER JOIN storage_locations ON subclusters.node_name = storage_locations.node_name WHERE storage_locations.location_usage='DEPOT';
subcluster_name | node_name | max_size | disk_percent
--------------------+----------------------+----------------------------
default_subcluster | v_verticadb_node0001 | 25264737485 | 60%
default_subcluster | v_verticadb_node0002 | 25264737485 | 60%
default_subcluster | v_verticadb_node0003 | 25264737485 | 60%
analytics | v_verticadb_node0004 | 31580921856 | 75%
analytics | v_verticadb_node0005 | 31580921856 | 75%
analytics | v_verticadb_node0006 | 31580921856 | 75%
analytics | v_verticadb_node0007 | 31580921856 | 75%
analytics | v_verticadb_node0008 | 31580921856 | 75%
analytics | v_verticadb_node0009 | 31580921856 | 75%
(9 rows)
See also
Eon Mode architecture6.2 - BACKGROUND_DEPOT_WARMING
Eon Mode only
Deprecated
Vertica version 10.0.0 removes support for foreground depot warming. When enabled, depot warming always happens in the background. Because foreground depot warming no longer exists, this function serves no purpose and has been deprecated. Calling it has no effect.Forces a node that is warming its depot to start processing queries while continuing to warm its depot in the background. Depot warming only occurs when a node is joining the database and is activating its subscriptions. This function only has an effect if:
-
The database is running in Eon Mode.
-
The node is currently warming its depot.
-
The node is warming its depot from communal storage. This is the case when the UseCommunalStorageForBatchDepotWarming configuration parameter is set to the default value of 1. See Eon Mode parameters for more information about this parameter.
After calling this function, the node warms its depot in the background while taking part in queries.
This function has no effect on a node that is not warming its depot.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
BACKGROUND_DEPOT_WARMING('node-name' [, 'subscription-name'])
Arguments
node-name
- The name of the node that you want to warm its depot in the background.
subscription-name
- The name of a shard that the node subscribes to that you want the node to warm in the background. You can find the names of the shards a node subscribes to in the SHARD_NAME column of the NODE_SUBSCRIPTIONS system table.
Note
When you supply the name of a specific shard subscription to warm in the background, the node may not immediately begin processing queries. It continues to warm any other shard subscriptions in the foreground if they are not yet warm. The node does not begin taking part in queries until it finishes warming the other subscriptions.
Return value
A message indicating that the node's warming will continue in the background.
Privileges
The user must be a superuser .
Examples
The following example demonstrates having node 6 of the verticadb database warm its depot in the background:
=> SELECT BACKGROUND_DEPOT_WARMING('v_verticadb_node0006');
BACKGROUND_DEPOT_WARMING
----------------------------------------------------------------------------
Depot warming running in background. Check monitoring tables for progress.
(1 row)
See also
6.3 - CANCEL_DEPOT_WARMING
Eon Mode only
Cancels depot warming on a node. Depot warming only occurs when a node is joining the database and is activating its subscriptions. You can choose to cancel all warming on the node, or cancel the warming of a specific shard's subscription. The node finishes whatever data transfers it is currently carrying out to warm its depot and removes pending warming-related transfers from its queue. It keeps any data it has already loaded into its depot. If you cancel warming for a specific subscription, it stops warming its depot if all of its other subscriptions are warmed. If they aren't warmed, the node continues to warm those other subscriptions.
This function only has an effect if:
-
The database is running in Eon Mode.
-
The node is currently warming its depot.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CANCEL_DEPOT_WARMING('node-name' [, 'subscription-name'])
Arguments
'
node-name
'
- The name of the node whose depot warming you want canceled.
'
subscription-name
'
- The name of a shard that the node subscribes to that you want the node to stop warming. You can find the names of the shards a node subscribes to in the SHARD_NAME column of the NODE_SUBSCRIPTIONS system table.
Return value
Returns a message indicating warming has been canceled.
Privileges
The user must be a superuser.
Usage considerations
Canceling depot warming can negatively impact the performance of your queries. A node with a cold depot may have to retrieve much of its data from communal storage, which is slower than accessing the depot.
Examples
The following demonstrates canceling the depot warming taking place on node 7:
=> SELECT CANCEL_DEPOT_WARMING('v_verticadb_node0007');
CANCEL_DEPOT_WARMING
--------------------------
Depot warming cancelled.
(1 row)
See also
6.4 - CANCEL_DRAIN_SUBCLUSTER
Eon Mode only
Cancels the draining of a subcluster or subclusters. This function can cancel draining operations that were started by either START_DRAIN_SUBCLUSTER or the draining portion of the SHUTDOWN_WITH_DRAIN function. CANCEL_DRAIN_SUBCLUSTER marks all nodes in the designated subclusters as not draining. The previously draining nodes again accept new client connections and connections redirected from load-balancing.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CANCEL_DRAIN_SUBCLUSTER( 'subcluster-name' )
Arguments
subcluster-name
- Name of the subcluster whose draining operation to cancel. Enter an empty string to cancel the draining operation on all subclusters.
Privileges
Superuser
Examples
The following example demonstrates how to cancel a draining operation on a subcluster.
First, you can query the DRAINING_STATUS system table to view which subclusters are currently draining:
=> SELECT node_name, subcluster_name, is_draining FROM draining_status ORDER BY 1;
node_name | subcluster_name | is_draining
-------------------+--------------------+-------
verticadb_node0001 | default_subcluster | f
verticadb_node0002 | default_subcluster | f
verticadb_node0003 | default_subcluster | f
verticadb_node0004 | analytics | t
verticadb_node0005 | analytics | t
verticadb_node0006 | analytics | t
The following function call cancels the draining of the analytics
subcluster:
=> SELECT CANCEL_DRAIN_SUBCLUSTER('analytics');
CANCEL_DRAIN_SUBCLUSTER
--------------------------------------------------------
Targeted subcluster: 'analytics'
Action: CANCEL DRAIN
(1 row)
To confirm that the subcluster is no longer draining, you can again query the DRAINING_STATUS system table:
=> SELECT node_name, subcluster_name, is_draining FROM draining_status ORDER BY 1;
node_name | subcluster_name | is_draining
-------------------+--------------------+-------
verticadb_node0001 | default_subcluster | f
verticadb_node0002 | default_subcluster | f
verticadb_node0003 | default_subcluster | f
verticadb_node0004 | analytics | f
verticadb_node0005 | analytics | f
verticadb_node0006 | analytics | f
(6 rows)
See also
6.5 - CLEAN_COMMUNAL_STORAGE
Eon Mode only
Marks for deletion invalid data in communal storage, often data that leaked due to an event where Vertica cleanup mechanisms failed. Events that require calling this function include:
-
Node failure
-
Interrupted migration of an Enterprise database to Eon
-
Restoring objects from backup
Tip
It is generally good practice to call CLEAN_COMMUNAL_STORAGE soon after completing an Enterprise-to-Eon migration, and reviving the migrated Eon database.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAN_COMMUNAL_STORAGE ( ['actually-delete'] )
Parameters
actually-delete
- BOOLEAN, specifies whether to queue data files for deletion:
-
true
(default): Add files to the reaper queue and return immediately. The queued files are removed automatically by the reaper service, or can be removed manually by calling FLUSH_REAPER_QUEUE. -
false
: Report information about extra files but do not queue them for deletion.
-
Privileges
Superuser
Examples
=> SELECT CLEAN_COMMUNAL_STORAGE('true')
CLEAN_COMMUNAL_STORAGE
------------------------------------------------------------------
CLEAN COMMUNAL STORAGE
Task was canceled.
Total leaked files: 9265
Total size: 4236501526
Files have been queued for deletion.
Check communal_cleanup_records for more information.
(1 row)
6.6 - CLEAR_DATA_DEPOT
Eon Mode only
Deletes the specified depot data. You can clear depot data of a single table or all tables, from one subcluster, a single node, or the entire database cluster. Clearing depot data can incur extra processing time for any subsequent queries that require that data and must now fetch it from communal storage. Clearing depot data has no effect on communal storage.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_DATA_DEPOT( [ '[table-name]' [, '[target-depots]'] ] )
Arguments
To clear all depot data from the database cluster, call this function with no arguments.
table-name
- Name of the table to delete from the target depots. If you omit a table name or supply an empty string, data of all tables is deleted from the target depots.
target-depots
- The depots to clear, one of the following:
subcluster-name
: Name of the depot subcluster,default_subcluster
to specify the default database subcluster.node-name
: Clears depot data from the specified node. Depot data on other nodes in the same subcluster are unaffected.
This argument optionally qualifies the argument for
table-name
. If you omit this argument or supply an empty string, Vertica clears all depot data from the database cluster.
Privileges
Superuser
Examples
Clear the cached data of one table from the specified subcluster depot:
=> SELECT CLEAR_DATA_DEPOT('t1', 'subcluster_1');
clear_data_depot
------------------
Depot cleared
(1 row)
Clear all depot data that is cached on the specified subcluster:.
=> SELECT CLEAR_DATA_DEPOT('', 'subcluster_1');
clear_data_depot
------------------
Depot cleared
(1 row)
Clear all depot data that is cached on the specified node:
=> select clear_data_depot('','v_vmart_node0001');
clear_data_depot
------------------
Depot cleared
(1 row)
Clear all data of the specified table from the depots of all cluster nodes:
=> SELECT CLEAR_DATA_DEPOT('t1');
clear_data_depot
------------------
Depot cleared
(1 row)
Clear all depot data from the database cluster:
=> SELECT CLEAR_DATA_DEPOT();
clear_data_depot
------------------
Depot cleared
(1 row)
See also
Managing depot caching6.7 - CLEAR_DEPOT_ANTI_PIN_POLICY_PARTITION
Eon Mode only
Removes an anti-pinning policy from the specified partition.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_DEPOT_ANTI_PIN_POLICY_PARTITION( '[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, 'subcluster'] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.object-name
- Table or projection with a partition anti-pinning policy to clear.
min-range-value
,max-range-value
- Range of partition keys in
table
from which to clear an anti-pinning policy, wheremin‑range‑value
must be ≤max‑range‑value
. To specify a single partition key,min‑range‑value
andmax‑range‑value
must be equal. subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
Privileges
Superuser
See also
- SET_DEPOT_ANTI_PIN_POLICY_PARTITION
6.8 - CLEAR_DEPOT_ANTI_PIN_POLICY_PROJECTION
Eon Mode only
Removes an anti-pinning policy from the specified projection.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_DEPOT_ANTI_PIN_POLICY_PROJECTION( '[[database.]schema.]projection' [, 'subcluster' ] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.projection
- Projection with the anti-pinning policy to clear.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
Privileges
Superuser
See also
- SET_DEPOT_ANTI_PIN_POLICY_PROJECTION
6.9 - CLEAR_DEPOT_ANTI_PIN_POLICY_TABLE
Eon Mode only
Removes an anti-pinning policy from the specified table.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_DEPOT_PIN_POLICY_TABLE( '[[database.]schema.]table' [, 'subcluster' ] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
- Table with the anti-pinning policy to clear.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
Privileges
Superuser
See also
- SET_DEPOT_ANTI_PIN_POLICY_TABLE
6.10 - CLEAR_DEPOT_PIN_POLICY_PARTITION
Eon Mode only
Clears a depot pinning policy from the specified table or projection partitions. After the object is unpinned, it can be evicted from the depot by any unpinned or pinned object.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_DEPOT_PIN_POLICY_PARTITION( '[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, subcluster ] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.object-name
- Table or projection with a partition pinning policy to clear.
min-range-value
,max-range-value
- Range of partition keys in
table
from which to clear a pinning policy, wheremin‑range‑value
must be ≤max‑range‑value
. To specify a single partition key,min‑range‑value
andmax‑range‑value
must be equal. subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
Privileges
Superuser
See also
6.11 - CLEAR_DEPOT_PIN_POLICY_PROJECTION
Eon Mode only
Clears a depot pinning policy from the specified projection. After the object is unpinned, it can be evicted from the depot by any unpinned or pinned object.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_DEPOT_PIN_POLICY_PROJECTION( '[[database.]schema.]projection' [, 'subcluster' ] )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.projection
- Projection with a pinning policy to clear.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
Privileges
Superuser
See also
6.12 - CLEAR_DEPOT_PIN_POLICY_TABLE
Eon Mode only
Clears a depot pinning policy from the specified table. After the object is unpinned, it can be evicted from the depot by any unpinned or pinned object.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_DEPOT_PIN_POLICY_TABLE( '[[database.]schema.]table' [, 'subcluster' ] )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
- Table with a pinning policy to clear.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
Privileges
Superuser
See also
6.13 - CLEAR_FETCH_QUEUE
Eon Mode only
Removes all entries or entries for a specific transaction from the queue of fetch requests of data from the communal storage. You can view the fetch queue by querying the DEPOT_FETCH_QUEUE system table. This function removes all of the queued requests synchronously. It returns after all the fetches have been removed from the queue.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_FETCH_QUEUE([transaction_id])
Parameters
*
transaction_id*
- The id of the transaction whose fetches will be cleared from the queue. If this value is not specified, all fetches are removed from the fetch queue.
Examples
This example clears all of the queued fetches for all transactions.
=> SELECT CLEAR_FETCH_QUEUE();
CLEAR_FETCH_QUEUE
--------------------------
Cleared the fetch queue.
(1 row)
This example clears the fetch queue for a specific transaction.
=> SELECT node_name,transaction_id FROM depot_fetch_queue;
node_name | transaction_id
----------------------+-------------------
v_verticadb_node0001 | 45035996273719510
v_verticadb_node0003 | 45035996273719510
v_verticadb_node0002 | 45035996273719510
v_verticadb_node0001 | 45035996273719777
v_verticadb_node0003 | 45035996273719777
v_verticadb_node0002 | 45035996273719777
(6 rows)
=> SELECT clear_fetch_queue(45035996273719510);
clear_fetch_queue
--------------------------
Cleared the fetch queue.
(1 row)
=> SELECT node_name,transaction_id from depot_fetch_queue;
node_name | transaction_id
----------------------+-------------------
v_verticadb_node0001 | 45035996273719777
v_verticadb_node0003 | 45035996273719777
v_verticadb_node0002 | 45035996273719777
(3 rows)
6.14 - DEMOTE_SUBCLUSTER_TO_SECONDARY
Eon Mode only
Converts a primary subcluster to a secondary subcluster.
Vertica will not allow you to demote a primary subcluster if any of the following are true:
-
The subcluster contains a critical node.
-
The subcluster is the only primary subcluster in the database. You must have at least one primary subcluster.
-
The initiator node is a member of the subcluster you are trying to demote. You must call DEMOTE_SUBCLUSTER_TO_SECONDARY from another subcluster.
Important
This function call can take a long time to complete because all the nodes in the subcluster you are promoting or demoting take a global catalog lock, write a checkpoint, and then commit. This global catalog lock can cause other database tasks to fail with errors.
Schedule calls to this function when other database activity is low.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DEMOTE_SUBCLUSTER_TO_SECONDARY('subcluster-name')
Parameters
subcluster-name
- The name of the primary subcluster to demote to a secondary subcluster.
Privileges
Superuser
Examples
The following example demotes the subcluster analytics_cluster
to a secondary subcluster:
=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
subcluster_name | is_primary
-------------------+------------
analytics_cluster | t
load_subcluster | t
(2 rows)
=> SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('analytics_cluster');
DEMOTE_SUBCLUSTER_TO_SECONDARY
--------------------------------
DEMOTE SUBCLUSTER TO SECONDARY
(1 row)
=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
subcluster_name | is_primary
-------------------+------------
analytics_cluster | f
load_subcluster | t
(2 rows)
Attempting to demote the subcluster that contains the initiator node results in an error:
=> SELECT node_name FROM sessions WHERE user_name = 'dbadmin'
AND client_type = 'vsql';
node_name
----------------------
v_verticadb_node0004
(1 row)
=> SELECT node_name, is_primary FROM subclusters WHERE subcluster_name = 'analytics';
node_name | is_primary
----------------------+------------
v_verticadb_node0004 | t
v_verticadb_node0005 | t
v_verticadb_node0006 | t
(3 rows)
=> SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('analytics');
ERROR 9204: Cannot promote or demote subcluster including the initiator node
HINT: Run this command on another subcluster
See also
6.15 - FINISH_FETCHING_FILES
Eon Mode only
Fetches to the depot all files that are queued for download from communal storage.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
FINISH_FETCHING_FILES()
Privileges
Superuser
Examples
Get all files queued for download:
=> SELECT FINISH_FETCHING_FILES();
FINISH_FETCHING_FILES
---------------------------------
Finished fetching all the files
(1 row)
See also
Eon Mode concepts6.16 - FLUSH_REAPER_QUEUE
Eon Mode only
Deletes all data marked for deletion in the database. Use this function to remove all data marked for deletion before the reaper service deletes disk files.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
FLUSH_REAPER_QUEUE( [sync-catalog] )
Parameters
*
sync-catalog*
- Specifies to sync metadata in the database catalog on all nodes before the function executes:
-
true
(default): Sync the database catalog -
false
: Run without syncing.
-
Privileges
Superuser
Examples
Remove all files that are marked for deletion:
=> SELECT FLUSH_REAPER_QUEUE();
FLUSH_REAPER_QUEUE
-----------------------------------------------------
Sync'd catalog and deleted all files in the reaper queue.
(1 row)
See also
CLEAN_COMMUNAL_STORAGE6.17 - MIGRATE_ENTERPRISE_TO_EON
Enterprise Mode only
Migrates an Enterprise database to an Eon Mode database. MIGRATE_ENTERPRISE_TO_EON runs in the foreground; until it returns—either with success or an error—it blocks all operations in the same session on the source Enterprise database. If successful, MIGRATE_ENTERPRISE_TO_EON returns with a list of nodes in the migrated database.
If migration is interrupted before the meta-function returns—for example, the client disconnects, or a network outage occurs—the migration returns an error. In this case, call MIGRATE_ENTERPRISE_TO_EON again to restart migration. For details, see Handling Interrupted Migration.
You can repeat migration multiple times to the same communal storage location—for example, to capture changes that occurred in the source database during the previous migration. For details, see Repeating Migration.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
MIGRATE_ENTERPRISE_TO_EON ( 'communal-storage-location', 'depot-location' [, is-dry-run] )
communal-storage-location
- URI of communal storage location. For URI syntax examples for each supported schema, see File systems and object stores.
depot-location
- Path of Eon depot location, typically:
/vertica/depot
Important
Management Console requires this convention to enable access to depot data and activity.
is-dry-run
- Boolean. If set to true, MIGRATE_ENTERPRISE_TO_EON only checks whether the Enterprise source database complies with all migration prerequisites. If the meta-function discovers any compliance issues, it writes these to the migration error log
migrate_enterprise_to_eon_error.log
in the database directory.Default: false
Privileges
Superuser
Examples
Migrate an Enterprise database to Eon Mode on AWS:
=> SELECT MIGRATE_ENTERPRISE_TO_EON ('s3://verticadbbucket', '/vertica/depot');
migrate_enterprise_to_eon
---------------------------------------------------------------------
v_vmart_node0001,v_vmart_node0002,v_vmart_node0003,v_vmart_node0004
(1 row)
See also
Migrating an enterprise database to Eon Mode6.18 - PROMOTE_SUBCLUSTER_TO_PRIMARY
Eon Mode only
Converts a secondary subcluster to a primary subcluster. You cannot use this function to promote the subcluster that contains the initiator node. You must call it while connected to a node in another subcluster.
Important
This function call can take a long time to complete because all the nodes in the subcluster you are promoting or demoting take a global catalog lock, write a checkpoint, and then commit. This global catalog lock can cause other database tasks to fail with errors.
Schedule calls to this function when other database activity is low.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
PROMOTE_SUBCLUSTER_TO_PRIMARY('subcluster-name')
Parameters
subcluster-name
- The name of the secondary cluster to promote to a primary subcluster.
Privileges
Superuser
Examples
The following example promotes the subcluster named analytics_cluster to a primary cluster:
=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
subcluster_name | is_primary
-------------------+------------
analytics_cluster | f
load_subcluster | t
(2 rows)
=> SELECT PROMOTE_SUBCLUSTER_TO_PRIMARY('analytics_cluster');
PROMOTE_SUBCLUSTER_TO_PRIMARY
-------------------------------
PROMOTE SUBCLUSTER TO PRIMARY
(1 row)
=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
subcluster_name | is_primary
-------------------+------------
analytics_cluster | t
load_subcluster | t
(2 rows)
See also
6.19 - REBALANCE_SHARDS
Eon Mode only
Rebalances shard assignments in a subcluster or across the entire cluster in Eon Mode. If the current session ends, the operation immediately aborts. The amount of time required to rebalance shards scales in a roughly linear fashion based on the number of objects in your database.
Run REBALANCE_SHARDS after you modify your cluster using ALTER NODE or when you add nodes to a subcluster.
Note
Vertica rebalances shards in a subcluster automatically when you:
-
Remove a node from a subcluster.
-
Add a new subcluster with the admintools command
db_add_subcluster
with the-s
option followed by a list of hosts.
After you rebalance shards, you will no longer be able to restore objects from a backup taken before the rebalancing. (Full backups are always possible.) After you rebalance, make another full backup so you will be able to restore objects from it in the future.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
REBALANCE_SHARDS(['subcluster-name'])
Parameters
subcluster-name
- The name of the subcluster where shards will be rebalanced. If you do not supply this parameter, all subclusters in the database rebalance their shards.
Privileges
Superuser
Examples
The following shows that the nodes in the in the newly-added analytics subcluster do not yet have shard subscriptions. It then calls REBALANCE_SHARDS to update the node's subscriptions:
=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
= ns.node_name) ORDER BY 1,2,3;
subcluster_name | node_name | shard_name | subscription_state
----------------------+----------------------+-------------+--------------------
analytics_subcluster | v_verticadb_node0004 | |
analytics_subcluster | v_verticadb_node0005 | |
analytics_subcluster | v_verticadb_node0006 | |
default_subcluster | v_verticadb_node0001 | replica | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0002 | replica | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | replica | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0003 | ACTIVE
(12 rows)
=> SELECT REBALANCE_SHARDS('analytics_subcluster');
REBALANCE_SHARDS
-------------------
REBALANCED SHARDS
(1 row)
=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
= ns.node_name) ORDER BY 1,2,3;
subcluster_name | node_name | shard_name | subscription_state
----------------------+----------------------+-------------+--------------------
analytics_subcluster | v_verticadb_node0004 | replica | ACTIVE
analytics_subcluster | v_verticadb_node0004 | segment0001 | ACTIVE
analytics_subcluster | v_verticadb_node0004 | segment0003 | ACTIVE
analytics_subcluster | v_verticadb_node0005 | replica | ACTIVE
analytics_subcluster | v_verticadb_node0005 | segment0001 | ACTIVE
analytics_subcluster | v_verticadb_node0005 | segment0002 | ACTIVE
analytics_subcluster | v_verticadb_node0006 | replica | ACTIVE
analytics_subcluster | v_verticadb_node0006 | segment0002 | ACTIVE
analytics_subcluster | v_verticadb_node0006 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0001 | replica | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0002 | replica | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | replica | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0003 | ACTIVE
(18 rows)
See also
6.20 - RESHARD_DATABASE
Eon Mode only
Changes the number of shards in a database. This function requires a global catalog lock (GCLX) during runtime. The runtime depends on the size of your catalog.
RESHARD_DATABASE does not immediately affect the storage containers in communal storage. After re-sharding, the new shards still point to the existing containers. If you increase the number of shards in your database, multiple shards will point to the same storage containers. Eventually, the Tuple Mover (TM) mergeout tasks will realign the storage containers with the new shard segmentation bounds. If you want the TM to immediately realign storage containers, call DO_TM_TASK to run a 'RESHARDMERGEOUT' task.
This function does not disrupt most queries. However, the global catalog lock might affect data loads and DDL statements.
Important
RESHARD_DATABASE might be rolled back if you call REBALANCE_SHARDS during runtime. In some cases, rollback is caused by down nodes or nodes that fail during the re-shard process.Syntax
RESHARD_DATABASE(shard-count)
Arguments
shard-count
- A positive integer, the number of shards in the re-sharded database. For information about choosing a suitable
shard-count
, see Choosing the Number of Shards and the Initial Node Count.
Privileges
Superuser
Examples
See Change the number of shards in the database.
See also
6.21 - SANDBOX_SUBCLUSTER
Creates a sandbox for a secondary subcluster.
Note
Vertica recommends using the admintoolssandbox_subcluster
command to create sandboxes. This command includes additional sanity checks and validates that the sandboxed nodes are UP after sandbox creation. However, you must use the SANDBOX_SUBCLUSTER function to add additional subclusters to an existing sandbox.
If sandboxing the first subcluster in a sandbox, the nodes in the specified subcluster create a checkpoint of the catalog at function runtime. When these nodes auto-restart in the sandbox cluster, they form a primary subcluster that uses the data and catalog checkpoint from the main cluster. After the nodes successfully restart, the sandbox cluster and the main cluster are mutually isolated and can diverge.
While the nodes in the main cluster sync their metadata to /path-to-communal-storage/
metadata
/db_name
, the nodes in the sandbox sync to /path-to-communal-storage/
metadata
/sandbox_name
.
You can perform standard database operations and queries, such as loading data or creating new tables, in either cluster without affecting the other cluster. For example, dropping a table in the sandbox cluster does not drop the table in the main cluster, and vice versa.
Because both clusters reference the same data files, neither cluster can delete files that existed at the time of sandbox creation. However, files that are created in the sandbox can be removed. Files in the main cluster can be queued for removal, but they are not processed until all active sandboxes are removed.
You cannot nest sandboxes, but you can have more than one subcluster in a sandbox and multiple sandboxes active at the same time. To add an additional secondary subcluster to an existing sandbox, you must first call SANDBOX_SUBCLUSTER in the sandbox cluster and then in the main cluster. For details, see Adding subclusters to existing sandboxes.
This is a meta-function. You must call-meta-functions in a top-level SELECT statement. The function also requires a global catalog lock (GCLX) during runtime.
Behavior type
VolatileSyntax
SANDBOX_SUBCLUSTER( 'sandbox-name', 'subcluster-name', 'options' )
Arguments
sandbox-name
- Name of the sandbox. The name must conform to the following rules:
-
Consist of at most 30 characters, all of which must have an ASCII code between 36 and 126
-
Begin with a letter
-
Unique among all existing databases and sandboxes
-
subcluster-name
- Name of the secondary subcluster to sandbox. Attempting to sandbox a primary subcluster or a subcluster that is already sandboxed results in an error. The nodes in the subcluster must all have a status of UP and provide full subscription coverage for all shards.
options
- Currently, there are no options for this function.
Privileges
SuperuserExamples
The following example sandboxes the sc02
secondary subcluster into a sandbox named sand
:
=> SELECT SANDBOX_SUBCLUSTER('sand', 'sc_02', '');
SANDBOX_SUBCLUSTER
-----------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been sandboxed to 'sand'. It is going to auto-restart and re-form.
(1 row)
If you query the NODES system table from the main cluster, you can see that the nodes of sc_02
have a status of UNKNOWN and are listed as member of the sand
sandbox:
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UP |
v_verticadb_node0002 | default_subcluster | UP |
v_verticadb_node0003 | default_subcluster | UP |
v_verticadb_node0004 | sc_02 | UNKNOWN | sand
v_verticadb_node0005 | sc_02 | UNKNOWN | sand
v_verticadb_node0006 | sc_02 | UNKNOWN | sand
(6 rows)
When you issue the same query on one of the sandboxed nodes, the table shows that the sandboxed nodes are UP and the nodes from the main cluster are UNKNOWN, confirming that the cluster is successfully sandboxed:
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UNKNOWN |
v_verticadb_node0002 | default_subcluster | UNKNOWN |
v_verticadb_node0003 | default_subcluster | UNKNOWN |
v_verticadb_node0004 | sc_02 | UP | sand
v_verticadb_node0005 | sc_02 | UP | sand
v_verticadb_node0006 | sc_02 | UP | sand
(6 rows)
You can now perform standard database operations in either cluster without impacting the other cluster. For instance, if you create a machine learning dataset named train_data
in the sandboxed subcluster, the new table does not propagate to the main cluster:
--In the sandboxed subcluster
=> CREATE TABLE train_data(time timestamp, Temperature float);
CREATE TABLE
=> COPY train_data FROM LOCAL 'daily-min-temperatures.csv' DELIMITER ',';
Rows Loaded
-------------
3650
(1 row)
=> SELECT * FROM train_data LIMIT 5;
time | Temperature
---------------------+-------------
1981-01-27 00:00:00 | 19.4
1981-02-20 00:00:00 | 15.7
1981-02-27 00:00:00 | 17.5
1981-03-04 00:00:00 | 16
1981-04-24 00:00:00 | 11.5
(5 rows)
--In the main cluster
=> SELECT * FROM train_data LIMIT 5;
ERROR 4566: Relation "train_data" does not exist
See also
6.22 - SET_DEPOT_ANTI_PIN_POLICY_PARTITION
Eon Mode only
Assigns the highest depot eviction priority to a partition. Among other depot-cached objects, objects with an anti-pinning policy are the most susceptible to eviction from the depot. After eviction, the object must be read directly from communal storage the next time it is needed.
If the table has another partition-level eviction policy already set on it, then Vertica combines the policies based on policy type.
If you alter or remove table partitioning, Vertica automatically clears all eviction policies previously set on partitions of that table. The table's eviction policy, if any, is unaffected.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_DEPOT_ANTI_PIN_POLICY_PARTITION (
'[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, 'subcluster' ] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.object-name
- Target of this policy.
min-range-value
,max-range-value
- Minimum and maximum value of partition keys in
object-name
to anti-pin, wheremin‑range‑value
must be ≤max‑range‑value
. To specify a single partition key,min‑range‑value
andmax‑range‑value
must be equal.If the new policy's partition key range overlaps the range of an existing partition-level eviction policy, Vertica gives precedence to the new policy, as described in Overlapping Policies.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
Privileges
Superuser
See also
6.23 - SET_DEPOT_ANTI_PIN_POLICY_PROJECTION
Eon Mode only
Assigns the highest depot eviction priority to a projection. Among other depot-cached objects, objects with an anti-pinning policy are the most susceptible to eviction from the depot. After eviction, the object must be read directly from communal storage the next time it is needed.
If the projection has another eviction policy already set on it, the new policy supersedes it.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_DEPOT_ANTI_PIN_POLICY_PROJECTION ( '[[database.]schema.]projection' [, 'subcluster' ] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.projection
- Target of this policy.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
Privileges
Superuser
See also
6.24 - SET_DEPOT_ANTI_PIN_POLICY_TABLE
Eon Mode only
Assigns the highest depot eviction priority to a table. Among other depot-cached objects, objects with an anti-pinning policy are the most susceptible to eviction from the depot. After eviction, the object must be read directly from communal storage the next time it is needed.
If the table has another eviction policy already set on it, the new policy supersedes it.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_DEPOT_ANTI_PIN_POLICY_TABLE ( '[[database.]schema.]table' [, 'subcluster' ] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
- Target of this policy.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.
Privileges
Superuser
See also
6.25 - SET_DEPOT_PIN_POLICY_PARTITION
Eon Mode only
Pins the specified partitions of a table or projection to a subcluster depot, or all database depots, to reduce exposure to depot eviction.
If the table has another partition-level eviction policy already set on it, then Vertica combines the policies based on policy type.
If you alter or remove table partitioning, Vertica automatically clears all eviction policies previously set on partitions of that table. The table's eviction policy, if any, is unaffected.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_DEPOT_PIN_POLICY_PARTITION (
'[[database.]schema.]object-name', 'min-range-value', 'max-range-value' [, 'subcluster' ] [, 'download' ] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.object-name
- Table or projection to pin. If you specify a projection, it must store the partition keys.
min-range-value
,max-range-value
- Minimum and maximum value of partition keys in
object-name
to pin, wheremin‑range‑value
must be ≤max‑range‑value
. To specify a single partition key,min‑range‑value
andmax‑range‑value
must be equal.If the new policy's partition key range overlaps the range of an existing partition-level eviction policy, Vertica gives precedence to the new policy, as described in Overlapping Policies below.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.download
- Boolean, if true, SET_DEPOT_PIN_POLICY_PARTITION immediately queues the specified partitions for download from communal storage.
Default: false
Privileges
Superuser
Overlapping policies
If a new partition pinning policy overlaps the partition key range of an existing eviction policy, Vertica determines how to apply the policy based on the type of the new and existing policies.
Both policies are pinning policies
If both the new and existing policies are pining policies, then Vertica collates the two ranges. For example, if you create two partition pinning policies with key ranges of 1-3 and 2-10, Vertica creates a single policy with a key range of 1-10.
Partition pinning policy overlaps anti-pinning policy
If the new partition pinning policy overlaps an anti-pinning policy, then Vertica issues a warning and informational message that it reassigned the range of overlapping keys from the anti-pinning policy to the new pinning policy.
For example, if you create an anti-partition pinning policy and then a pinning policy with key ranges of 1-10 and 5-20, respectively, Vertica truncates the earlier anti-pinning policy's key range:
policy_type | min_value | max_value |
---|---|---|
PIN | 5 | 20 |
ANTI_PIN | 1 | 4 |
If the new pinning policy's partition range falls inside the range of an older anti-pinning policy, Vertica splits the anti-pinning policy. So, given an existing partition anti-pinning policy with a key range of 1-20, a new partition pinning policy with a key range of 5-10 splits the anti-pinning policy:
policy_type | min_value | max_value |
---|---|---|
ANTI_PIN | 1 | 4 |
PIN | 5 | 10 |
ANTI_PIN | 11 | 20 |
Precedence of pinning policies
In general, partition management functions that involve two partitioned tables give precedence to the target table's pinning policy, as follows:
-
COPY_PARTITIONS_TO_TABLE: Partition-level pinning is reliable if the source and target tables have pinning policies on the same partition keys. If the two tables have different pinning policies, then the partition pinning policies of the target table apply.
-
MOVE_PARTITIONS_TO_TABLE: Partition-level pinning policies of the target table apply.
-
SWAP_PARTITIONS_BETWEEN_TABLES: Partition-level pinning policies of the target table apply.
For example, the following statement copies partitions from table t1
to table t2
:
=> SELECT COPY_PARTITIONS_TO_TABLE('t1', '1', '5', 't2');
In this case, the following logic applies:
-
If the two tables have different partition pinning policies, then the pinning policy of target table
t2
for partition keys 1-5 applies. -
If table
t2
does not exist, then Vertica creates it from tablet1
, and copiest1
's policy on partition keys 1-5. Subsequently, if you clear the partition pinning policy from either table, it is also cleared from the other.
See also
6.26 - SET_DEPOT_PIN_POLICY_PROJECTION
Eon Mode only
Pins a projection to a subcluster depot, or all database depots, to reduce its exposure to depot eviction. For details on pinning policies and usage guidelines, see Pinning Policies.
If the projection has another eviction policy already set on it, the new policy supersedes it.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_DEPOT_PIN_POLICY_PROJECTION ( '[[database.]schema.]projection' [, 'subcluster' ] [, download ] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.projection
- Projection to pin.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.download
- Boolean, if true SET_DEPOT_PIN_POLICY_PROJECTION immediately queues the specified projection for download from communal storage.
Default: false
Privileges
Superuser
See also
6.27 - SET_DEPOT_PIN_POLICY_TABLE
Eon Mode only
Pins a table to a subcluster depot, or all database depots, to reduce its exposure to depot eviction.
If the table has another eviction policy already set on it, the new policy supersedes it. After you pin a table to a subcluster depot, you cannot subsequently pin any of its partitions and projections in that depot.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_DEPOT_PIN_POLICY_TABLE ( '[[database.]schema.]table' [, 'subcluster' ] [, download ] )
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
- Table to pin.
subcluster
Name of a depot subcluster,
default_subcluster
to specify the default database subcluster. If this argument is omitted, all database depots are targeted.download
- Boolean, if true, SET_DEPOT_PIN_POLICY_TABLE immediately queues the specified table for download from communal storage.
Default: false
Privileges
Superuser
See also
6.28 - SHUTDOWN_SUBCLUSTER
Eon Mode only
Shuts down a subcluster. This function shuts down the subcluster synchronously, returning when shutdown is complete with the message Subcluster shutdown. If the subcluster is already down, the function returns with no error.
Stopping a subcluster does not warn you if there are active user sessions connected to the subcluster. This behavior is the same as stopping an individual node. Before stopping a subcluster, verify that no users are connected to it.
If you want to drain client connections before shutting down a subcluster, you can gracefully shutdown the subcluster using SHUTDOWN_WITH_DRAIN.
Caution
This function does not test whether the target subcluster is critical (a subcluster whose loss would cause the database to shut down). Using this function to shut down a critical subcluster results in the database shutting down. Always verify that the subcluster you want to shut down is not critical by querying the CRITICAL_SUBCLUSTERS system table before calling this function.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SHUTDOWN_SUBCLUSTER('subcluster-name')
Arguments
subcluster-name
- Name of the subcluster to shut down.
Privileges
Superuser
Examples
The following example demonstrates shutting down the subcluster analytics
:
=> SELECT subcluster_name, node_name, node_state FROM nodes order by 1,2;
subcluster_name | node_name | node_state
--------------------+----------------------+------------
analytics | v_verticadb_node0004 | UP
analytics | v_verticadb_node0005 | UP
analytics | v_verticadb_node0006 | UP
default_subcluster | v_verticadb_node0001 | UP
default_subcluster | v_verticadb_node0002 | UP
default_subcluster | v_verticadb_node0003 | UP
(6 rows)
=> SELECT SHUTDOWN_SUBCLUSTER('analytics');
WARNING 4539: Received no response from v_verticadb_node0004 in stop subcluster
WARNING 4539: Received no response from v_verticadb_node0005 in stop subcluster
WARNING 4539: Received no response from v_verticadb_node0006 in stop subcluster
SHUTDOWN_SUBCLUSTER
---------------------
Subcluster shutdown
(1 row)
=> SELECT subcluster_name, node_name, node_state FROM nodes order by 1,2;
subcluster_name | node_name | node_state
--------------------+----------------------+------------
analytics | v_verticadb_node0004 | DOWN
analytics | v_verticadb_node0005 | DOWN
analytics | v_verticadb_node0006 | DOWN
default_subcluster | v_verticadb_node0001 | UP
default_subcluster | v_verticadb_node0002 | UP
default_subcluster | v_verticadb_node0003 | UP
(6 rows)
Note
The "WARNING 4539" messages after calling SHUTDOWN_SUBCLUSTER occur because the nodes are in the process of shutting down. They are expected.See also
6.29 - SHUTDOWN_WITH_DRAIN
Eon Mode only
Gracefully shuts down a subcluster or subclusters. The function drains client connections on the subcluster's nodes and then shuts down the subcluster. This is synchronous function that returns when the shutdown message has been sent to the subcluster.
Work from existing user sessions continues on draining nodes, but the nodes refuse new client connections and are excluded from load-balancing operations. dbadmin can still connect to draining nodes.
The nodes drain until either the existing connections complete their work and close or the user-specified timeout is reached. When one of these conditions is met, the function proceeds to shut down the subcluster.
For more information about the graceful shutdown process, see Graceful Shutdown.
Caution
This function does not test whether the target subcluster is critical (a subcluster whose loss would cause the database to shut down). Using this function to shut down a critical subcluster results in the database shutting down. Always verify that the subcluster you want to shut down is not critical by querying the CRITICAL_SUBCLUSTERS system table before calling this function.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SHUTDOWN_WITH_DRAIN( 'subcluster-name', timeout-seconds )
Arguments
subcluster-name
- Name of the subcluster to shutdown. Enter an empty string to shutdown all subclusters in a database.
timeout-seconds
- Number of seconds to wait before forcefully closing
subcluster-name
's client connections and shutting down. The behavior depends on the sign oftimeout-seconds
:- Positive integer: The function waits until either the runtime reaches
timeout-seconds
or the client connections finish their work and close. As soon as one of these conditions is met, the function immediately proceeds to shut down the subcluster. - Zero: The function immediately closes any open client connections and shuts down the subcluster.
- Negative integer: The function marks the subcluster as draining and waits indefinitely to shut down the subcluster until all active user sessions disconnect.
- Positive integer: The function waits until either the runtime reaches
Privileges
Superuser
Examples
In the following example, the function marks the subcluster named analytics as draining and then shuts it down as soon as either the existing client connections close or 300 seconds pass:
=> SELECT SHUTDOWN_WITH_DRAIN('analytics', 120);
NOTICE 0: Draining has started on subcluster (analytics)
NOTICE 0: Begin shutdown of subcluster (analytics)
SHUTDOWN_WITH_DRAIN
--------------------------------------------------------------------------------------------------------------------
Set subcluster (analytics) to draining state
Waited for 3 nodes to drain
Shutdown message sent to subcluster (analytics)
(1 row)
You can query the DC_DRAINING_EVENTS table to see more information about draining and shutdown events, such as whether any user sessions were forcibly closed. This subcluster had one active user session when the shutdown began, but it closed before the timeout was reached:
=> SELECT event_type, event_type_name, event_description, event_result, event_result_name FROM dc_draining_events;
event_type | event_type_name | event_description | event_result | event_result_name
------------+------------------------------+---------------------------------------------------------------------+--------------+-------------------
0 | START_DRAIN_SUBCLUSTER | START_DRAIN for SHUTDOWN of subcluster (analytics) | 0 | SUCCESS
2 | START_WAIT_FOR_NODE_DRAIN | Wait timeout is 120 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 0 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 30 seconds | 4 | INFORMATIONAL
3 | END_WAIT_FOR_NODE_DRAIN | Wait for drain ended with 0 sessions remaining | 0 | SUCCESS
5 | BEGIN_SHUTDOWN_AFTER_DRAIN | Starting shutdown of subcluster (analytics) following drain | 4 | INFORMATIONAL
(6 rows)
See also
6.30 - START_DRAIN_SUBCLUSTER
Eon Mode only
Drains a subcluster or subclusters. The function marks all nodes in the designated subcluster as draining. Work from existing user sessions continues on draining nodes, but the nodes refuse new client connections and are excluded from load balancing operations. dbadmin can still connect to draining nodes.
To drain connections on a subcluster as part of a graceful shutdown process, you can call SHUTDOWN_WITH_DRAIN. For details, see Graceful Shutdown.
To cancel a draining operation on a subcluster, call CANCEL_DRAIN_SUBCLUSTER. If all draining nodes in a subcluster are stopped, they are marked as not draining upon restart.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
START_DRAIN_SUBCLUSTER( 'subcluster-name' )
Arguments
subcluster-name
- Name of the subcluster to drain. Enter an empty string to drain all subclusters in the database.
Privileges
Superuser
Examples
The following example demonstrates how to drain a subcluster named analytics:
=> SELECT subcluster_name, node_name, node_state FROM nodes;
subcluster_name | node_name | node_state
-------------------+--------------------+------------
default_subcluster | verticadb_node0001 | UP
default_subcluster | verticadb_node0002 | UP
default_subcluster | verticadb_node0003 | UP
analytics | verticadb_node0004 | UP
analytics | verticadb_node0005 | UP
analytics | verticadb_node0006 | UP
(6 rows)
=> SELECT START_DRAIN_SUBCLUSTER('analytics');
START_DRAIN_SUBCLUSTER
-------------------------------------------------------
Targeted subcluster: 'analytics'
Action: START DRAIN
(1 row)
You can confirm that the subcluster is draining by querying the DRAINING_STATUS system table:
=> SELECT node_name, subcluster_name, is_draining FROM draining_status ORDER BY 1;
node_name | subcluster_name | is_draining
-------------------+--------------------+-------
verticadb_node0001 | default_subcluster | f
verticadb_node0002 | default_subcluster | f
verticadb_node0003 | default_subcluster | f
verticadb_node0004 | analytics | t
verticadb_node0005 | analytics | t
verticadb_node0006 | analytics | t
See also
6.31 - START_REAPING_FILES
Eon Mode only
Starts the disk file deletion in the background as an asynchronous function. By default, this meta-function syncs the catalog before beginning deletion. Disk file deletion is handled in the foreground by FLUSH_REAPER_QUEUE.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
START_REAPING_FILES( [sync-catalog] )
Parameters
*
sync-catalog*
- Specifies to sync metadata in the database catalog on all nodes before the function executes:
-
true
(default): Sync the database catalog -
false
: Run without syncing.
-
Privileges
Superuser
Examples
Start the reaper service:
=> SELECT START_REAPING_FILES();
Start the reaper service and skip the initial catalog sync:
=> SELECT START_REAPING_FILES(false);
6.32 - SYNC_CATALOG
Eon Mode only
Synchronizes the catalog to communal storage to enable reviving the current catalog version in the case of an imminent crash. Vertica synchronizes all pending checkpoint and transaction logs to communal storage.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SYNC_CATALOG( [ 'node-name' ] )
Parameters
node-name
- The node to synchronize. If you omit this argument, Vertica synchronizes the catalog on all nodes.
Privileges
Superuser
Examples
Synchronize the catalog on all nodes:
=> SELECT SYNC_CATALOG();
Synchronize the catalog on one node:
=> SELECT SYNC_CATALOG( 'node001' );
6.33 - UNSANDBOX_SUBCLUSTER
Removes a subcluster from a sandbox.
Note
Vertica recommends using the admintoolsunsandbox_subcluster
command to remove the sandbox's primary subcluster. This command automatically stops the sandboxed nodes, wipes the node's catalog subdirectories, and restarts the nodes. If you use the UNSANDBOX_SUBCLUSTER function, these steps must be completed manually.
After stopping the nodes in the sandboxed subcluster, you must run this function in the main cluster from which the sandboxed subcluster was spun-off. The function changes the metadata in the main cluster that designates the specified subcluster as sandboxed, but does not restart the subcluster and rejoin it to the main cluster.
If you are unsandboxing a secondary subcluster from the sandbox, Vertica recommends that you also call the UNSANDBOX_SUBCLUSTER function in the sandbox cluster. This makes sure that both clusters are aware of the state of the subcluster and that relevant system tables accurately reflect the subcluster's status.
To rejoin the subcluster to the main cluster and return the nodes to their normal state, you must complete the following tasks:
-
Wipe the catalog subdirectory from the sandboxed nodes. The main cluster provides the current catalog information on node restart.
-
Restart the nodes. On successful restart, the nodes should rejoin the main cluster.
-
If unsandboxing the last subcluster in a sandbox, remove the sandbox metadata prefix from the shared communal storage location. This helps avoid problems that might arise form reusing the same sandbox name.
Note
If you upgraded the Vertica version of the sandboxed subcluster, you must downgrade the version of the subcluster before rejoining it to the main cluster.If there are no more active sandboxes, you can run CLEAN_COMMUNAL_STORAGE to remove any data created in the sandbox. The main cluster can also resume processing data queued for deletion.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
UNSANDBOX_SUBCLUSTER( 'subcluster-name', 'options' )
Arguments
subcluster-name
- Identifies the subcluster to unsandbox. This must be a currently sandboxed subcluster.
options
- Currently, there are no options for this function.
Privileges
SuperuserExamples
In the following example, the function unsandboxes the sc_02
secondary subcluster from the sand
sandbox. After stopping the nodes in the subcluster, you can unsandbox the subcluster by calling the UNSANDBOX_SUBCLUSTER function from the main cluster:
=> SELECT UNSANDBOX_SUBCLUSTER('sc_02', '');
UNSANDBOX_SUBCLUSTER
---------------------------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been unsandboxed. If wiped out and restarted, it should be able to rejoin the cluster.
(1 row)
To rejoin the nodes to the main cluster, you must wipe the local catalog from each of the previously sandboxed nodes—whose catalog location can be found by querying NODES—and then restart the nodes:
$ rm -rf paths-to-node-catalogs
$ admintools -t restart_node -s list-of-nodes -p password
After the nodes restart, you can query the NODES system table to confirm that the previously sandboxed nodes are UP and are no longer a member of sand
:
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UP |
v_verticadb_node0002 | default_subcluster | UP |
v_verticadb_node0003 | default_subcluster | UP |
v_verticadb_node0004 | sc_01 | UNKNOWN | sand
v_verticadb_node0005 | sc_01 | UNKNOWN | sand
v_verticadb_node0006 | sc_01 | UNKNOWN | sand
v_verticadb_node0007 | sc_02 | UP |
v_verticadb_node0008 | sc_02 | UP |
v_verticadb_node0009 | sc_02 | UP |
(9 rows)
Because sc_02
was a secondary subcluster in the sandbox, you should also call the UNSANDBOX_SUBCLUSTER function in the sandbox cluster. This makes sure that both clusters are aware of the state of the subcluster and that relevant system tables accurately reflect the subcluster's status:
=> SELECT UNSANDBOX_SUBCLUSTER('sc_02', '');
UNSANDBOX_SUBCLUSTER
-------------------------------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been unsandboxed from 'sand'. This command should be executed in the main cluster as well.
(1 row)
If there are no more active sandboxes, you can run the CLEAN_COMMUNAL_STORAGE function to remove any data created in the sandbox. You should also remove the sandbox's metadata from the shared communal storage location, which can be found at /path-to-communal-storage/
metadata
/sandbox_name
.
The following example removes the sandbox's metadata from an S3 bucket and then calls CLEAN_COMMUNAL_STORAGE to cleanup any data from the sandbox:
$ aws s3 rm /path-to-communal/metadata/sandbox_name
=> SELECT CLEAN_COMMUNAL_STORAGE('true');
CLEAN_COMMUNAL_STORAGE
-----------------------------------------------------------------
CLEAN COMMUNAL STORAGE
Total leaked files: 143
Files have been queued for deletion.
Check communal_cleanup_records for more information.
(1 row)
See also
7 - Epoch functions
This section contains the epoch management functions specific to Vertica.
7.1 - ADVANCE_EPOCH
Manually closes the current epoch and begins a new epoch.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ADVANCE_EPOCH ( [ integer ] )
Parameters
integer
- Specifies the number of epochs to advance.
Privileges
Superuser
Notes
This function is primarily maintained for backward compatibility with earlier versions of Vertica.
Examples
The following command increments the epoch number by 1:
=> SELECT ADVANCE_EPOCH(1);
7.2 - GET_AHM_EPOCH
Returns the number of the epoch in which the Ancient History Mark is located. Data deleted up to and including the AHM epoch can be purged from physical storage.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_AHM_EPOCH()
Note
The AHM epoch is 0 (zero) by default (purge is disabled).Privileges
None
Examples
=> SELECT GET_AHM_EPOCH();
GET_AHM_EPOCH
----------------------
Current AHM epoch: 0
(1 row)
7.3 - GET_AHM_TIME
Returns a TIMESTAMP value representing the Ancient History Mark. Data deleted up to and including the AHM epoch can be purged from physical storage.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_AHM_TIME()
Privileges
None
Examples
=> SELECT GET_AHM_TIME();
GET_AHM_TIME
-------------------------------------------------
Current AHM Time: 2010-05-13 12:48:10.532332-04
(1 row)
7.4 - GET_CURRENT_EPOCH
The epoch into which data (COPY, INSERT, UPDATE, and DELETE operations) is currently being written.
Returns the number of the current epoch.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_CURRENT_EPOCH()
Privileges
None
Examples
=> SELECT GET_CURRENT_EPOCH();
GET_CURRENT_EPOCH
-------------------
683
(1 row)
7.5 - GET_LAST_GOOD_EPOCH
Returns the last good epoch number. If the database has no projections, the function returns an error.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_LAST_GOOD_EPOCH()
Privileges
None
Examples
=> SELECT GET_LAST_GOOD_EPOCH();
GET_LAST_GOOD_EPOCH
---------------------
682
(1 row)
7.6 - MAKE_AHM_NOW
Sets the Ancient History Mark (AHM) to the greatest allowable value. This lets you purge all deleted data.
Caution
After running this function, you cannot query historical data that precedes the current epoch. Only database administrators should use this function.MAKE_AHM_NOW
performs the following operations:
-
Advances the epoch.
-
Sets the AHM to the last good epoch (LGE) — at least to the epoch that is current when you execute
MAKE_AHM_NOW
.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
MAKE_AHM_NOW ( [ true ] )
Parameters
true
- Allows AHM to advance when one of the following conditions is true:
-
One or more nodes are down.
-
One projection is being refreshed from another (retentive refresh).
In both cases , you must supply this argument to
MAKE_AHM_NOW
, otherwise Vertica returns an error. If you executeMAKE_AHM_NOW(true)
during retentive refresh, Vertica rolls back the refresh operation and advances the AHM.Caution
If the function advances AHM beyond the last good epoch of the down nodes, those nodes must recover all data from scratch. -
Privileges
Superuser
Setting AHM when nodes are down
If any node in the cluster is down, you must call MAKE_AHM_NOW
with an argument of true; otherwise, the function returns an error.
Note
This requirement applies only to Enterprise mode; in Eon mode, it is ignored.In the following example, MAKE_AHM_NOW
advances the AHM even though a node is down:
=> SELECT MAKE_AHM_NOW(true);
WARNING: Received no response from v_vmartdb_node0002 in get cluster LGE
WARNING: Received no response from v_vmartdb_node0002 in get cluster LGE
WARNING: Received no response from v_vmartdb_node0002 in set AHM
MAKE_AHM_NOW
------------------------------
AHM set (New AHM Epoch: 684)
(1 row)
See also
7.7 - SET_AHM_EPOCH
Sets the Ancient History Mark (AHM) to the specified epoch. This function allows deleted data up to and including the AHM epoch to be purged from physical storage.
SET_AHM_EPOCH
is normally used for testing purposes. Instead, consider using
SET_AHM_TIME
which is easier to use.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_AHM_EPOCH ( epoch, [ true ] )
Parameters
epoch
- Specifies one of the following:
-
The number of the epoch in which to set the AHM
-
Zero (0) (the default) disables PURGE
Important
The number of the specified epoch must be:
-
Greater than the current AHM epoch
-
Less than the current epoch
Query the SYSTEM table to view current epoch values relative to the AHM.
-
true
- Allows the AHM to advance when nodes are down.
Caution
If you advance AHM beyond the last good epoch of the down nodes, those nodes must recover all data from scratch.
Privileges
Superuser
Setting AHM when nodes are down
If any node in the cluster is down, you must call SET_AHM_EPOCH
with an argument of true; otherwise, the function returns an error.
Note
This requirement applies only to Enterprise mode; in Eon mode, it is ignored.Examples
The following command sets the AHM to a specified epoch of 12:
=> SELECT SET_AHM_EPOCH(12);
The following command sets the AHM to a specified epoch of 2 and allows the AHM to advance despite a failed node:
=> SELECT SET_AHM_EPOCH(2, true);
See also
7.8 - SET_AHM_TIME
Sets the Ancient History Mark (AHM) to the epoch corresponding to the specified time on the initiator node. This function allows historical data up to and including the AHM epoch to be purged from physical storage. SET_AHM_TIME
returns a TIMESTAMPTZ that represents the end point of the AHM epoch.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_AHM_TIME ( time, [ true ] )
Parameters
time
- A TIMESTAMP/TIMESTAMPTZ value that is automatically converted to the appropriate epoch number.
true
- Allows the AHM to advance when nodes are down.
Caution
If you advance AHM beyond the last good epoch of the down nodes, those nodes must recover all data from scratch.
Privileges
Superuser
Setting AHM when nodes are down
If any node in the cluster is down, you must call SET_AHM_TIME
with an argument of true; otherwise, the function returns an error.
Note
This requirement applies only to Enterprise mode; in Eon mode, it is ignored.Examples
Epochs depend on a configured epoch advancement interval. If an epoch includes a three-minute range of time, the purge operation is accurate only to within minus three minutes of the specified timestamp:
=> SELECT SET_AHM_TIME('2008-02-27 18:13');
set_ahm_time
------------------------------------
AHM set to '2008-02-27 18:11:50-05'
(1 row)
Note
The –05 part of the output string is a time zone value, an offset in hours from UTC (Universal Coordinated Time, traditionally known as Greenwich Mean Time, or GMT).In the previous example, the actual AHM epoch ends at 18:11:50, roughly one minute before the specified timestamp. This is because SET_AHM_TIME selects the epoch that ends at or before the specified timestamp. It does not select the epoch that ends after the specified timestamp because that would purge data deleted as much as three minutes after the AHM.
For example, using only hours and minutes, suppose that epoch 9000 runs from 08:50 to 11:50 and epoch 9001 runs from 11:50 to 15:50. SET_AHM_TIME('11:51')
chooses epoch 9000 because it ends roughly one minute before the specified timestamp.
In the next example, suppose that a node went down at 11:00:00 AM on January 1st 2017. At noon, you want to advance the AHM to 11:15:00, but the node is still down.
Suppose you try to set the AHM using this command:
=> SELECT SET_AHM_TIME('2017-01-01 11:15:00');
Then you will receive an error message. Vertica prevents you from moving the AHM past the point where a node went down. Vertica returns this error to prevent the AHM from advancing past the down node's last good epoch. You can force the AHM to advance by supplying the optional second parameter:
=> SELECT SET_AHM_TIME('2017-01-01 11:15:00', true);
However, if you force the AHM past the last good epoch, the failed node will have to recover from scratch.
See also
8 - LDAP link functions
This section contains the functions associated with the Vertica LDAP Link service.
8.1 - LDAP_LINK_DRYRUN_CONNECT
Takes a set of LDAP Link connection parameters as arguments and begins a dry run connection between the LDAP server and Vertica.
By providing an empty string for the LDAPLinkBindPswd
argument, you can also perform an anonymous bind if your LDAP server allows unauthenticated binds.
The dryrun and LDAP_LINK_SYNC_START functions must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:
=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
node_name | dbclerk
------------------+---------
v_vmart_node0001 | t
(1 row)
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
LDAP_LINK_DRYRUN_CONNECT (
'LDAPLinkURL',
'LDAPLinkBindDN',
'LDAPLinkBindPswd'
)
Privileges
Superuser
Examples
This tests the connection to an LDAP server at ldap://example.dc.com
with the DN CN=amir,OU=QA,DC=dc,DC=com
.
=> SELECT LDAP_LINK_DRYRUN_CONNECT('ldap://example.dc.com','CN=amir,OU=QA,DC=dc,DC=com','password');
ldap_link_dryrun_connect
---------------------------------------------------------------------------------
Dry Run Connect Completed. Query v_monitor.ldap_link_dryrun_events for results.
To check the results of the bind, query the system table LDAP_LINK_DRYRUN_EVENTS.
=> SELECT event_timestamp, event_type, entry_name, role_name, link_scope, search_base from LDAP_LINK_DRYRUN_EVENTS;
event_timestamp | event_type | entry_name | link_scope | search_base
------------------------------+-----------------------+----------------------+------------+-------------
2019-12-09 15:41:43.589398-05 | BIND_STARTED | -------------------- | ---------- | -----------
2019-12-09 15:41:43.590504-05 | BIND_FINISHED | -------------------- | ---------- | -----------
See also
8.2 - LDAP_LINK_DRYRUN_SEARCH
Takes a set of LDAP Link connection and search parameters as arguments and begins a dry run search for users and groups that would get imported from the LDAP server.
By providing an empty string for the LDAPLinkBindPswd
argument, you can also perform an anonymous search if your LDAP server's Access Control List (ACL) is configured to allow unauthenticated searches. The settings for allowing anonymous binds are different from the ACL settings for allowing anonymous searches.
The dryrun and LDAP_LINK_SYNC_START functions must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:
=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
node_name | dbclerk
------------------+---------
v_vmart_node0001 | t
(1 row)
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
LDAP_LINK_DRYRUN_SEARCH (
'LDAPLinkURL',
'LDAPLinkBindDN',
'LDAPLinkBindPswd',
'LDAPLinkSearchBase',
'LDAPLinkScope',
'LDAPLinkFilterUser',
'LDAPLinkFilterGroup',
'LDAPLinkUserName',
'LDAPLinkGroupName',
'LDAPLinkGroupMembers',
[LDAPLinkSearchTimeout],
['LDAPLinkJoinAttr']
)
Privileges
Superuser
Examples
This searches for users and groups in the LDAP server. In this case, the LDAPLinkSearchBase
parameter specifies the dc.com
domain and a sub scope, which replicates the entire subtree under the DN.
To further filter results, the function checks for users and groups with the person
and group
objectClass attributes. It then searches the group attribute cn
, identifying members of that group with the member
attribute, and then identifying those individual users with the attribute uid
.
=> SELECT LDAP_LINK_DRYRUN_SEARCH('ldap://example.dc.com','CN=amir,OU=QA,DC=dc,DC=com','$vertica$','dc=DC,dc=com','sub',
'(objectClass=person)','(objectClass=group)','uid','cn','member',10,'dn');
ldap_link_dryrun_search
--------------------------------------------------------------------------------
Dry Run Search Completed. Query v_monitor.ldap_link_dryrun_events for results.
To check the results of the search, query the system table LDAP_LINK_DRYRUN_EVENTS.
=> SELECT event_timestamp, event_type, entry_name, ldapurihash, link_scope, search_base from LDAP_LINK_DRYRUN_EVENTS;
event_timestamp | event_type | entry_name | ldapurihash | link_scope | search_base
---------------------------------+------------------+------------------------+-------------+------------+--------------
2020-01-03 21:03:26.411753+05:30 | BIND_STARTED | ---------------------- | 0 | sub | dc=DC,dc=com
2020-01-03 21:03:26.422188+05:30 | BIND_FINISHED | ---------------------- | 0 | sub | dc=DC,dc=com
2020-01-03 21:03:26.422223+05:30 | SYNC_STARTED | ---------------------- | 0 | sub | dc=DC,dc=com
2020-01-03 21:03:26.422229+05:30 | SEARCH_STARTED | ********** | 0 | sub | dc=DC,dc=com
2020-01-03 21:03:32.043107+05:30 | LDAP_GROUP_FOUND | Account Operators | 0 | sub | dc=DC,dc=com
2020-01-03 21:03:32.04312+05:30 | LDAP_GROUP_FOUND | Administrators | 0 | sub | dc=DC,dc=com
2020-01-03 21:03:32.043182+05:30 | LDAP_USER_FOUND | user1 | 0 | sub | dc=DC,dc=com
2020-01-03 21:03:32.043186+05:30 | LDAP_USER_FOUND | user2 | 0 | sub | dc=DC,dc=com
2020-01-03 21:03:32.04319+05:30 | SEARCH_FINISHED | ********** | 0 | sub | dc=DC,dc=com
See also
8.3 - LDAP_LINK_DRYRUN_SYNC
Takes a set of LDAP Link connection and search parameters as arguments and begins a dry run synchronization between the database and the LDAP server, which maps and synchronizes the LDAP server's users and groups with their equivalents in Vertica. This meta-function also dry runs the creation and orphaning of users and roles in Vertica.
The dryrun and LDAP_LINK_SYNC_START functions must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:
=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
node_name | dbclerk
------------------+---------
v_vmart_node0001 | t
(1 row)
You can view the results of the dry run in the system table LDAP_LINK_DRYRUN_EVENTS.
To cancel an in-progress synchronization, use LDAP_LINK_SYNC_CANCEL.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
LDAP_LINK_DRYRUN_SYNC (
'LDAPLinkURL',
'LDAPLinkBindDN',
'LDAPLinkBindPswd',
'LDAPLinkSearchBase',
'LDAPLinkScope',
'LDAPLinkFilterUser',
'LDAPLinkFilterGroup',
'LDAPLinkUserName',
'LDAPLinkGroupName',
'LDAPLinkGroupMembers',
[LDAPLinkSearchTimeout],
['LDAPLinkJoinAttr']
)
Privileges
Superuser
Examples
To perform a dry run to map the users and groups returned from LDAP_LINK_DRYRUN_SEARCH, pass the same parameters as arguments to LDAP_LINK_DRYRUN_SYNC.
=> SELECT LDAP_LINK_DRYRUN_SYNC('ldap://example.dc.com','CN=amir,OU=QA,DC=dc,DC=com','$vertica$','dc=DC,dc=com','sub',
'(objectClass=person)','(objectClass=group)','uid','cn','member',10,'dn');
LDAP_LINK_DRYRUN_SYNC
------------------------------------------------------------------------------------------
Dry Run Connect and Sync Completed. Query v_monitor.ldap_link_dryrun_events for results.
To check the results of the sync, query the system table LDAP_LINK_DRYRUN_EVENTS.
=> SELECT event_timestamp, event_type, entry_name, ldapurihash, link_scope, search_base from LDAP_LINK_DRYRUN_EVENTS;
event_timestamp | event_type | entry_name | ldapurihash | link_scope | search_base
---------------------------------+---------------------+------------------------+-------------+------------+--------------
2020-01-03 21:08:30.883783+05:30 | BIND_STARTED | ---------------------- | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:30.890574+05:30 | BIND_FINISHED | ---------------------- | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:30.890602+05:30 | SYNC_STARTED | ---------------------- | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:30.890605+05:30 | SEARCH_STARTED | ********** | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939369+05:30 | LDAP_GROUP_FOUND | Account Operators | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939395+05:30 | LDAP_GROUP_FOUND | Administrators | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939461+05:30 | LDAP_USER_FOUND | user1 | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939463+05:30 | LDAP_USER_FOUND | user2 | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939468+05:30 | SEARCH_FINISHED | ********** | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939718+05:30 | PROCESSING_STARTED | ********** | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939887+05:30 | USER_CREATED | user1 | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939895+05:30 | USER_CREATED | user2 | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939949+05:30 | ROLE_CREATED | Account Operators | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.939959+05:30 | ROLE_CREATED | Administrators | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.940603+05:30 | PROCESSING_FINISHED | ********** | 0 | sub | dc=DC,dc=com
2020-01-03 21:08:31.940613+05:30 | SYNC_FINISHED | ---------------------- | 0 | sub | dc=DC,dc=com
See also
8.4 - LDAP_LINK_SYNC_CANCEL
Cancels in-progress LDAP Link synchronizations (including those started by LDAP_LINK_DRYRUN_SYNC) between the LDAP server and Vertica.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ldap_link_sync_cancel()
Privileges
Superuser
Examples
=> SELECT ldap_link_sync_cancel();
See also
8.5 - LDAP_LINK_SYNC_START
Begins the synchronization between the LDAP server and Vertica immediately rather than waiting for the interval set in LDAPLinkInterval.
The dryrun and LDAP_LINK_SYNC_START functions must be run from the clerk node. To determine the clerk node, query NODE_RESOURCES:
=> SELECT node_name, dbclerk FROM node_resources WHERE dbclerk='t';
node_name | dbclerk
------------------+---------
v_vmart_node0001 | t
(1 row)
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ldap_link_sync_start()
Privileges
Superuser
Examples
=> SELECT ldap_link_sync_start();
See also
LDAP link parameters9 - License functions
This section contains functions that monitor Vertica license status and compliance.
9.1 - AUDIT
Returns the raw data size (in bytes) of a database, schema, or table as it is counted in an audit of the database size. Unless you specify zero error tolerance and 100 percent confidence level, AUDIT
returns only approximate results that can vary over multiple iterations.
Important
The data size returned byAUDIT
should not be compared with the compressed data size of objects reported in the USED_BYTES
column of system tables like STORAGE_CONTAINERS and PROJECTION_STORAGE.
AUDIT
estimates the size for data in Vertica tables using the same data sampling method that Vertica uses to determine if a database complies with the licensed database size allowance. Vertica does not use these results to determine whether the size of the database complies with the Vertica license's data allowance. For details, see Auditing database size.
For data stored in external tables based on ORC or Parquet format, AUDIT
uses the total size of the data files. This value is never estimated—it is read from the file system storing the ORC or Parquet files (either the Vertica node's local file system, S3, or HDFS).
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
AUDIT('[[[database.]schema.]scope ]'[, 'granularity'] [, error-tolerance[, confidence-level]] )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.scope
- Specifies the extent of the audit:
-
Empty string (
''
) audits the entire database. -
The name of the schema or table to audit.
The schema or table to audit. To audit the database, set this parameter to an empty string.
-
granularity
- The level at which the audit reports its results, one of the following strings:
-
database
-
schema
-
table
The level of granularity must be equal to or less than the granularity of
scope
. If you omit this parameter, granularity is set to the same level asscope
. Thus, ifonline_sales
is a schema, the following statements are identical:AUDIT('online_sales', 'schema'); AUDIT('online_sales');
If
AUDIT
sets granularity to a level lower than the target object, it returns with a message that refers you to system tableUSER_AUDITS
. For details, see Querying V_CATALOG.USER_AUDITS, below. -
error-tolerance
- Specifies the percentage margin of error allowed in the audit estimate. Enter the tolerance value as a decimal number, between 0 and 100. The default value is 5, for a 5% margin of error.
This argument has no effect on audits of external tables based on ORC or Parquet files. Audits of these tables always returns the actual size of the underlying data files.
Setting this value to 0 results in a full database audit, which is very resource intensive, as
AUDIT
analyzes the entire database. A full database audit significantly impacts performance, so Vertica does not recommend it for a production database.Caution
Due to the iterative sampling that the auditing process uses, setting the error tolerance to a small fraction of a percent (for example, 0.00001) can causeAUDIT
to run for a longer period than a full database audit. The lower you specify this value, the more resources the audit uses, as it performs more data sampling. confidence-level
- Specifies the statistical confidence level percentage of the estimate. Enter the confidence value as a decimal number, between 0 and 100. The default value is 99, indicating a confidence level of 99%.
This argument has no effect on audits of external tables based on ORC or Parquet files. Audits of these tables always returns the actual size of the underlying data files.
The higher the confidence value, the more resources the function uses, as it performs more data sampling. Setting this value to 100 results in a full audit of the database, which is very resource intensive, as the function analyzes all of the database. A full database audit significantly impacts performance, so Vertica does not recommend it for a production database.
Privileges
Superuser, or the following privileges:
-
SELECT privilege on the target tables
-
USAGE privilege on the target schemas
Note
If you audit a schema or the database, Vertica only returns the size of all objects that you have privileges to access within the audited object, as described above.Querying V_CATALOG.USER_AUDITS
If AUDIT
sets granularity to a level lower than the target object, it returns with a message that refers you to system table
USER_AUDITS
. To obtain audit data on objects of the specified granularity, query this table. For example, the following query seeks to audit all tables in the store
schema:
=> SELECT AUDIT('store', 'table');
AUDIT
-----------------------------------------------------------
See table sizes in v_catalog.user_audits for schema store
(1 row)
The next query queries USER_AUDITS
and obtains the latest audits on those tables:
=> SELECT object_name, AVG(size_bytes)::int size_bytes, MAX(audit_start_timestamp::date) audit_start
FROM user_audits WHERE object_schema='store'
GROUP BY rollup(object_name) HAVING GROUPING_ID(object_name) < 1 ORDER BY GROUPING_ID();
object_name | size_bytes | audit_start
-------------------+------------+-------------
store_dimension | 22067 | 2017-10-26
store_orders_fact | 27201312 | 2017-10-26
store_sales_fact | 301260170 | 2017-10-26
(3 rows)
Examples
9.2 - AUDIT_FLEX
Returns the estimated ROS size of __raw__
columns, equivalent to the export size of the flex data in the audited objects. You can audit all flex data in the database, or narrow the audit scope to a specific flex table, projection, or schema. Vertica stores the audit results in system table
USER_AUDITS
.
The audit excludes the following:
-
Flex keys
-
Other columns in the audited tables.
-
Temporary flex tables
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
AUDIT_FLEX ('[scope]')
Parameters
scope
- Specifies the extent of the audit:
-
Empty string (
''
) audits all flexible tables in the database. -
The name of a schema, projection, or flex table.
-
Privileges
Superuser, or the following privileges:
-
SELECT privilege on the target tables
-
USAGE privilege on the target schemas
Note
If you audit a schema or the database, Vertica only returns the size of all objects that you have privileges to access within the audited object, as described above.Examples
Audit all flex tables in the current database:
dbs=> select audit_flex('');
audit_flex
------------
8567679
(1 row)
Audit the flex tables in schema public
:
dbs=> select audit_flex('public');
audit_flex
------------
8567679
(1 row)
Audit the flex data in projection bakery_b0
:
dbs=> select audit_flex('bakery_b0');
audit_flex
------------
8566723
(1 row)
Audit flex table bakery
:
dbs=> select audit_flex('bakery');
audit_flex
------------
8566723
(1 row)
To report the results of all audits saved in the USER_AUDITS
, the following shows part of an extended display from the system table showing an audit run on a schema called test
, and the entire database, dbs
:
dbs=> \x
Expanded display is on.
dbs=> select * from user_audits;
-[ RECORD 1 ]-------------------------+------------------------------
size_bytes | 0
user_id | 45035996273704962
user_name | release
object_id | 45035996273736664
object_type | SCHEMA
object_schema |
object_name | test
audit_start_timestamp | 2014-02-04 14:52:15.126592-05
audit_end_timestamp | 2014-02-04 14:52:15.139475-05
confidence_level_percent | 99
error_tolerance_percent | 5
used_sampling | f
confidence_interval_lower_bound_bytes | 0
confidence_interval_upper_bound_bytes | 0
sample_count | 0
cell_count | 0
-[ RECORD 2 ]-------------------------+------------------------------
size_bytes | 38051
user_id | 45035996273704962
user_name | release
object_id | 45035996273704974
object_type | DATABASE
object_schema |
object_name | dbs
audit_start_timestamp | 2014-02-05 13:44:41.11926-05
audit_end_timestamp | 2014-02-05 13:44:41.227035-05
confidence_level_percent | 99
error_tolerance_percent | 5
used_sampling | f
confidence_interval_lower_bound_bytes | 38051
confidence_interval_upper_bound_bytes | 38051
sample_count | 0
cell_count | 0
-[ RECORD 3 ]-------------------------+------------------------------
...
9.3 - AUDIT_LICENSE_SIZE
Triggers an immediate audit of the database size to determine if it is in compliance with the raw data storage allowance included in your Vertica licenses.
If you use ORC or Parquet data stored in HDFS, results are only accurate if you run this function as a user who has access to all HDFS data. Either run the query with a principal that has read access to all such data, or use a Hadoop delegation token that grants this access. For more information about using delegation tokens, see Accessing kerberized HDFS data.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
AUDIT_LICENSE_SIZE()
Privileges
Superuser
Examples
=> SELECT audit_license_size();
audit_license_size
--------------------
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization : 0%
Audit Time : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.
License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
(1 row)
9.4 - AUDIT_LICENSE_TERM
Triggers an immediate audit to determine if the Vertica license has expired.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
AUDIT_LICENSE_TERM()
Privileges
Superuser
Examples
=> SELECT audit_license_term();
audit_license_term
--------------------
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization : 0%
Audit Time : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.
License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
(1 row)
9.5 - DISPLAY_LICENSE
Returns the terms of your Vertica license. The information this function displays is:
-
The start and end dates for which the license is valid (or "Perpetual" if the license has no expiration).
-
The number of days you are allowed to use Vertica after your license term expires (the grace period)
-
The amount of data your database can store, if your license includes a data allowance.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DISPLAY_LICENSE()
Privileges
None
Examples
=> SELECT DISPLAY_LICENSE();
DISPLAY_LICENSE
---------------------------------------------------
Vertica Systems, Inc.
2007-08-03
Perpetual
500GB
(1 row)
9.6 - GET_AUDIT_TIME
Reports the time when the automatic audit of database size occurs. Vertica performs this audit if your Vertica license includes a data size allowance. For details of this audit, see Managing licenses in the Administrator's Guide. To change the time the audit runs, use the SET_AUDIT_TIME function.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_AUDIT_TIME()
Privileges
None
Examples
=> SELECT get_audit_time();
get_audit_time
-----------------------------------------------------
The audit is scheduled to run at 11:59 PM each day.
(1 row)
9.7 - GET_COMPLIANCE_STATUS
Displays whether your database is in compliance with your Vertica license agreement. This information includes the results of Vertica's most recent audit of the database size (if your license has a data allowance as part of its terms), the license term (if your license has an end date), and the number of nodes (if your license has a node limit).
GET_COMPLIANCE_STATUS
measures data allowance by TBs (where a TB equals 10244 bytes).
The information displayed by GET_COMPLIANCE_STATUS
includes:
-
The estimated size of the database (see Auditing database size for an explanation of the size estimate).
-
The raw data size allowed by your Vertica license.
-
The percentage of your allowance that your database is currently using.
-
The number of nodes and license limit.
-
The date and time of the last audit.
-
Whether your database complies with the data allowance terms of your license agreement.
-
The end date of your license.
-
How many days remain until your license expires.
Note
If your license does not have a data allowance, end date, or node limit, some of the values might not appear in the output forGET_COMPLIANCE_STATUS
.
If the audit shows your license is not in compliance with your data allowance, you should either delete data to bring the size of the database under the licensed amount, or upgrade your license. If your license term has expired, you should contact Vertica immediately to renew your license. See Managing licenses for further details.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_COMPLIANCE_STATUS()
Privileges
None
Examples
=> SELECT GET_COMPLIANCE_STATUS();
get_compliance_status
--------------------
Raw Data Size: 0.00TB +/- 0.00TB
License Size : 10.00TB
Utilization : 0%
Audit Time : 2015-09-24 12:19:15.425486-04
Compliance Status : The database is in compliance with respect to raw data size.
License End Date: 2015-11-23 00:00:00 Days Remaining: 60.53
(1 row)
The following example shows output for a Vertica for SQL on Apache Hadoop cluster.
=> SELECT GET_COMPLIANCE_STATUS();
get_compliance_status
--------------------
Node count : 4
License Node limit : 5
No size-compliance concerns for an Unlimited license
No expiration date for a Perpetual license
(1 row)
9.8 - SET_AUDIT_TIME
Sets the time that Vertica performs automatic database size audit to determine if the size of the database is compliant with the raw data allowance in your Vertica license. Use this function if the audits are currently scheduled to occur during your database's peak activity time. This is normally not a concern, since the automatic audit has little impact on database performance.
Audits are scheduled by the preceding audit, so changing the audit time does not affect the next scheduled audit. For example, if your next audit is scheduled to take place at 11:59PM and you use SET_AUDIT_TIME to change the audit schedule 3AM, the previously scheduled 11:59PM audit still runs. As that audit finishes, it schedules the next audit to occur at 3AM.
Vertica always performs the next scheduled audit even where you have changed the audit time using SET_AUDIT_TIME and then triggered an automatic audit by issuing the statement, SELECT AUDIT_LICENSE_SIZE. Only after the next scheduled audit does Vertica begin auditing at the new time you set using SET_AUDIT_TIME. Thereafter, Vertica audits at the new time.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_AUDIT_TIME(time)
time
- A string containing the time in
'HH:MM AM/PM'
format (for example,'1:00 AM'
) when the audit should run daily.
Privileges
Superuser
Examples
=> SELECT SET_AUDIT_TIME('3:00 AM');
SET_AUDIT_TIME
-----------------------------------------------------------------------
The scheduled audit time will be set to 3:00 AM after the next audit.
(1 row)
10 - Notifier functions
This section contains functions for using and managing the notifier.
10.1 - GET_DATA_COLLECTOR_NOTIFY_POLICY
Lists any notification policies set on a Data collector component.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_DATA_COLLECTOR_NOTIFY_POLICY('component')
component
- Name of the Data Collector component to check for notification policies.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component; component | description ----------------+------------------------------- DepotEvictions | Files evicted from the Depot DepotFetches | Files fetched to the Depot DepotUploads | Files Uploaded from the Depot (3 rows)
Examples
=> SELECT GET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures');
GET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------------------------------------------
Notifiable; Notifier: vertica_stats; Channel: vertica_notifications
(1 row)
The following example shows the output from the function when there is no notification policy for the component:
=> SELECT GET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures');
GET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
Not notifiable;
(1 row)
See also
10.2 - NOTIFY
Sends a specified message to a NOTIFIER.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
NOTIFY ( 'message', 'notifier', 'target-topic' )
Parameters
message
- The message to send to the endpoint.
notifier
- The name of the NOTIFIER.
target-topic
- String that specifies one of the following based on the
notifier
type:-
Kafka: The name of an existing destination Kafka topic for the message.
Note
If the topic doesn't already exist, you can configure your Kafka broker to automatically create the specified topic. For more information, see the Kafka documentataion. -
Syslog: The
ProblemDescription
subject andchannel
value. -
SNS: The topic ARN.
-
Privileges
Superuser
Examples
Send a message to confirm that an ETL job is complete:
=> SELECT NOTIFY('ETL Done!', 'my_notifier', 'DB_activity_topic');
10.3 - SET_DATA_COLLECTOR_NOTIFY_POLICY
Creates/enables notification policies for a Data collector component. Notification policies automatically send messages to the specified NOTIFIER when certain events occur.
To view existing notification policies on a Data Collector component, see GET_DATA_COLLECTOR_NOTIFY_POLICY.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_DATA_COLLECTOR_NOTIFY_POLICY('component','notifier', 'topic', enabled)
component
- Name of the component whose change will be reported via the notifier.
Query system table DATA_COLLECTOR for component names. For example:
=> SELECT DISTINCT component, description FROM data_collector WHERE component ilike '%Depot%' ORDER BY component; component | description ----------------+------------------------------- DepotEvictions | Files evicted from the Depot DepotFetches | Files fetched to the Depot DepotUploads | Files Uploaded from the Depot (3 rows)
notifier
- Name of the notifier that will send the message.
topic
- One of the following:
-
Kafka: The name of the Kafka topic that will receive the notification message.
Note
If the topic doesn't already exist, you can configure your Kafka broker to automatically create the specified topic. For more information, see the Kafka documentataion. -
Syslog: The subject of the field
ProblemDescription
. -
SNS: The topic ARN.
-
enabled
- Boolean value that specifies whether this policy is enabled. Set to TRUE to enable reporting component changes. Set to FALSE to disable the notifier.
Examples
SNS notifier
The following example creates an SNS topic, subscribes to it with an SQS queue, and then configures an SNS notifier for the DC component LoginFailures
:
-
Set SNSAuth with your AWS credentials:
=> ALTER DATABASE DEFAULT SET SNSAuth='VNDDNVOPIUQF917O5PDB:+mcnVONVIbjOnf1ekNis7nm3mE83u9fjdwmlq36Z';
-
Set SNSRegion:
=> ALTER DATABASE DEFAULT SET SNSRegion='us-east-1'
-
Enable HTTPS:
=> ALTER DATABASE DEFAULT SET SNSEnableHttps=1;
-
Create an SNS notifier:
=> CREATE NOTIFIER v_sns_notifier ACTION 'sns' MAXPAYLOAD '256K' MAXMEMORYSIZE '10M' CHECK COMMITTED;
-
Verify that the SNS notifier, SNS topic, and SQS queue are properly configured:
-
Manually send a message from the notifier to the SNS topic with NOTIFY:
=> SELECT NOTIFY('test message', 'v_sns_notifier', 'arn:aws:sns:us-east-1:123456789012:MyTopic')
-
Poll the SQS queue for your message.
-
-
Attach the SNS notifier to the
LoginFailures
component with SET_DATA_COLLECTOR_NOTIFY_POLICY:=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures', 'v_sns_notifier', 'Login failed!', true)
Kafka notifier
To be notified of failed login attempts, you can create a notifier that sends a notification when the DC component LoginFailures
updates. The TLSMODE
'verify-ca' verifies that the server's certificate is signed by a trusted CA.
=> CREATE NOTIFIER vertica_stats ACTION 'kafka://kafka01.example.com:9092' MAXMEMORYSIZE '10M' TLSMODE 'verify-ca';
CREATE NOTIFIER
=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','vertica_stats', 'vertica_notifications', true);
SET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
SET
(1 row)
The following example shows how to disable the policy created in the previous example:
=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','vertica_stats', 'vertica_notifications', false);
SET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
SET
(1 row)
=> SELECT GET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures');
GET_DATA_COLLECTOR_NOTIFY_POLICY
----------------------------------
Not notifiable;
(1 row)
Syslog notifier
The following example creates a notifier that writes a message to syslog when the Data collector (DC) component LoginFailures
updates:
-
Enable syslog notifiers for the current database:
=> ALTER DATABASE DEFAULT SET SyslogEnabled = 1;
-
Create and enable a syslog notifier
v_syslog_notifier
:=> CREATE NOTIFIER v_syslog_notifier ACTION 'syslog' ENABLE MAXMEMORYSIZE '10M' IDENTIFIED BY 'f8b0278a-3282-4e1a-9c86-e0f3f042a971' PARAMETERS 'eventSeverity = 5';
-
Configure the syslog notifier
v_syslog_notifier
for updates to theLoginFailures
DC component with SET_DATA_COLLECTOR_NOTIFY_POLICY:=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','v_syslog_notifier', 'Login failed!', true);
This notifier writes the following message to syslog (default location:
/var/log/messages
) when a user fails to authenticate as the userBob
:Apr 25 16:04:58 vertica_host_01 vertica: Event Posted: Event Code:21 Event Id:0 Event Severity: Notice [5] PostedTimestamp: 2022-04-25 16:04:58.083063 ExpirationTimestamp: 2022-04-25 16:04:58.083063 EventCodeDescription: Notifier ProblemDescription: (Login failed!) { "_db":"VMart", "_schema":"v_internal", "_table":"dc_login_failures", "_uuid":"f8b0278a-3282-4e1a-9c86-e0f3f042a971", "authentication_method":"Reject", "client_authentication_name":"default: Reject", "client_hostname":"::1", "client_label":"", "client_os_user_name":"dbadmin", "client_pid":523418, "client_version":"", "database_name":"dbadmin", "effective_protocol":"3.8", "node_name":"v_vmart_node0001", "reason":"REJECT", "requested_protocol":"3.8", "ssl_client_fingerprint":"", "ssl_client_subject":"", "time":"2022-04-25 16:04:58.082568-05", "user_name":"Bob" }#012 DatabaseName: VMart Hostname: vertica_host_01
See also
11 - Partition functions
This section contains partition management functions specific to Vertica.
11.1 - CALENDAR_HIERARCHY_DAY
Specifies to group DATE
partition keys into a hierarchy of years, months, and days. The Vertica Tuple Mover regularly evaluates partition keys against the current date, and merges partitions as needed into the appropriate year and month partition groups.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CALENDAR_HIERARCHY_DAY( partition-expression[, active-months[, active-years] ] )
Parameters
partition-expression
- The DATE expression on which to group partition keys, which must be identical to the table's
PARTITION BY
expression. active-months
- An integer ≥ 0 that specifies how many months preceding
MONTH(CURRENT_DATE)
to store unique partition keys in separate partitions.If you specify 1, only partition keys of the current month are stored in separate partitions.
If you specify 0, all partition keys of the current month are merged into a partition group for that month.
For details, see Hierarchical partitioning.
Default: 2
active-years
- An integer ≥ 0, specifies how many years preceding
YEAR(CURRENT_DATE)
to partition group keys by month in separate partitions.If you specify 1, only partition keys of the current year are stored in month partition groups.
If you specify 0, all partition keys of the current and previous years are merged into year partition groups.
For details, see Hierarchical partitioning.
Default: 2
Important
The CALENDAR_HIERARCHY_DAY algorithm assumes that most table activity is focused on recent dates. Setting active-years
and active-months
to a low number ≥ 2 serves to isolate most merge activity to date-specific containers, and incurs minimal overhead. Vertica recommends that you use the default setting of 2 for active-years
and active-months
. For most users, these settings achieve an optimal balance between ROS storage and performance.
As a best practice, never set active-years
and active-months
to 0.
Usage
Specify this function in a table partition clause, as its GROUP BY
expression:
PARTITION BY partition-expression
GROUP BY CALENDAR_HIERARCHY_DAY(
group-expression
[, active-months[, active-years] ] )
For example:
=> CREATE TABLE public.store_orders
(
order_no int,
order_date timestamp NOT NULL,
shipper varchar(20),
ship_date date
);
...
=> ALTER TABLE public.store_orders
PARTITION BY order_date::DATE
GROUP BY CALENDAR_HIERARCHY_DAY(order_date::DATE, 3, 2) REORGANIZE;
Examples
11.2 - COPY_PARTITIONS_TO_TABLE
Copies partitions from one table to another. This lightweight partition copy increases performance by initially sharing the same storage between two tables. After the copy operation is complete, the tables are independent of each other. Users can perform operations on one table without impacting the other. These operations can increase the overall storage required for both tables.
Note
Although they share storage space, Vertica considers the partitions as discrete objects for license capacity purposes. For example, copying a one TB partition would only consume one TB of space. Your Vertica license, however, considers them as separate objects consuming two TB of space.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
COPY_PARTITIONS_TO_TABLE (
'[[database.]schema.]source-table',
'min-range-value',
'max-range-value',
'[[database.]schema.]target-table'
[, 'force-split']
)
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.source-table
- The source table of the partitions to copy.
min-range-value
,max-range-value
- The minimum and maximum value of partition keys to copy, where
min‑range‑value
must be ≤max‑range‑value
. To specify a single partition key,min‑range‑value
andmax‑range‑value
must be equal. target-table
- The target table of the partitions to copy. If the table does not exist, Vertica creates a table from the source table's definition, by calling
CREATE TABLE
withLIKE
andINCLUDING PROJECTIONS
clause. The new table inherits ownership from the source table. For details, see Replicating a table. force-split
Optional Boolean argument, specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:
-
true
: Split ROS containers as needed. -
false
(default): Return with an error if ROS containers must be split to implement this operation.
-
Privileges
Non-superuser, one of the following:
-
Owner of source and target tables
-
TRUNCATE (if force-split is true) and SELECT on the source table, INSERT on the target table
If the target table does not exist, you must also have CREATE privileges on the target schema to enable table creation.
Table attribute requirements
The following attributes of both tables must be identical:
-
Column definitions, including NULL/NOT NULL constraints
-
Segmentation
-
Number of projections
-
Projection sort order
-
Primary and unique key constraints. However, the key constraints do not have to be identically enabled. For more information on constraints, see Constraints.
Note
If the target table has primary or unique key constraints enabled and copying or moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation. -
Check constraints. For MOVE_PARTITIONS_TO_TABLE and COPY_PARTITIONS_TO_TABLE, Vertica enforces enabled check constraints on the target table only. For SWAP_PARTITIONS_BETWEEN_TABLES, Vertica enforces enabled check constraints on both tables. If there is a violation of an enabled check constraint, Vertica rolls back the operation.
-
Number and definitions of text indices.
Additionally, If access policies exist on the source table, the following must be true:
-
Access policies on both tables must be identical.
-
One of the following must be true:
-
The executing user owns the source table.
-
AccessPolicyManagementSuperuserOnly
is set to true. See Managing access policies for details.
-
Table restrictions
The following restrictions apply to the source and target tables:
-
If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
-
The target table cannot be immutable.
-
The following tables cannot be used as sources or targets:
-
Temporary tables
-
Virtual tables
-
System tables
-
External tables
-
Examples
If you call COPY_PARTITIONS_TO_TABLE
and the target table does not exist, the function creates the table automatically. In the following example, the target table partn_backup.tradfes_200801
does not exist. COPY_PARTITIONS_TO_TABLE
creates the table and replicates the partition. Vertica also copies all the constraints associated with the source table except foreign key constraints.
=> SELECT COPY_PARTITIONS_TO_TABLE (
'prod_trades',
'200801',
'200801',
'partn_backup.trades_200801');
COPY_PARTITIONS_TO_TABLE
-------------------------------------------------
1 distinct partition values copied at epoch 15.
(1 row)
See also
Archiving partitions11.3 - DROP_PARTITIONS
Note
This function supersedes meta-function DROP_PARTITION, which was deprecated in Vertica 9.0.Drops the specified table partition keys.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DROP_PARTITIONS (
'[[database.]schema.]table-name',
'min-range-value',
'max-range-value'
[, 'force-split']
)
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table-name
- The target table. The table cannot be used as a dimension table in a pre-join projection and cannot have out-of-date (unrefreshed) projections.
min-range-value
,max-range-value
- The minimum and maximum value of partition keys to drop, where
min‑range‑value
must be ≤max‑range‑value
. To specify a single partition key,min‑range‑value
andmax‑range‑value
must be equal. force-split
Optional Boolean argument, specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:
-
true
: Split ROS containers as needed. -
false
(default): Return with an error if ROS containers must be split to implement this operation.
Note
In rare cases, DROP_PARTITIONS executes at the same time as a mergeout operation on the same ROS container. As a result, the function cannot split the container as specified and returns with an error. When this happens, call DROP_PARTITIONS again.-
Privileges
One of the following:
-
DBADMIN
-
Table owner
-
USAGE privileges on the table schema and TRUNCATE privileges on the table
Examples
See Dropping partitions.
See also
PARTITION_TABLE11.4 - DUMP_PROJECTION_PARTITION_KEYS
Dumps the partition keys of the specified projection.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DUMP_PROJECTION_PARTITION_KEYS( '[[database.]schema.]projection-name')
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.projection-name
- Projection name
Privileges
Non-superuser: TRUNCATE on anchor table
Examples
The following statements create the table and projection online_sales.online_sales_fact
and online_sales.online_sales_fact_rep
, respectively, and partitions table data by the column call_center_key
:
=> CREATE TABLE online_sales.online_sales_fact
(
sale_date_key int NOT NULL,
ship_date_key int NOT NULL,
product_key int NOT NULL,
product_version int NOT NULL,
customer_key int NOT NULL,
call_center_key int NOT NULL,
online_page_key int NOT NULL,
shipping_key int NOT NULL,
warehouse_key int NOT NULL,
promotion_key int NOT NULL,
pos_transaction_number int NOT NULL,
sales_quantity int,
sales_dollar_amount float,
ship_dollar_amount float,
net_dollar_amount float,
cost_dollar_amount float,
gross_profit_dollar_amount float,
transaction_type varchar(16)
)
PARTITION BY (online_sales_fact.call_center_key);
=> CREATE PROJECTION online_sales.online_sales_fact_rep AS SELECT * from online_sales.online_sales_fact unsegmented all nodes;
The following DUMP_PROJECTION_PARTITION_KEYS statement dumps the partition key from the projection online_sales.online_sales_fact_rep
:
=> SELECT DUMP_PROJECTION_PARTITION_KEYS('online_sales.online_sales_fact_rep');
Partition keys on node v_vmart_node0001
Projection 'online_sales_fact_rep'
Storage [ROS container]
No of partition keys: 1
Partition keys: 200
Storage [ROS container]
No of partition keys: 1
Partition keys: 199
...
Storage [ROS container]
No of partition keys: 1
Partition keys: 2
Storage [ROS container]
No of partition keys: 1
Partition keys: 1
Partition keys on node v_vmart_node0002
Projection 'online_sales_fact_rep'
Storage [ROS container]
No of partition keys: 1
Partition keys: 200
Storage [ROS container]
No of partition keys: 1
Partition keys: 199
...
(1 row)
See also
11.5 - DUMP_TABLE_PARTITION_KEYS
Dumps the partition keys of all projections for the specified table.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DUMP_TABLE_PARTITION_KEYS ( '[[database.]schema.]table-name' )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table-name
- Name of the table
Privileges
Non-superuser: TRUNCATE on table
Examples
The following example creates a simple table called states
and partitions the data by state:
=> CREATE TABLE states (year INTEGER NOT NULL,
state VARCHAR NOT NULL)
PARTITION BY state;
=> CREATE PROJECTION states_p (state, year) AS
SELECT * FROM states
ORDER BY state, year UNSEGMENTED ALL NODES;
Now dump the partition keys of all projections anchored on table states
:
=> SELECT DUMP_TABLE_PARTITION_KEYS( 'states' );
DUMP_TABLE_PARTITION_KEYS --------------------------------------------------------------------------------------------
Partition keys on node v_vmart_node0001
Projection 'states_p'
Storage [ROS container]
No of partition keys: 1
Partition keys: VT
Storage [ROS container]
No of partition keys: 1
Partition keys: PA
Storage [ROS container]
No of partition keys: 1
Partition keys: NY
Storage [ROS container]
No of partition keys: 1
Partition keys: MA
Partition keys on node v_vmart_node0002
...
(1 row)
See also
11.6 - MOVE_PARTITIONS_TO_TABLE
Moves partitions from one table to another.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
MOVE_PARTITIONS_TO_TABLE (
'[[database.]schema.]source-table',
'min-range-value',
'max-range-value',
'[[database.]schema.]target-table'
[, force-split]
)
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.source-table
- The source table of the partitions to move.
min-range-value
,max-range-value
- The minimum and maximum value of partition keys to move, where
min‑range‑value
must be ≤max‑range‑value
. To specify a single partition key,min‑range‑value
andmax‑range‑value
must be equal. target-table
- The target table of the partitions to move. If the table does not exist, Vertica creates a table from the source table's definition, by calling
CREATE TABLE
withLIKE
andINCLUDING PROJECTIONS
clause. The new table inherits ownership from the source table. For details, see Replicating a table. force-split
Optional Boolean argument, specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:
-
true
: Split ROS containers as needed. -
false
(default): Return with an error if ROS containers must be split to implement this operation.
-
Privileges
Non-superuser, one of the following:
-
Owner of source and target tables
-
SELECT, TRUNCATE on the source table, INSERT on the target table
If the target table does not exist, you must also have CREATE privileges on the target schema to enable table creation.
Table attribute requirements
The following attributes of both tables must be identical:
-
Column definitions, including NULL/NOT NULL constraints
-
Segmentation
-
Number of projections
-
Projection sort order
-
Primary and unique key constraints. However, the key constraints do not have to be identically enabled. For more information on constraints, see Constraints.
Note
If the target table has primary or unique key constraints enabled and copying or moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation. -
Check constraints. For MOVE_PARTITIONS_TO_TABLE and COPY_PARTITIONS_TO_TABLE, Vertica enforces enabled check constraints on the target table only. For SWAP_PARTITIONS_BETWEEN_TABLES, Vertica enforces enabled check constraints on both tables. If there is a violation of an enabled check constraint, Vertica rolls back the operation.
-
Number and definitions of text indices.
Additionally, If access policies exist on the source table, the following must be true:
-
Access policies on both tables must be identical.
-
One of the following must be true:
-
The executing user owns the source table.
-
AccessPolicyManagementSuperuserOnly
is set to true. See Managing access policies for details.
-
Table restrictions
The following restrictions apply to the source and target tables:
-
If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
-
The target table cannot be immutable.
-
The following tables cannot be used as sources or targets:
-
Temporary tables
-
Virtual tables
-
System tables
-
External tables
-
Examples
See Archiving partitions.
See also
11.7 - PARTITION_PROJECTION
Splits ROS containers for a specified projection. PARTITION_PROJECTION
also purges data while partitioning ROS containers if deletes were applied before the AHM epoch.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
PARTITION_PROJECTION ( '[[database.]schema.]projection')
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.projection
``- The projection to partition.
Privileges
-
Table owner
-
USAGE privilege on schema
Examples
In this example, PARTITION_PROJECTION
forces a split of ROS containers on the states_p
projection:
=> SELECT PARTITION_PROJECTION ('states_p');
PARTITION_PROJECTION
------------------------
Projection partitioned
(1 row)
See also
11.8 - PARTITION_TABLE
Invokes the Tuple Mover to reorganize ROS storage containers as needed to conform with the current partitioning policy.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
PARTITION_TABLE ( '[schema.]table-name')
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table-name
- The table to partition.
Privileges
-
Table owner
-
USAGE privilege on schema
Restrictions
-
You cannot run
PARTITION_TABLE
on a table that is an anchor table for a live aggregate projection or a Top-K projection. -
To reorganize storage to conform to a new policy, run
PARTITION_TABLE
after changing the partition GROUP BY expression.
See also
11.9 - PURGE_PARTITION
Purges a table partition of deleted rows. Similar to PURGE
and PURGE_PROJECTION
, this function removes deleted data from physical storage so you can reuse the disk space. PURGE_PARTITION
removes data only from the AHM epoch and earlier.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
PURGE_PARTITION ( '[[database.]schema.]table', partition-key )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
- The partitioned table to purge.
partition-key
- The key of the partition to purge.
Privileges
-
Table owner
-
USAGE privilege on schema
Examples
The following example lists the count of deleted rows for each partition in a table, then calls PURGE_PARTITION()
to purge the deleted rows from the data.
=> SELECT partition_key,table_schema,projection_name,sum(deleted_row_count)
AS deleted_row_count FROM partitions
GROUP BY partition_key,table_schema,projection_name
ORDER BY partition_key;
partition_key | table_schema | projection_name | deleted_row_count
---------------+--------------+-----------------+-------------------
0 | public | t_super | 2
1 | public | t_super | 2
2 | public | t_super | 2
3 | public | t_super | 2
4 | public | t_super | 2
5 | public | t_super | 2
6 | public | t_super | 2
7 | public | t_super | 2
8 | public | t_super | 2
9 | public | t_super | 1
(10 rows)
=> SELECT PURGE_PARTITION('t',5); -- Purge partition with key 5.
purge_partition
------------------------------------------------------------------------
Task: merge partitions
(Table: public.t) (Projection: public.t_super)
(1 row)
=> SELECT partition_key,table_schema,projection_name,sum(deleted_row_count)
AS deleted_row_count FROM partitions
GROUP BY partition_key,table_schema,projection_name
ORDER BY partition_key;
partition_key | table_schema | projection_name | deleted_row_count
---------------+--------------+-----------------+-------------------
0 | public | t_super | 2
1 | public | t_super | 2
2 | public | t_super | 2
3 | public | t_super | 2
4 | public | t_super | 2
5 | public | t_super | 0
6 | public | t_super | 2
7 | public | t_super | 2
8 | public | t_super | 2
9 | public | t_super | 1
(10 rows)
See also
11.10 - SWAP_PARTITIONS_BETWEEN_TABLES
Swaps partitions between two tables.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SWAP_PARTITIONS_BETWEEN_TABLES (
'[[database.]schema.]staging-table',
'min-range-value',
'max-range-value',
'[[database.]schema.]target-table'
[, force-split]
)
Arguments
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.staging-table
- The staging table from which to swap partitions.
min-range-value
,max-range-value
- The minimum and maximum value of partition keys to swap, where
min‑range‑value
must be ≤max‑range‑value
. To specify a single partition key,min‑range‑value
andmax‑range‑value
must be equal. target-table
- The table to which the partitions are to be swapped. The target table cannot be the same as the staging table.
force-split
Optional Boolean argument, specifies whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:
-
true
: Split ROS containers as needed. -
false
(default): Return with an error if ROS containers must be split to implement this operation.
-
Privileges
Non-superuser, one of the following:
-
Owner of source and target tables
-
Target and source tables: TRUNCATE, INSERT, SELECT
Requirements
The following attributes of both tables must be identical:
-
Column definitions, including NULL/NOT NULL constraints
-
Segmentation
-
Number of projections
-
Projection sort order
-
Primary and unique key constraints. However, the key constraints do not have to be identically enabled. For more information on constraints, see Constraints.
Note
If the target table has primary or unique key constraints enabled and copying or moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation. -
Check constraints. For MOVE_PARTITIONS_TO_TABLE and COPY_PARTITIONS_TO_TABLE, Vertica enforces enabled check constraints on the target table only. For SWAP_PARTITIONS_BETWEEN_TABLES, Vertica enforces enabled check constraints on both tables. If there is a violation of an enabled check constraint, Vertica rolls back the operation.
-
Number and definitions of text indices.
Additionally, If access policies exist on the source table, the following must be true:
-
Access policies on both tables must be identical.
-
One of the following must be true:
-
The executing user owns the target table.
-
AccessPolicyManagementSuperuserOnly
is set to true.
-
Restrictions
The following restrictions apply to the source and target tables:
-
If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
-
The target table cannot be immutable.
-
The following tables cannot be used as sources or targets:
-
Temporary tables
-
Virtual tables
-
System tables
-
External tables
-
Examples
See Swapping partitions.
12 - Privileges and access functions
This section contains functions for managing user and role privileges, and access policies.
12.1 - ENABLED_ROLE
Checks whether a Vertica user role is enabled, and returns true or false. This function is typically used when you create access policies on database roles.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ENABLED_ROLE ( 'role' )
Parameters
role
- The role to evaluate.
Privileges
None
Examples
See:
See also
CREATE ACCESS POLICY12.2 - GET_PRIVILEGES_DESCRIPTION
Returns the effective privileges the current user has on an object, including explicit, implicit, inherited, and role-based privileges.
Because this meta-function only returns effective privileges, GET_PRIVILEGES_DESCRIPTION only returns privileges with fully-satisfied prerequisites. For a list of prerequisites for common operations, see Privileges required for common database operations.
For example, a user must have the following privileges to query a table:
-
Schema: USAGE
-
Table: SELECT
If user Brooke has SELECT privileges on table s1.t1
but lacks USAGE privileges on schema s1
, Brooke cannot query the table, and GET_PRIVILEGES_DESCRIPTION does not return SELECT as a privilege for the table.
Note
Inherited privileges are not displayed if privilege inheritance is disabled at the database level with DisableInheritedPrivileges.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_PRIVILEGES_DESCRIPTION( 'type', '[[database.]schema.]name' );
Parameters
type
- Specifies an object type, one of the following:
-
database
-
table
-
schema
-
view
-
sequence
-
model
-
library
-
resource pool
-
[
database
.]
schema
- Specifies a database and schema, by default the current database and
public
, respectively. name
- Name of the target object
Privileges
None
Examples
In the following example, user Glenn has set the REPORTER role and wants to check his effective privileges on schema s1
and table s1.articles
.
-
Table
s1.articles
inherits privileges from its schema (s1
). -
The REPORTER role has the following privileges:
-
SELECT on schema
s1
-
INSERT WITH GRANT OPTION on table
s1.articles
-
-
User Glenn has the following privileges:
-
UPDATE and USAGE on schema
s1
. -
DELETE on table
s1.articles
.
-
GET_PRIVILEGES_DESCRIPTION returns the following effective privileges for Glenn on schema s1
:
=> SELECT GET_PRIVILEGES_DESCRIPTION('schema', 's1');
GET_PRIVILEGES_DESCRIPTION
--------------------------------
SELECT, UPDATE, USAGE
(1 row)
GET_PRIVILEGES_DESCRIPTION returns the following effective privileges for Glenn on table s1.articles
:
=> SELECT GET_PRIVILEGES_DESCRIPTION('table', 's1.articles');
GET_PRIVILEGES_DESCRIPTION
--------------------------------
INSERT*, SELECT, UPDATE, DELETE
(1 row)
See also
12.3 - HAS_ROLE
Checks whether a Vertica user role is granted to the specified user or role, and returns true or false.
You can also query system tables ROLES, GRANTS, and USERS to obtain information on users and their role assignments. For details, see Viewing user roles.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
StableSyntax
HAS_ROLE( [ 'grantee' ,] 'verify-role' );
Parameters
grantee
- Valid only for superusers, specifies the name of a user or role to look up. If this argument is omitted, the function uses the current user name (
CURRENT_USER
). If you specify a role, Vertica checks whether this role is granted to the role specified inverify-role
.Important
If a non-superuser supplies this argument, Vertica returns an error. verify-role
- Name of the role to verify for
grantee
.
Privileges
None
Examples
In the following example, a dbadmin
user checks whether user MikeL
is assigned the admnistrator
role:
=> \c
You are now connected as user "dbadmin".
=> SELECT HAS_ROLE('MikeL', 'administrator');
HAS_ROLE
----------
t
(1 row)
User MikeL
checks whether he has the regional_manager
role:
=> \c - MikeL
You are now connected as user "MikeL".
=> SELECT HAS_ROLE('regional_manager');
HAS_ROLE
----------
f
(1 row)
The dbadmin grants the regional_manager
role to the administrator
role. On checking again, MikeL
verifies that he now has the regional_manager
role:
dbadmin=> \c
You are now connected as user "dbadmin".
dbadmin=> GRANT regional_manager to administrator;
GRANT ROLE
dbadmin=> \c - MikeL
You are now connected as user "MikeL".
dbadmin=> SELECT HAS_ROLE('regional_manager');
HAS_ROLE
----------
t
(1 row)
See also
12.4 - RELEASE_SYSTEM_TABLES_ACCESS
Allows non-superusers to access all non-SUPERUSER_ONLY system tables. After you call this function, Vertica ignores the IS_ACCESSIBLE_DURING_LOCKDOWN setting in table SYSTEM_TABLES. To restrict non-superusers access to system tables, call RESTRICT_SYSTEM_TABLES_ACCESS.
By default, the database behaves as though RELEASE_SYSTEM_TABLES_ACCESS() was called. That is, non-superusers have access to all non-SUPERUSER_ONLY system tables.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RELEASE_SYSTEM_TABLES_ACCESS()
Privileges
Superuser
Examples
By default, non-superuser Alice has access to client_auth
and disk_storage
. She also has access to replication_status
because she was granted the privilege by the dbadmin:
=> SELECT table_name, is_superuser_only, is_accessible_during_lockdown FROM system_tables WHERE table_name='disk_storage' OR table_name='database_backups' OR table_name='replication_status' OR table_name='client_auth';
table_name | is_superuser_only | is_accessible_during_lockdown
--------------------+-------------------+-------------------------------
client_auth | f | t
disk_storage | f | f
database_backups | t | f
replication_status | t | t
(4 rows)
The dbadmin calls RESTRICT_SYSTEM_TABLES_ACCESS:
=> SELECT RESTRICT_SYSTEM_TABLES_ACCESS();
RESTRICT_SYSTEM_TABLES_ACCESS
----------------------------------------------------------------------------
Dropped grants to public on non-accessible during lockdown system tables.
(1 row)
Alice loses access to disk_storage
, but she retains access to client_auth
and replication_status
because their IS_ACCESSIBLE_DURING_LOCKDOWN fields are true:
=> SELECT storage_status FROM disk_storage;
ERROR 4367: Permission denied for relation disk_storage
The dbadmin calls RELEASE_SYSTEM_TABLES_ACCESS(), restoring Alice's access to disk_storage
:
=> SELECT RELEASE_SYSTEM_TABLES_ACCESS();
RELEASE_SYSTEM_TABLES_ACCESS
--------------------------------------------------------
Granted SELECT privileges on system tables to public.
(1 row)
12.5 - RESTRICT_SYSTEM_TABLES_ACCESS
Prevents non-superusers from accessing tables that have the IS_ACCESSIBLE_DURING_LOCKDOWN flag set to false.
To enable non-superuser access to system tables restricted by this function, call RELEASE_SYSTEM_TABLES_ACCESS.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RESTRICT_SYSTEM_TABLES_ACCESS()
Privileges
Superuser
Examples
By default, client_auth
and disk_storage
tables are accessible to all users, but only the former is accessible after RESTRICT_SYSTEM_TABLES_ACCESS() is called. Non-superusers never have access to database_backups
and replication_status
unless explicitly granted the privilege by the dbadmin:
=> SELECT table_name, is_superuser_only, is_accessible_during_lockdown FROM system_tables WHERE table_name='disk_storage' OR table_name='database_backups' OR table_name='replication_status' OR table_name='client_auth';
table_name | is_superuser_only | is_accessible_during_lockdown
--------------------+-------------------+-------------------------------
client_auth | f | t
disk_storage | f | f
database_backups | t | f
replication_status | t | t
(4 rows)
The dbadmin then calls RESTRICT_SYSTEM_TABLES_ACCESS():
=> SELECT RESTRICT_SYSTEM_TABLES_ACCESS();
RESTRICT_SYSTEM_TABLES_ACCESS
----------------------------------------------------------------------------
Dropped grants to public on non-accessible during lockdown system tables.
(1 row)
Bob loses access to disk_storage
, but retains access to client_auth
because its IS_ACCESSIBLE_DURING_LOCKDOWN field is true:
=> SELECT storage_status FROM disk_storage;
ERROR 4367: Permission denied for relation disk_storage
=> SELECT auth_oid FROM client_auth;
auth_oid
-------------------
45035996273705106
45035996273705110
45035996273705114
(3 rows)
13 - Projection functions
This section contains projection management functions specific to Vertica.
See also
13.1 - CLEAR_PROJECTION_REFRESHES
Clears projection refresh history from the PROJECTION_REFRESHES system table. PROJECTION_REFRESHES records information about successful and unsuccessful refresh operations.
CLEAR_PROJECTION_REFRESHES removes information only for refresh operations that are complete, as indicated by the IS_EXECUTING column in PROJECTION_REFRESHES.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_PROJECTION_REFRESHES()
Privileges
Superuser
Examples
=> SELECT CLEAR_PROJECTION_REFRESHES();
CLEAR_PROJECTION_REFRESHES
----------------------------
CLEAR
(1 row)
See also
13.2 - EVALUATE_DELETE_PERFORMANCE
Evaluates projections for potential DELETE and UPDATE performance issues. If Vertica finds any issues, it issues a warning message. When evaluating multiple projections, EVALUATE_DELETE_PERFORMANCE returns up to ten projections with issues, and the name of a table that lists all issues that it found.
Note
EVALUATE_DELETE_PERFORMANCE returns messages that specifically reference delete performance. Keep in mind, however, that delete and update operations benefit equally from the same optimizations.For information on resolving delete and update performance issues, see Optimizing DELETE and UPDATE.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
EVALUATE_DELETE_PERFORMANCE ( ['[[database.]schema.]scope'] )
Parameters
-
`[database.]schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.scope
- Specifies the projections to evaluate, one of the following:
-
[
table
.]
projection
Evaluateprojection
. For example:SELECT EVALUATE_DELETE_PERFORMANCE('store.store_orders_fact.store_orders_fact_b1');
-
table
Specifies to evaluate all projections oftable
. For example:SELECT EVALUATE_DELETE_PERFORMANCE('store.store_orders_fact');
If you supply no arguments, EVALUATE_DELETE_PERFORMANCE evaluates all projections that you can access. Depending on the size of your database, this can incur considerable overhead.
-
Privileges
Non-superuser: SELECT privilege on the anchor table
Examples
EVALUATE_DELETE_PERFORMANCE evaluates all projections of table exampl
e for potential DELETE and UPDATE performance issues.
=> create table example (A int, B int,C int);
CREATE TABLE
=> create projection one_sort (A,B,C) as (select A,B,C from example) order by A;
CREATE PROJECTION
=> create projection two_sort (A,B,C) as (select A,B,C from example) order by A,B;
CREATE PROJECTION
=> select evaluate_delete_performance('example');
evaluate_delete_performance
---------------------------------------------------
No projection delete performance concerns found.
(1 row)
The previous example show that the two projections one_sort and two_sort have no inherent structural issues that might cause poor DELETE performance. However, the data contained within the projection can create potential delete issues if the sorted columns do not uniquely identify a row or small number of rows.
In the following example, Perl is used to populate the table with data using a nested series of loops:
-
The inner loop populates column
C.
-
The middle loop populates column
B
. -
The outer loop populates column
A
.
The result is column A
contains only three distinct values (0, 1, and 2), while column B
slowly varies between 20 and 0 and column C
changes in each row:
=> \! perl -e 'for ($i=0; $i<3; $i++) { for ($j=0; $j<21; $j++) { for ($k=0; $k<19; $k++) { printf "%d,%d,%d\n", $i,$j,$k;}}}' | /opt/vertica/bin/vsql -c "copy example from stdin delimiter ',' direct;"
Password:
=> select * from example;
A | B | C
---+----+----
0 | 20 | 18
0 | 20 | 17
0 | 20 | 16
0 | 20 | 15
0 | 20 | 14
0 | 20 | 13
0 | 20 | 12
0 | 20 | 11
0 | 20 | 10
0 | 20 | 9
0 | 20 | 8
0 | 20 | 7
0 | 20 | 6
0 | 20 | 5
0 | 20 | 4
0 | 20 | 3
0 | 20 | 2
0 | 20 | 1
0 | 20 | 0
0 | 19 | 18
...
2 | 1 | 0
2 | 0 | 18
2 | 0 | 17
2 | 0 | 16
2 | 0 | 15
2 | 0 | 14
2 | 0 | 13
2 | 0 | 12
2 | 0 | 11
2 | 0 | 10
2 | 0 | 9
2 | 0 | 8
2 | 0 | 7
2 | 0 | 6
2 | 0 | 5
2 | 0 | 4
2 | 0 | 3
2 | 0 | 2
2 | 0 | 1
2 | 0 | 0
=> SELECT COUNT (*) FROM example;
COUNT
-------
1197
(1 row)
=> SELECT COUNT (DISTINCT A) FROM example;
COUNT
-------
3
(1 row)
EVALUATE_DELETE_PERFORMANCE is run against the projections again to determine whether the data within the projections causes any potential DELETE performance issues. Projection one_sort
has potential delete issues as it only sorts on column A which has few distinct values. Each value in the sort column corresponds to many rows in the projection, which can adversely impact DELETE performance. In contrast, projection two_sort
is sorted on columns A
and B
, where each combination of values in the two sort columns identifies just a few rows, so deletes can be performed faster:
=> select evaluate_delete_performance('example');
evaluate_delete_performance
---------------------------------------------------
The following projections exhibit delete performance concerns:
"public"."one_sort_b1"
"public"."one_sort_b0"
See v_catalog.projection_delete_concerns for more details.
=> \x
Expanded display is on.
dbadmin=> select * from projection_delete_concerns;
-[ RECORD 1 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------
projection_id | 45035996273878562
projection_schema | public
projection_name | one_sort_b1
creation_time | 2019-06-17 13:59:03.777085-04
last_modified_time | 2019-06-17 14:00:27.702223-04
comment | The squared number of rows matching each sort key is about 159201 on average.
-[ RECORD 2 ]------+------------------------------------------------------------------------------------------------------------------------------------------------------------
projection_id | 45035996273878548
projection_schema | public
projection_name | one_sort_b0
creation_time | 2019-06-17 13:59:03.777279-04
last_modified_time | 2019-06-17 13:59:03.777279-04
comment | The squared number of rows matching each sort key is about 159201 on average.
If you omit supplying an argument to EVALUATE_DELETE_PERFORMANCE, it evaluates all projections that you can access:
=> select evaluate_delete_performance();
evaluate_delete_performance
---------------------------------------------------------------------------
The following projections exhibit delete performance concerns:
"public"."one_sort_b0"
"public"."one_sort_b1"
See v_catalog.projection_delete_concerns for more details.
(1 row)
13.3 - GET_PROJECTION_SORT_ORDER
Returns the order of columns in a projection's ORDER BY clause.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_PROJECTION_SORT_ORDER( '[[database.]schema.]projection' );
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.projection
- The target projection.
Privileges
Non-superuser: SELECT privilege on the anchor table
Examples
=> SELECT get_projection_sort_order ('store_orders_super');
get_projection_sort_order
--------------------------------------------------------------------------------------------
public.store_orders_super [Sort Cols: "order_no", "order_date", "shipper", "ship_date"]
(1 row)
13.4 - GET_PROJECTION_STATUS
Returns information relevant to the status of a projection:
-
The current K-safety status of the database
-
The number of nodes in the database
-
Whether the projection is segmented
-
The number and names of buddy projections
-
Whether the projection is safe
-
Whether the projection is up to date
-
Whether statistics have been computed for the projection
Use
GET_PROJECTION_STATUS
to monitor the progress of a projection data refresh.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_PROJECTION_STATUS ( '[[database.]schema.]projection' );
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.projection
- The projection for which to display status.
Examples
=> SELECT GET_PROJECTION_STATUS('public.customer_dimension_site01');
GET_PROJECTION_STATUS
-----------------------------------------------------------------------------------------------
Current system K is 1.
# of Nodes: 4.
public.customer_dimension_site01 [Segmented: No] [Seg Cols: ] [K: 3] [public.customer_dimension_site04, public.customer_dimension_site03,
public.customer_dimension_site02]
[Safe: Yes] [UptoDate: Yes][Stats: Yes]
13.5 - GET_PROJECTIONS
Returns contextual and projection information about projections of the specified anchor table.
- Contextual information
-
Database K-safety
-
Number of database nodes
-
Number of projections for this table
-
- Projection data
- For each projection, specifies:
-
All buddy projections
-
Whether it is segmented
-
Whether it is safe
-
Whether it is up-to-date.
-
You can also use GET_PROJECTIONS
to monitor the progress of a projection data refresh.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_PROJECTIONS ( '[[database.]schema-name.]table' )
Parameters
-
`[database.]schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
- Anchor table of the projections to list.
Privileges
None
Examples
The following example gets information about projections for VMart table store.store_dimension
:
=> SELECT GET_PROJECTIONS('store.store_dimension');
-[ RECORD 1 ]---+
GET_PROJECTIONS | Current system K is 1.
# of Nodes: 3.
Table store.store_dimension has 2 projections.
Projection Name: [Segmented] [Seg Cols] [# of Buddies] [Buddy Projections] [Safe] [UptoDate] [Stats]
----------------------------------------------------------------------------------------------------
store.store_dimension_b1 [Segmented: Yes] [Seg Cols: "store.store_dimension.store_key"] [K: 1] [store.store_dimension_b0] [Safe: Yes] [UptoDate: Yes] [Stats: RowCounts]
store.store_dimension_b0 [Segmented: Yes] [Seg Cols: "store.store_dimension.store_key"] [K: 1] [store.store_dimension_b1] [Safe: Yes] [UptoDate: Yes] [Stats: RowCounts]
13.6 - PURGE_PROJECTION
Permanently removes deleted data from physical storage so disk space can be reused. You can purge historical data up to and including the Ancient History Mark epoch.
Caution
PURGE_PROJECTION
can use significant disk space while purging the data.
See
PURGE
for details about purge operations.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
PURGE_PROJECTION ( '[[database.]schema.]projection' )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.projection
- The projection to purge.
Privileges
-
Table owner
-
USAGE privilege on schema
Examples
The following example purges all historical data in projection tbl_p
that precedes the Ancient History Mark epoch.
=> CREATE TABLE tbl (x int, y int);
CREATE TABLE
=> INSERT INTO tbl VALUES(1,2);
OUTPUT
--------
1
(1 row)
=> INSERT INTO tbl VALUES(3,4);
OUTPUT
--------
1
(1 row)
dbadmin=> COMMIT;
COMMIT
=> CREATE PROJECTION tbl_p AS SELECT x FROM tbl UNSEGMENTED ALL NODES;
WARNING 4468: Projection <public.tbl_p> is not available for query processing.
Execute the select start_refresh() function to copy data into this projection.
The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh
CREATE PROJECTION
=> SELECT START_REFRESH();
START_REFRESH
----------------------------------------
Starting refresh background process.
=> DELETE FROM tbl WHERE x=1;
OUTPUT
--------
1
(1 row)
=> COMMIT;
COMMIT
=> SELECT MAKE_AHM_NOW();
MAKE_AHM_NOW
-------------------------------
AHM set (New AHM Epoch: 9066)
(1 row)
=> SELECT PURGE_PROJECTION ('tbl_p');
PURGE_PROJECTION
-------------------
Projection purged
(1 row)
See also
13.7 - REFRESH
Synchronously refreshes one or more table projections in the foreground, and updates the PROJECTION_REFRESHES system table. If you run REFRESH with no arguments, it refreshes all projections that contain stale data.
To understand projection refreshing in detail, see Refreshing projections.
If a refresh would violate a table or schema disk quota, the operation fails. For more information, see Disk quotas.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
REFRESH ( [ '[[database.]schema.]table[,...]' ] )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
- The anchor table of the projections to refresh. If you specify multiple tables, REFRESH attempts to refresh them in parallel. Such calls are part of the Database Designer deployment (and deployment script).
Returns
Note
If REFRESH does not refresh any projections, it returns a header string with no results.Column | Returns |
---|---|
Projection Name |
The projection targeted for refresh. |
Anchor Table |
The projection's associated anchor table. |
Status |
Projections' refresh status:
|
Refresh Method |
Method used to refresh the projection. |
Error Count |
Number of times a refresh failed for the projection. |
Duration (sec) |
How long (in seconds) the projection refresh ran. |
Privileges
- Superuser
-
Owner of the specified tables
Refresh methods
Vertica can refresh a projection from one of its buddies, if one is available. In this case, the target projection gets the source buddy's historical data. Otherwise, the projection is refreshed from scratch with data of the latest epoch at the time of the refresh operation. In this case, the projection cannot participate in historical queries on any epoch that precedes the refresh operation.
Vertica can perform incremental refreshes when the following conditions are met:
-
The table being refreshed is partitioned.
-
The table does not contain any unpartitioned data.
-
The operation is a full projection refresh (not a partition range projection refresh).
In an incremental refresh, the refresh operation first loads data from the partition with the highest range of keys. After refreshing this partition, Vertica begins to refresh the partition with next highest partition range. This process continues until all projection partitions are refreshed. While the refresh operation is in progress, projection partitions that have completed the refresh process become available to process query requests.
The method used to refresh a given projection is recorded in the REFRESH_METHOD column of the PROJECTION_REFRESHES system table.
Examples
The following example refreshes the projections in two tables:
=> SELECT REFRESH('t1, t2');
REFRESH
----------------------------------------------------------------------------------------
Refresh completed with the following outcomes:
Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
----------------------------------------------------------------------------------------
"public"."t1_p": [t1] [refreshed] [scratch] [0] [0]"public"."t2_p": [t2] [refreshed] [scratch] [0] [0]
In the following example, only the projection on one table was refreshed:
=> SELECT REFRESH('allow, public.deny, t');
REFRESH
----------------------------------------------------------------------------------------
Refresh completed with the following outcomes:
Projection Name: [Anchor Table] [Status] [Refresh Method] [Error Count] [Duration (sec)]
----------------------------------------------------------------------------------------
"n/a"."n/a": [n/a] [failed: insufficient permissions on table "allow"] [] [1] [0]
"n/a"."n/a": [n/a] [failed: insufficient permissions on table "public.deny"] [] [1] [0]
"public"."t_p1": [t] [refreshed] [scratch] [0] [0]
See also
13.8 - REFRESH_COLUMNS
Refreshes table columns that are defined with the constraint SET USING or DEFAULT USING. All refresh operations associated with a call to REFRESH_COLUMNS belong to the same transaction. Thus, all tables and columns specified by REFRESH_COLUMNS must be refreshed; otherwise, the entire operation is rolled back.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
REFRESH_COLUMNS ( 'table-list', '[column-list]'
[, '[refresh-mode]' [, min-partition-key, max-partition-key [, force-split] ]
)
Arguments
table-list
- A comma-delimited list of the tables to refresh:
[[
database
.]
schema.
]
table
[,...]
If you specify multiple tables,
refresh-mode
must be set to REBUILD. column-list
- A comma-delimited list of columns to refresh:
[[[
database
.]
schema.
]
table
.]
column
[,...]
or[[
database
.]
schema.
]
table
.
*
, where asterisk (*
) means to refresh all SET USING/DEFAULT USING columns in the table. For example:SELECT REFRESH_COLUMNS ('t1, t2', 't1.*, t2.b', 'REBUILD');
If
column-list
is set to an empty string (''
), REFRESH_COLUMNS refreshes all SET USING/DEFAULT USING columns in the specified tables.The following requirements apply:
-
All specified columns must have a SET USING or DEFAULT USING constraint.
-
If REFRESH_COLUMNS specifies multiple tables, all column names must be qualified by their table names. If the target tables span multiple schemas, all column names must be fully qualified by their schema and table names. For example:
SELECT REFRESH_COLUMNS ('t1, t2', 't1.a, t2.b', 'REBUILD');
If you specify a database, it must be the current database.
-
refresh-mode
- Specifies how to refresh SET USING columns:
-
UPDATE
(default): Marks original rows as deleted and replaces them with new rows. In order to save these updates, you must issue a COMMIT statement. -
REBUILD
: Replaces all data in the specified columns. The rebuild operation is auto-committed.
If you specify multiple tables, you must explicitly specify REBUILD mode.
In both cases, REFRESH_COLUMNS returns an error if any SET USING column is defined as a primary or unique key in a table that enforces those constraints.
See REBUILD Mode Restrictions for limitations on using the REBUILD option.
-
min-partition-key
,max-partition-key
- Qualifies REBUILD mode, limiting the rebuild operation to one or more partitions. To specify a range of partitions,
max-partition-key
must be greater thanmin-partition-key
. To update one partition, the two arguments must be equal.The following requirements apply:
-
The function can specify only one table to refresh.
-
The table must be partitioned on the specified keys.
You can use these arguments to refresh columns with recently loaded data—that is, data in the latest partitions. Using this option regularly can significantly minimize the overhead otherwise incurred by rebuilding entire columns in a large table.
See Partition-based REBUILD below for details.
-
force-split
- Boolean, whether to split ROS containers if the range of partition keys spans multiple containers or part of a single container:
-
true
(default): Split ROS containers as needed. -
false
: Return with an error if ROS containers must be split to implement this operation.
-
Privileges
-
Schemas of queried and flattened tables: USAGE
-
Queried table: SELECT
-
Flattened table: SELECT, UPDATE
UPDATE versus REBUILD modes
In general, UPDATE mode is a better choice when changes to SET USING column data are confined to a relatively small number of rows. Use REBUILD mode when a significant amount of SET USING column data is stale and must be updated. It is generally good practice to call REFRESH_COLUMNS with REBUILD on any new SET USING column—for example, to populate a SET USING column after adding it with ALTER TABLE...ADD COLUMN.
REBUILD mode restrictions
If you call REFRESH_COLUMNS on a SET USING column and specify the refresh mode as REBUILD, Vertica returns an error if the column is specified in any of the following:
-
Table's partition key
-
Unsegmented projection
-
Projection with expressions, or any live aggregate projection that invokes a user-defined transform function (UDTF)
-
Sort order or segmentation of any projection
-
Any projection that omits an anchor table column that is referenced in the column's SET USING expression
-
GROUPED clause of any projection
Partition-based REBUILD operations
If a flattened table is partitioned, you can reduce the overhead of calling REFRESH_COLUMNS in REBUILD mode, by specifying one or more partition keys. Doing so limits the rebuild operation to the specified partitions. For example, table public.orderFact
is defined with SET USING column cust_name
. This table is partitioned on column order_date
, where the partition clause invokes Vertica function CALENDAR_HIERARCHY_DAY. Thus, you can call REFRESH_COLUMNS on specific time-delimited partitions of this table—in this case, on orders over the last two months:
=> SELECT REFRESH_COLUMNS ('public.orderFact',
'cust_name',
'REBUILD',
TO_CHAR(ADD_MONTHS(current_date, -2),'YYYY-MM')||'-01',
TO_CHAR(LAST_DAY(ADD_MONTHS(current_date, -1))));
REFRESH_COLUMNS
---------------------------
refresh_columns completed
(1 row)
Rewriting SET USING queries
When you call REFRESH_COLUMNS on a flattened table's SET USING (or DEFAULT USING) column, it executes the SET USING query by joining the target and source tables. By default, the source table is always the inner table of the join. In most cases, cardinality of the source table is less than the target table, so REFRESH_COLUMNS executes the join efficiently.
Occasionally—notably, when you call REFRESH_COLUMNS on a partitioned table—the source table can be larger than the target table. In this case, performance of the join operation can be suboptimal.
You can address this issue by enabling configuration parameter RewriteQueryForLargeDim. When enabled (1), Vertica rewrites the query, by reversing the inner and outer join between the target and source tables.
Important
Enable this parameter only if the SET USING source data is in a table that is larger than the target table. If the source data is in a table smaller than the target table, then enabling RewriteQueryForLargeDim can adversely affect refresh performance.Examples
13.9 - START_REFRESH
Refreshes projections in the current schema with the latest data of their respective anchor tables. START_REFRESH runs asynchronously in the background, and updates the PROJECTION_REFRESHES system table. This function has no effect if a refresh is already running.
To refresh only projections of a specific table, use REFRESH. When you deploy a design through Database Designer, it automatically refreshes its projections.
If a refresh would violate a table or schema disk quota, the operation fails. For more information, see Disk quotas.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
START_REFRESH()
Privileges
None
Requirements
All nodes must be up.
Refresh methods
Vertica can refresh a projection from one of its buddies, if one is available. In this case, the target projection gets the source buddy's historical data. Otherwise, the projection is refreshed from scratch with data of the latest epoch at the time of the refresh operation. In this case, the projection cannot participate in historical queries on any epoch that precedes the refresh operation.
Vertica can perform incremental refreshes when the following conditions are met:
-
The table being refreshed is partitioned.
-
The table does not contain any unpartitioned data.
-
The operation is a full projection refresh (not a partition range projection refresh).
In an incremental refresh, the refresh operation first loads data from the partition with the highest range of keys. After refreshing this partition, Vertica begins to refresh the partition with next highest partition range. This process continues until all projection partitions are refreshed. While the refresh operation is in progress, projection partitions that have completed the refresh process become available to process query requests.
The method used to refresh a given projection is recorded in the REFRESH_METHOD column of the PROJECTION_REFRESHES system table.
Examples
=> SELECT START_REFRESH();
START_REFRESH
----------------------------------------
Starting refresh background process.
(1 row)
See also
14 - Session functions
This section contains session management functions specific to Vertica.
See also the SQL system table V_MONITOR.SESSIONS.
14.1 - CANCEL_REFRESH
Cancels refresh-related internal operations initiated by START_REFRESH and REFRESH.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CANCEL_REFRESH()
Privileges
None
Notes
-
Refresh tasks run in a background thread in an internal session, so you cannot use INTERRUPT_STATEMENT to cancel those statements. Instead, use CANCEL_REFRESH to cancel statements that are run by refresh-related internal sessions.
-
Run CANCEL_REFRESH() on the same node on which START_REFRESH() was initiated.
-
CANCEL_REFRESH() cancels the refresh operation running on a node, waits for the cancelation to complete, and returns SUCCESS.
-
Only one set of refresh operations runs on a node at any time.
Examples
Cancel a refresh operation executing in the background.
=> SELECT START_REFRESH();
START_REFRESH
----------------------------------------
Starting refresh background process.
(1 row)
=> SELECT CANCEL_REFRESH();
CANCEL_REFRESH
----------------------------------------
Stopping background refresh process.
(1 row)
See also
14.2 - CLOSE_ALL_SESSIONS
Closes all external sessions except the one that issues this function. Call this function before shutting down the Vertica database.
Vertica closes sessions asynchronously, so another session can open before this function returns. In this case, reissue this function. To view the status of all open sessions, query system table
SESSIONS
.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLOSE_ALL_SESSIONS()
Privileges
Non-superuser: None to close your own session
Examples
Two user sessions are open on separate nodes:
=> SELECT * FROM sessions;
-[ RECORD 1 ]--------------+----------------------------------------------------
node_name | v_vmartdb_node0001
user_name | dbadmin
client_hostname | 127.0.0.1:52110
client_pid | 4554
login_timestamp | 2011-01-03 14:05:40.252625-05
session_id | stress04-4325:0x14
client_label |
transaction_start | 2011-01-03 14:05:44.325781
transaction_id | 45035996273728326
transaction_description | user dbadmin (select * from sessions;)
statement_start | 2011-01-03 15:36:13.896288
statement_id | 10
last_statement_duration_us | 14978
current_statement | select * from sessions;
ssl_state | None
authentication_method | Trust
-[ RECORD 2 ]--------------+----------------------------------------------------
node_name | v_vmartdb_node0002
user_name | dbadmin
client_hostname | 127.0.0.1:57174
client_pid | 30117
login_timestamp | 2011-01-03 15:33:00.842021-05
session_id | stress05-27944:0xc1a
client_label |
transaction_start | 2011-01-03 15:34:46.538102
transaction_id | -1
transaction_description | user dbadmin (COPY Mart_Fact FROM '/data/mart_Fact.tbl'
DELIMITER '|' NULL '\\n';)
statement_start | 2011-01-03 15:34:46.538862
statement_id |
last_statement_duration_us | 26250
current_statement | COPY Mart_Fact FROM '/data/Mart_Fact.tbl' DELIMITER '|'
NULL '\\n';
ssl_state | None
authentication_method | Trust
-[ RECORD 3 ]--------------+----------------------------------------------------
node_name | v_vmartdb_node0003
user_name | dbadmin
client_hostname | 127.0.0.1:56367
client_pid | 1191
login_timestamp | 2011-01-03 15:31:44.939302-05
session_id | stress06-25663:0xbec
client_label |
transaction_start | 2011-01-03 15:34:51.05939
transaction_id | 54043195528458775
transaction_description | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
DELIMITER '|' NULL '\\n' DIRECT;)
statement_start | 2011-01-03 15:35:46.436748
statement_id |
last_statement_duration_us | 1591403
current_statement | COPY Mart_Fact FROM '/data/Mart_Fact.tbl' DELIMITER '|'
NULL '\\n' DIRECT;
ssl_state | None
authentication_method | Trust
Close all sessions:
=> \x
Expanded display is off.
=> SELECT CLOSE_ALL_SESSIONS();
CLOSE_ALL_SESSIONS
-------------------------------------------------------------------------
Close all sessions command sent. Check v_monitor.sessions for progress.
(1 row)
Session contents after issuing CLOSE_ALL_SESSIONS
:
=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+----------------------------------------
node_name | v_vmartdb_node0001
user_name | dbadmin
client_hostname | 127.0.0.1:52110
client_pid | 4554
login_timestamp | 2011-01-03 14:05:40.252625-05
session_id | stress04-4325:0x14
client_label |
transaction_start | 2011-01-03 14:05:44.325781
transaction_id | 45035996273728326
transaction_description | user dbadmin (SELECT * FROM sessions;)
statement_start | 2011-01-03 16:19:56.720071
statement_id | 25
last_statement_duration_us | 15605
current_statement | SELECT * FROM SESSIONS;
ssl_state | None
authentication_method | Trust
See also
14.3 - CLOSE_SESSION
Interrupts the specified external session, rolls back the current transaction if any, and closes the socket. You can only close your own session.
It might take some time before a session is closed. To view the status of all open sessions, query the system table
SESSIONS
.
For detailed information about session management options, see Managing sessions.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLOSE_SESSION ( 'sessionid')
Parameters
sessionid
- A string that specifies the session to close. This identifier is unique within the cluster at any point in time but can be reused when the session closes.
Privileges
None
Examples
User session opened. Record 2 shows the user session running a COPY DIRECT
statement.
=> SELECT * FROM sessions;
-[ RECORD 1 ]--------------+-----------------------------------------------
node_name | v_vmartdb_node0001
user_name | dbadmin
client_hostname | 127.0.0.1:52110
client_pid | 4554
login_timestamp | 2011-01-03 14:05:40.252625-05
session_id | stress04-4325:0x14
client_label |
transaction_start | 2011-01-03 14:05:44.325781
transaction_id | 45035996273728326
transaction_description | user dbadmin (SELECT * FROM sessions;)
statement_start | 2011-01-03 15:36:13.896288
statement_id | 10
last_statement_duration_us | 14978
current_statement | select * from sessions;
ssl_state | None
authentication_method | Trust
-[ RECORD 2 ]--------------+-----------------------------------------------
node_name | v_vmartdb_node0002
user_name | dbadmin
client_hostname | 127.0.0.1:57174
client_pid | 30117
login_timestamp | 2011-01-03 15:33:00.842021-05
session_id | stress05-27944:0xc1a
client_label |
transaction_start | 2011-01-03 15:34:46.538102
transaction_id | -1
transaction_description | user dbadmin (COPY ClickStream_Fact FROM
'/data/clickstream/1g/ClickStream_Fact.tbl'
DELIMITER '|' NULL '\\n' DIRECT;)
statement_start | 2011-01-03 15:34:46.538862
statement_id |
last_statement_duration_us | 26250
current_statement | COPY ClickStream_Fact FROM '/data/clickstream
/1g/ClickStream_Fact.tbl' DELIMITER '|' NULL
'\\n' DIRECT;
ssl_state | None
authentication_method | Trust
Close user session stress05-27944:0xc1a
=> \x
Expanded display is off.
=> SELECT CLOSE_SESSION('stress05-27944:0xc1a');
CLOSE_SESSION
--------------------------------------------------------------------
Session close command sent. Check v_monitor.sessions for progress.
(1 row)
Query the sessions table again for current status, and you can see that the second session has been closed:
=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+--------------------------------------------
node_name | v_vmartdb_node0001
user_name | dbadmin
client_hostname | 127.0.0.1:52110
client_pid | 4554
login_timestamp | 2011-01-03 14:05:40.252625-05
session_id | stress04-4325:0x14
client_label |
transaction_start | 2011-01-03 14:05:44.325781
transaction_id | 45035996273728326
transaction_description | user dbadmin (select * from SESSIONS;)
statement_start | 2011-01-03 16:12:07.841298
statement_id | 20
last_statement_duration_us | 2099
current_statement | SELECT * FROM SESSIONS;
ssl_state | None
authentication_method | Trust
See also
14.4 - CLOSE_USER_SESSIONS
Stops the session for a user, rolls back any transaction currently running, and closes the connection. To determine the status of the sessions to close, query the
SESSIONS
table.
Note
Running this function on your own sessions leaves one session running.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLOSE_USER_SESSIONS ( 'user-name' )
Parameters
user-name
- Specifies the user whose sessions are to be closed. If you specify your own user name, Vertica closes all sessions except the one in which you issue this function.
Privileges
Examples
This example closes all active session for user u1
:
=> SELECT close_user_sessions('u1');
See also
14.5 - GET_NUM_ACCEPTED_ROWS
Returns the number of rows loaded into the database for the last completed load for the current session. GET_NUM_ACCEPTED_ROWS is a meta-function. Do not use it as a value in an INSERT query.
The number of accepted rows is not available for a load that is currently in process. Check the LOAD_STREAMS system table for its status.
This meta-function supports loads from STDIN, COPY LOCAL from a Vertica client, or a single file on the initiator. You cannot use GET_NUM_ACCEPTED_ROWS for multi-node loads.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_NUM_ACCEPTED_ROWS();
Privileges
None
Note
The data regarding accepted rows from the last load during the current session does not persist, and is lost when you initiate a new load.Examples
This examples shows the number of accepted rows from the vmart_load_data.sql meta-command.
=> \i vmart_load_data.sql;
=> SELECT GET_NUM_ACCEPTED_ROWS ();
GET_NUM_ACCEPTED_ROWS
-----------------------
300000
(1 row)
See also
14.6 - GET_NUM_REJECTED_ROWS
Returns the number of rows that were rejected during the last completed load for the current session. GET_NUM_REJECTED_ROWS is a meta-function. Do not use it as a value in an INSERT query.
Rejected row information is unavailable for a load that is currently running. The number of rejected rows is not available for a load that is currently in process. Check the LOAD_STREAMS system table for its status.
This meta-function supports loads from STDIN, COPY LOCAL from a Vertica client, or a single file on the initiator. You cannot use GET_NUM_REJECTED_ROWS for multi-node loads.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
GET_NUM_REJECTED_ROWS();
Privileges
None
Note
The data regarding rejected rows from the last load during the current session does not persist, and is dropped when you initiate a new load.Examples
This example shows the number of rejected rows from the vmart_load_data.sql meta-command.
=> \i vmart_load_data.sql
=> SELECT GET_NUM_REJECTED_ROWS ();
GET_NUM_REJECTED_ROWS
-----------------------
0
(1 row)
See also
14.7 - INTERRUPT_STATEMENT
Interrupts the specified statement in a user session, rolls back the current transaction, and writes a success or failure message to the log file.
Sessions can be interrupted during statement execution. Only statements run by user sessions can be interrupted.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
INTERRUPT_STATEMENT( 'session-id', statement-id)
Parameters
session-id
- Identifies the session to interrupt. This identifier is unique within the cluster at any point in time.
statement-id
- Identifies the statement to interrupt. If the
*
statement-id*
is valid, the statement can be interrupted andINTERRUPT_STATEMENT
returns a success message. Otherwise the system returns an error.
Privileges
Superuser
Messages
The following list describes messages you might encounter:
Message | Meaning |
---|---|
Statement interrupt sent. Check SESSIONS for progress. |
This message indicates success. |
Session <id> could not be successfully interrupted: session not found. |
The session ID argument to the interrupt command does not match a running session. |
Session <id> could not be successfully interrupted: statement not found. |
The statement ID does not match (or no longer matches) the ID of a running statement (if any). |
No interruptible statement running |
The statement is DDL or otherwise non-interruptible. |
Internal (system) sessions cannot be interrupted. |
The session is internal, and only statements run by external sessions can be interrupted. |
Examples
Two user sessions are open. RECORD 1 shows user session running SELECT FROM SESSION
, and RECORD 2 shows user session running COPY DIRECT
:
=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+----------------------------------------------------
node_name | v_vmartdb_node0001
user_name | dbadmin
client_hostname | 127.0.0.1:52110
client_pid | 4554
login_timestamp | 2011-01-03 14:05:40.252625-05
session_id | stress04-4325:0x14
client_label |
transaction_start | 2011-01-03 14:05:44.325781
transaction_id | 45035996273728326
transaction_description | user dbadmin (select * from sessions;)
statement_start | 2011-01-03 15:36:13.896288
statement_id | 10
last_statement_duration_us | 14978
current_statement | select * from sessions;
ssl_state | None
authentication_method | Trust
-[ RECORD 2 ]--------------+----------------------------------------------------
node_name | v_vmartdb_node0003
user_name | dbadmin
client_hostname | 127.0.0.1:56367
client_pid | 1191
login_timestamp | 2011-01-03 15:31:44.939302-05
session_id | stress06-25663:0xbec
client_label |
transaction_start | 2011-01-03 15:34:51.05939
transaction_id | 54043195528458775
transaction_description | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
DELIMITER '|' NULL '\\n' DIRECT;)
statement_start | 2011-01-03 15:35:46.436748
statement_id | 5
last_statement_duration_us | 1591403
current_statement | COPY Mart_Fact FROM '/data/Mart_Fact.tbl' DELIMITER '|'
NULL '\\n' DIRECT;
ssl_state | None
authentication_method | Trust
Interrupt the COPY DIRECT
statement running in session stress06-25663:0xbec
:
=> \x
Expanded display is off.
=> SELECT INTERRUPT_STATEMENT('stress06-25663:0x1537', 5);
interrupt_statement
------------------------------------------------------------------
Statement interrupt sent. Check v_monitor.sessions for progress.
(1 row)
Verify that the interrupted statement is no longer active by looking at the current_statement
column in the SESSIONS
system table. This column becomes blank when the statement is interrupted:
=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]--------------+----------------------------------------------------
node_name | v_vmartdb_node0001
user_name | dbadmin
client_hostname | 127.0.0.1:52110
client_pid | 4554
login_timestamp | 2011-01-03 14:05:40.252625-05
session_id | stress04-4325:0x14
client_label |
transaction_start | 2011-01-03 14:05:44.325781
transaction_id | 45035996273728326
transaction_description | user dbadmin (select * from sessions;)
statement_start | 2011-01-03 15:36:13.896288
statement_id | 10
last_statement_duration_us | 14978
current_statement | select * from sessions;
ssl_state | None
authentication_method | Trust
-[ RECORD 2 ]--------------+----------------------------------------------------
node_name | v_vmartdb_node0003
user_name | dbadmin
client_hostname | 127.0.0.1:56367
client_pid | 1191
login_timestamp | 2011-01-03 15:31:44.939302-05
session_id | stress06-25663:0xbec
client_label |
transaction_start | 2011-01-03 15:34:51.05939
transaction_id | 54043195528458775
transaction_description | user dbadmin (COPY Mart_Fact FROM '/data/Mart_Fact.tbl'
DELIMITER '|' NULL '\\n' DIRECT;)
statement_start | 2011-01-03 15:35:46.436748
statement_id | 5
last_statement_duration_us | 1591403
current_statement |
ssl_state | None
authentication_method | Trust
See also
14.8 - RELEASE_ALL_JVM_MEMORY
Forces all sessions to release the memory consumed by their Java Virtual Machines (JVM).
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RELEASE_ALL_JVM_MEMORY();
Privileges
Must be a superuser.
Examples
The following example demonstrates viewing the JVM memory use in all open sessions, then calling RELEASE_ALL_JVM_MEMORY() to release the memory:
=> select user_name,external_memory_kb FROM V_MONITOR.SESSIONS;
user_name | external_memory_kb
-----------+---------------
dbadmin | 79705
(1 row)
=> SELECT RELEASE_ALL_JVM_MEMORY();
RELEASE_ALL_JVM_MEMORY
-----------------------------------------------------------------------------
Close all JVM sessions command sent. Check v_monitor.sessions for progress.
(1 row)
=> SELECT user_name,external_memory_kb FROM V_MONITOR.SESSIONS;
user_name | external_memory_kb
-----------+---------------
dbadmin | 0
(1 row)
See also
14.9 - RELEASE_JVM_MEMORY
Terminates a Java Virtual Machine (JVM), making available the memory the JVM was using.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RELEASE_JVM_MEMORY();
Privileges
None.
Examples
User session opened. RECORD 2 shows the user session running COPY DIRECT statement.
=> SELECT RELEASE_JVM_MEMORY();
release_jvm_memory
-----------------------------------------
Java process killed and memory released
(1 row)
See also
14.10 - RESERVE_SESSION_RESOURCE
Reserves memory resources from the general resource pool for the exclusive use of the Vertica backup and restore process. No other Vertica process can access reserved resources. If insufficient resources are available, Vertica queues the reservation request.
This meta-function is a session level reservation. When a session ends Vertica automatically releases any resources reserved in that session. Because the meta-function operates at the session level, the resource name does not need to be unique across multiple sessions.
You can view reserved resources by querying the SESSIONS table.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RESERVE_SESSION_RESOURCE ( 'name', memory)
Parameters
name
- The name of the resource to reserve.
memory
- The amount of memory in kilobytes to allocate to the resource.
Privileges
None
Examples
Reserve 1024 kilobytes of memory for the backup and restore process:
=> SELECT reserve_session_resource('VBR_RESERVE',1024);
-[ RECORD 1 ]------------+----------------
reserve_session_resource | Grant succeed
14.11 - RESET_SESSION
Applies your default connection string configuration settings to your current session.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RESET_SESSION()
Examples
The following example shows how you use RESET_SESSION.
Resets the current client connection string to the default connection string settings:
=> SELECT RESET_SESSION();
RESET_SESSION
----------------------
Reset session: done.
(1 row)
15 - Storage functions
This section contains storage management functions specific to Vertica.
15.1 - ALTER_LOCATION_LABEL
Adds a label to a storage location, or changes or removes an existing label. You can change a location label if it is not specified by any storage policy.
Caution
If you label a storage location that contains data, Vertica moves the data to an unlabeled location, if one exists. To prevent data movement between storage locations, labels should be applied either to all storage locations or none.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ALTER_LOCATION_LABEL ( 'path' , '[node]' , '[location-label]' )
Parameters
path
- The storage location path.
node
- The node where the label change is applied. If you supply an empty string, Vertica applies the change across all cluster nodes.
location-label
- The label to assign to the specified storage location.
If you supply an empty string, Vertica removes that storage location's label.
You can remove a location label only if the following conditions are both true:
-
No database object has a storage policy that specifies this label.
-
The labeled location is not the last available storage for the objects associated with it.
-
Privileges
Superuser
Examples
The following ALTER_LOCATION_LABEL statement applies across all cluster nodes the label SSD
to the storage location /home/dbadmin/SSD/tables
:
=> SELECT ALTER_LOCATION_LABEL('/home/dbadmin/SSD/tables','', 'SSD');
ALTER_LOCATION_LABEL
---------------------------------------
/home/dbadmin/SSD/tables label changed.
(1 row)
See also
15.2 - ALTER_LOCATION_USE
Alters the type of data that a storage location holds.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ALTER_LOCATION_USE ( 'path' , '[node]' , 'usage' )
Arguments
path
- Where the storage location is mounted.
node
- The Vertica node on which to alter the storage location. To alter the location on all cluster nodes in a single transaction, use an empty string (
''
). If the usage is SHARED TEMP or SHARED USER, you must alter it on all nodes. usage
- One of the following:
-
DATA
: The storage location stores only data files. -
TEMP
: The location stores only temporary files that are created during loads or queries. -
DATA,TEMP
: The location can store both types of files.
-
Privileges
Superuser
Restrictions
You cannot change a storage location from a USER usage type if you created the location that way, or to a USER type if you did not. You can change a USER storage location to specify DATA (storing TEMP files is not supported). However, doing so does not affect the primary objective of a USER storage location, to be accessible by non-dbadmin users with assigned privileges.
You cannot change a storage location from SHARED TEMP or SHARED USER to SHARED DATA or the reverse.
Monitoring storage locations
For information about the disk storage used on each node, query the
DISK_STORAGE
system table.
Examples
The following example alters a storage location across all cluster nodes to store only data:
=> SELECT ALTER_LOCATION_USE ('/thirdSL/' , '' , 'DATA');
See also
15.3 - CLEAR_CACHES
Clears the Vertica internal cache files.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_CACHES ( )
Privileges
Superuser
Notes
If you want to run benchmark tests for your queries, in addition to clearing the internal Vertica cache files, clear the Linux file system cache. The kernel uses unallocated memory as a cache to hold clean disk blocks. If you are running version 2.6.16 or later of Linux and you have root access, you can clear the kernel file system cache as follows:
-
Make sure that all data in the cache is written to disk:
# sync
-
Writing to the
drop_caches
file causes the kernel to drop clean caches, entries, and inodes from memory, causing that memory to become free, as follows:-
To clear the page cache:
# echo 1 > /proc/sys/vm/drop_caches
-
To clear the entries and inodes:
# echo 2 > /proc/sys/vm/drop_caches
-
To clear the page cache, entries, and inodes:
# echo 3 > /proc/sys/vm/drop_caches
-
Examples
The following example clears the Vertica internal cache files:
=> SELECT CLEAR_CACHES();
CLEAR_CACHES
--------------
Cleared
(1 row)
15.4 - CLEAR_OBJECT_STORAGE_POLICY
Removes a user-defined storage policy from the specified database, schema or table. Storage containers at the previous policy's labeled location are moved to the default location. By default, this move occurs after all pending mergeout tasks return.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_OBJECT_STORAGE_POLICY ( 'object-name' [,'key-min', 'key-max'] [, 'enforce-storage-move' ] )
Parameters
object-name
- The object to clear, one of the following:
-
database
: Clearsdatabase
of its storage policy. -
[
database
.]
schema
: Clearsschema
of its storage policy. -
[[
database
.]
schema
.]
table
: Clearstable
of its storage policy. Iftable
is in any schema other thanpublic
, you must supply the schema name.
In all cases,
database
must be the name of the current database. -
key-min
key-max
- Valid only if
object-name
is a table, specifies the range of table partition key values stored at the labeled location. enforce-storage-move
- Specifies when the Tuple Mover moves all existing storage containers for the specified object to its default storage location:
-
false
(default): Move storage containers only after all pending mergeout tasks return. -
true
: Immediately move all storage containers to the new location.
Tip
You can also enforce all storage policies immediately by calling Vertica meta-functionENFORCE_OBJECT_STORAGE_POLICY
. -
Privileges
Superuser
Examples
This following statement clears the storage policy for table store.store_orders_fact
. The true
argument specifies to implement the move immediately:
=> SELECT CLEAR_OBJECT_STORAGE_POLICY ('store.store_orders_fact', 'true');
CLEAR_OBJECT_STORAGE_POLICY
-----------------------------------------------------------------------------
Object storage policy cleared.
Task: moving storages
(Table: store.store_orders_fact) (Projection: store.store_orders_fact_b0)
(Table: store.store_orders_fact) (Projection: store.store_orders_fact_b1)
(1 row)
See also
15.5 - DO_TM_TASK
Runs a Tuple Mover (TM) operation and commits current transactions. You can limit this operation to a specific table or projection. When started using this function, the TM uses the GENERAL resource pool instead of the TM resource pool.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DO_TM_TASK('task'[, '[[database.]schema.]{ table | projection}]' )
Parameters
task
- Specifies one of the following tuple mover operations:
-
mergeout
: Consolidates ROS containers and purges deleted records. For details, seeMergeout. -
reshardmergeout
: Realigns storage containers to the shard definitions created by a RESHARD_DATABASE call. Specify a table or projection and a range of partition values to limit the scope of thereshardmergeout
operations. -
analyze_row_count
: Collects a minimal set of statistics and aggregate row counts for the specified projections, and saves it in the database catalog. Collects the number of rows in the specified projection. If you specify a table name, DO_TM_TASK returns the row counts for all projections of that table. For details, see Analyzing row counts. -
update_storage_catalog
(recommended only for Eon Mode): Updates the catalog with metadata on bundled table data. For details, see Writing bundle metadata to the catalog.
-
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
|
projection
- Applies
task
to the specified table or projection. If you specify a projection and it is not found, DO_TM_TASK looks for a table with that name and, if found, applies the task to it and all projections associated with it.If you specify no table or projection, the task is applied to all database tables and their projections.
Privileges
-
Schema: USAGE
-
Table: One of INSERT, UPDATE, or DELETE
Examples
The following example performs a mergeout on all projections in a table:
=> SELECT DO_TM_TASK('mergeout', 't1');
You can perform a re-shard mergeout task on a range of partitions of a table:
=> SELECT DO_TM_TASK('reshardmergeout', 'store_orders', '2001', '2005');
15.6 - DROP_LOCATION
Permanently removes a retired storage location. This operation cannot be undone. You must first retire a storage location with RETIRE_LOCATION before dropping it; you cannot drop a storage location that is in use.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DROP_LOCATION ( 'path', 'node' )
Arguments
path
- Where the storage location to drop is mounted.
node
- The Vertica node on which to drop the location. To perform this operation on all nodes, use an empty string (
''
). If the storage location is SHARED, you must perform this operation on all nodes.
Privileges
Superuser
Storage locations with temp and data files
If you use a storage location to store data and then alter it to store only temp files, the location can still contain data files. Vertica does not let you drop a storage location containing data files. You can use the MOVE_RETIRED_LOCATION_DATA function to manually merge out the data files from the storage location, or you can drop partitions. Deleting data files does not work.
Examples
The following example shows how to drop a previously retired storage location on v_vmart_node0003
:
=> SELECT DROP_LOCATION('/data', 'v_vmart_node0003');
See also
15.7 - ENFORCE_OBJECT_STORAGE_POLICY
Enterprise Mode only
Applies storage policies of the specified object immediately. By default, the Tuple Mover enforces object storage policies after all pending mergeout operations are complete. Calling this function is equivalent to setting the enforce
argument when using RETIRE_LOCATION. You typically use this function as the last step before dropping a storage location.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ENFORCE_OBJECT_STORAGE_POLICY ( 'object-name' [,'key-min', 'key-max'] )
Arguments
object-name
- The database object whose storage policies are to be applied, one of the following:
-
database
: Appliesdatabase
storage policies. -
[
database
.]
schema
: Appliesschema
storage policies. -
[[
database
.]
schema
.]
table
: Appliestable
storage policies. Iftable
is in any schema other thanpublic
, you must supply the schema name.
In all cases,
database
must be the name of the current database. -
key-min
,key-max
- Valid only if
object-name
is a table, specifies the range of table partition key values on which to perform the move.
Privileges
One of the following:
-
Superuser
-
Object owner and access to its storage location.
Examples
Apply storage policy updates to the test
table:
=> SELECT ENFORCE_OBJECT_STORAGE_POLICY ('test');
See also
15.8 - MEASURE_LOCATION_PERFORMANCE
Measures a storage location's disk performance.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
MEASURE_LOCATION_PERFORMANCE ( 'path', 'node' )
Parameters
path
- Specifies where the storage location to measure is mounted.
node
- The Vertica node where the location to be measured is available. To obtain a list of all node names on the cluster, query system table DISK_STORAGE.
Privileges
Superuser
Notes
-
If you intend to create a tiered disk architecture in which projections, columns, and partitions are stored on different disks based on predicted or measured access patterns, you need to measure storage location performance for each location in which data is stored. You do not need to measure storage location performance for temp data storage locations because temporary files are stored based on available space.
-
The method of measuring storage location performance applies only to configured clusters. If you want to measure a disk before configuring a cluster see Measuring storage performance.
-
Storage location performance equates to the amount of time it takes to read and write 1MB of data from the disk. This time equates to:
IO-time = (time-to-read-write-1MB + time-to-seek) = (1/throughput + 1/latency)
Throughput is the average throughput of sequential reads/writes (units in MB per second).
Latency is for random reads only in seeks (units in seeks per second)
Note
The IO time of a faster storage location is less than a slower storage location.
Examples
The following example measures the performance of a storage location on v_vmartdb_node0004:
=> SELECT MEASURE_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/' , 'v_vmartdb_node0004');
WARNING: measure_location_performance can take a long time. Please check logs for progress
measure_location_performance
--------------------------------------------------
Throughput : 122 MB/sec. Latency : 140 seeks/sec
See also
15.9 - MOVE_RETIRED_LOCATION_DATA
Moves all data from the specified retired storage location or from all retired storage locations in the database. MOVE_RETIRED_LOCATION_DATA
migrates the data to non-retired storage locations according to the storage policies of the objects whose data is stored in the location. This function returns only after it completes migration of all affected storage location data.
Note
The Tuple Mover migrates data of retired storage locations when it consolidates data into larger ROS containers.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
MOVE_RETIRED_LOCATION_DATA( ['location-path'] [, 'node'] )
Arguments
location-path
- The path of the storage location as specified in the
LOCATION_PATH
column of system tableSTORAGE_LOCATIONS
. This storage location must be marked as retired.If you omit this argument,
MOVE_RETIRED_LOCATION_DATA
moves data from all retired storage locations. node
- The node on which to move data of the retired storage location. If
location-path
is undefined onnode
, this function returns an error.If you omit this argument,
MOVE_RETIRED_LOCATION_DATA
moves data from*location-path
* on all nodes.
Privileges
Superuser
Examples
-
Query system table
STORAGE_LOCATIONS
to show which storage locations are retired:=> SELECT node_name, location_path, location_label, is_retired FROM STORAGE_LOCATIONS WHERE is_retired = 't'; node_name | location_path | location_label | is_retired ------------------+----------------------+----------------+------------ v_vmart_node0001 | /home/dbadmin/SSDLoc | ssd | t v_vmart_node0002 | /home/dbadmin/SSDLoc | ssd | t v_vmart_node0003 | /home/dbadmin/SSDLoc | ssd | t (3 rows)
-
Query system table
STORAGE_LOCATIONS
for the location of the messages table, which is currently stored in retired storage locationssd
:=> SELECT node_name, total_row_count, location_label FROM STORAGE_CONTAINERS WHERE projection_name ILIKE 'messages%'; node_name | total_row_count | location_label ------------------+-----------------+---------------- v_vmart_node0001 | 333514 | ssd v_vmart_node0001 | 333255 | ssd v_vmart_node0002 | 333255 | ssd v_vmart_node0002 | 333231 | ssd v_vmart_node0003 | 333231 | ssd v_vmart_node0003 | 333514 | ssd (6 rows)
-
Call
MOVE_RETIRED_LOCATION_DATA
to move the data off thessd
storage location.=> SELECT MOVE_RETIRED_LOCATION_DATA('/home/dbadmin/SSDLoc'); MOVE_RETIRED_LOCATION_DATA ----------------------------------------------- Move data off retired storage locations done (1 row)
-
Repeat the previous query to verify the storage location of the messages table:
=> SELECT node_name, total_row_count, storage_type, location_label FROM storage_containers WHERE projection_name ILIKE 'messages%'; node_name | total_row_count | location_label ------------------+-----------------+---------------- v_vmart_node0001 | 333255 | base v_vmart_node0001 | 333514 | base v_vmart_node0003 | 333514 | base v_vmart_node0003 | 333231 | base v_vmart_node0002 | 333231 | base v_vmart_node0002 | 333255 | base (6 rows)
See also
15.10 - RESTORE_LOCATION
Restores a storage location that was previously retired with RETIRE_LOCATION.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RESTORE_LOCATION ( 'path', 'node' )
Arguments
path
- Where to mount the retired storage location.
node
- The Vertica node on which to restore the location. To perform this operation on all nodes, use an empty string (
''
). If the storage location is SHARED, you must perform this operation on all nodes.The operation fails if you dropped any locations.
Privileges
Superuser
Effects of restoring a previously retired location
After restoring a storage location, Vertica re-ranks all of the cluster storage locations. It uses the newly restored location to process queries as determined by its rank.
Monitoring storage locations
For information about the disk storage used on each node, query the
DISK_STORAGE
system table.
Examples
Restore a retired storage location on node4
:
=> SELECT RESTORE_LOCATION ('/thirdSL/' , 'v_vmartdb_node0004');
See also
15.11 - RETIRE_LOCATION
Deactivates the specified storage location. To obtain a list of all existing storage locations, query the STORAGE_LOCATIONS system table.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RETIRE_LOCATION ( 'path', 'node' [, enforce ] )
Arguments
path
- Where the storage location to retire is mounted.
node
- The Vertica node on which to retire the location. To perform this operation on all nodes, use an empty string (
''
). If the storage location is SHARED, you must perform this operation on all nodes. enforce
- If
true
, the location label is set to an empty string and the data is moved elsewhere. The location can then be dropped without errors or warnings. Use this argument to expedite dropping a location.
Privileges
Superuser
Effects of retiring a storage location
RETIRE_LOCATION checks that the location is not the only storage for data and temp files. At least one location must exist on each node to store data and temp files. However, you can store both sorts of files in either the same location or separate locations.
If a location is the last available storage for its associated objects, you can retire it only if you set enforce
to true
.
When you retire a storage location:
-
No new data is stored at the retired location, unless you first restore it using RESTORE_LOCATION.
-
By default, if the storage location being retired contains stored data, the data is not moved. Thus, you cannot drop the storage location. Instead, Vertica removes the stored data through one or more mergeouts. To drop the location immediately after retiring it, set
enforce
to true. -
If the storage location being retired is used only for temp files or you use
enforce
, you can drop the location. See Dropping storage locations and DROP_LOCATION.
Monitoring storage locations
For information about the disk storage used on each node, query the
DISK_STORAGE
system table.
Examples
The following examples show two approaches to retiring a storage location.
You can retire a storage location and its data will be moved out automatically at a future time:
=> SELECT RETIRE_LOCATION ('/data' , 'v_vmartdb_node0004');
You can specify that data in the storage location be moved immediately, so that you can then drop the location without waiting:
=> SELECT RETIRE_LOCATION ('/data' , 'v_vmartdb_node0004', true);
See also
15.12 - SET_LOCATION_PERFORMANCE
Sets disk performance for a storage location.
Note
Before calling this function, call MEASURE_LOCATION_PERFORMANCE to obtain the location's throughput and average latency .This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_LOCATION_PERFORMANCE ( 'path', 'node' , 'throughput', 'average-latency')
Parameters
path
- Specifies where the storage location to set is mounted.
node
- Specifies the Vertica node where the location to set is available.
throughput
- Specifies the throughput for the location, set to a value ≥1.
average-latency
- Specifies the average latency for the location, set to a value ≥1.
Privileges
Superuser
Examples
The following example sets the performance of a storage location on node2 to a throughput of 122 megabytes per second and a latency of 140 seeks per second.
=> SELECT SET_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/','node2','122','140');
See also
15.13 - SET_OBJECT_STORAGE_POLICY
Creates or changes the storage policy of a database object by assigning it a labeled storage location. The Tuple Mover uses this location to store new and existing data for this object. If the object already has an active storage policy, calling SET_OBJECT_STORAGE_POLICY
sets this object's default storage to the new labeled location. Existing data for the object is moved to the new location.
Note
You cannot create a storage policy on a USER type storage location.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_OBJECT_STORAGE_POLICY (
'[[database.]schema.]object-name', 'location-label'
[,'key-min', 'key-max'] [, 'enforce-storage-move' ] )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.object-name
- Identifies the database object assigned to a labeled storage location. The
object-name
can resolve to a database, schema, or table. location-label
- The label of
object-name
's storage location. key-min
key-max
- Valid only if
object-name
is a table, specifies the range of table partition key values to store at the labeled location. enforce-storage-move
- Specifies when the Tuple Mover moves all existing storage containers for
object-name
to the labeled storage location:-
false
(default): Move storage containers only after all pending mergeout tasks return. -
true
: Immediately move all storage containers to the new location.
Tip
You can also enforce all storage policies immediately by calling Vertica meta-functionENFORCE_OBJECT_STORAGE_POLICY
-
Privileges
One of the following:
-
Superuser
-
Object owner and access to its storage location.
Examples
See also
16 - Table functions
This section contains functions for managing tables and constraints.
See also the V_CATALOG.TABLE_CONSTRAINTS system table.
16.1 - ANALYZE_CONSTRAINTS
Analyzes and reports on constraint violations within the specified scope
You can enable automatic enforcement of primary key, unique key, and check constraints when INSERT
, UPDATE
, MERGE
, or COPY
statements execute. Alternatively, you can use ANALYZE_CONSTRAINTS
to validate constraints after issuing these statements. Refer to Constraint enforcement for more information.
ANALYZE_CONSTRAINTS
performs a lock in the same way that SELECT * FROM t1
holds a lock on table t1
. See
LOCKS
for additional information.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ANALYZE_CONSTRAINTS ('[[[database.]schema.]table ]' [, 'column[,...]'] )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
- Identifies the table to analyze. If you omit specifying a schema, Vertica uses the current schema search path. If set to an empty string, Vertica analyzes all tables in the current schema.
column
- The column in
table
to analyze. You can specify multiple comma-delimited columns. Vertica narrows the scope of the analysis to the specified columns. If you omit specifying a column, Vertica analyzes all columns intable
.
Privileges
-
Schema: USAGE
-
Table: SELECT
Detecting constraint violations during a load process
Vertica checks for constraint violations when queries are run, not when data is loaded. To detect constraint violations as part of the load process, use a COPY statement with the NO COMMIT option. By loading data without committing it, you can run a post-load check of your data using the ANALYZE_CONSTRAINTS
function. If the function finds constraint violations, you can roll back the load because you have not committed it.
If ANALYZE_CONSTRAINTS
finds violations, such as when you insert a duplicate value into a primary key, you can correct errors using the following functions. Effects last until the end of the session only:
Important
If a check constraint SQL expression evaluates to an unknown for a given row because a column within the expression contains a null, the row passes the constraint condition.Return values
ANALYZE_CONSTRAINTS
returns results in a structured set (see table below) that lists the schema name, table name, column name, constraint name, constraint type, and the column values that caused the violation.
If the result set is empty, then no constraint violations exist; for example:
> SELECT ANALYZE_CONSTRAINTS ('public.product_dimension', 'product_key');
Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
(0 rows)
The following result set shows a primary key violation, along with the value that caused the violation ('10')
:
=> SELECT ANALYZE_CONSTRAINTS ('');
Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
store t1 c1 pk_t1 PRIMARY ('10')
(1 row)
The result set columns are described in further detail in the following table:
Column Name | Data Type | Description |
---|---|---|
Schema Name |
VARCHAR | The name of the schema. |
Table Name |
VARCHAR | The name of the table, if specified. |
Column Names |
VARCHAR | A list of comma-delimited columns that contain constraints. |
Constraint Name |
VARCHAR | The given name of the primary key, foreign key, unique, check, or not null constraint, if specified. |
Constraint Type |
VARCHAR |
Identified by one of the following strings:
|
Column Values |
VARCHAR |
Value of the constraint column, in the same order in which When interpreted as SQL, the value of this column forms a list of values of the same type as the columns in
|
Examples
16.2 - ANALYZE_CORRELATIONS
Deprecated
This function is deprecated and will be removed in a future release.Analyzes the specified tables for pairs of columns that are strongly correlated. ANALYZE_CORRELATIONS stores the 20 pairs with the strongest correlation. ANALYZE_CORRELATIONS also analyzes statistics.
ANALYZE_CORRELATIONS analyzes only pairwise single-column correlations.
For example, state name and country name columns are strongly correlated because the city name usually, but perhaps not always, identifies the state name. The city of Conshohoken is uniquely associated with Pennsylvania, while the city of Boston exists in Georgia, Indiana, Kentucky, New York, Virginia, and Massachusetts. In this case, city name is strongly correlated with state name.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
StableSyntax
ANALYZE_CORRELATIONS ('[[[database.]schema.]table ]' [, 'recalculate'] )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table-name
- Identifies the table to analyze. If you omit specifying a schema, Vertica uses the current schema search path. If set to an empty string, Vertica analyzes all tables in the current schema.
recalculate
- Boolean that specifies whether to analyze correlated columns that were previously analyzed.
Note
Column correlation analysis typically needs to be done only once.Default:
false
Privileges
One of the following:
- Superuser
-
User with USAGE privilege on the design schema
Examples
In the following example, ANALYZE_CORRELATIONS analyzes column correlations for all tables in the public
schema, even if they currently exist:
=> SELECT ANALYZE_CORRELATIONS ('public.*', 'true');
ANALYZE_CORRELATIONS
----------------------
0
(1 row)
16.3 - COPY_TABLE
Copies one table to another. This lightweight, in-memory function copies the DDL and all user-created projections from the source table. Projection statistics for the source table are also copied. Thus, the source and target tables initially have identical definitions and share the same storage.
Note
Although they share storage space, Vertica regards the tables as discrete objects for license capacity purposes. For example, a single-terabyte table and its copy initially consume only one TB of space. However, your Vertica license regards them as separate objects that consume two TB of space.After the copy operation is complete, the source and copy tables are independent of each other, so you can perform DML operations on one table without impacting the other. These operations can increase the overall storage required for both tables.
Caution
If you create multiple copies of the same table concurrently, one or more of the copy operations is liable to fail. Instead, copy tables sequentially.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
COPY_TABLE (
'[[database.]schema.]source-table',
'[[database.]schema.]target-table'
)
Parameters
-
`[database.]schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.source-table
- The source table to copy. Vertica copies all data from this table to the target table.
target-table
- The target table of the source table. If the target table already exists, Vertica appends the source to the existing table.
If the table does not exist, Vertica creates a table from the source table's definition, by calling
CREATE TABLE
withLIKE
andINCLUDING PROJECTIONS
clause. The new table inherits ownership from the source table. For details, see Replicating a table.
Privileges
Non-superuser:
-
Source table: SELECT
-
Target schema/table (new): CREATE
-
Target table (existing): INSERT
Table attribute requirements
The following attributes of both tables must be identical:
-
Column definitions, including NULL/NOT NULL constraints
-
Segmentation
-
Partitioning expression
-
Number of projections
-
Projection sort order
-
Primary and unique key constraints. However, the key constraints do not have to be identically enabled.
Note
If the target table has primary or unique key constraints enabled and moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation. Enforcing constraints requires disk reads and can slow the copy process. -
Number and definitions of text indices.
-
If the destination table already exists, the source and destination tables must have identical access policies.
Additionally, If access policies exist on the source table, the following must be true:
-
Access policies on both tables must be identical.
-
One of the following must be true:
-
The executing user owns the source table.
-
AccessPolicyManagementSuperuserOnly
is set to true. See Managing access policies for details.
-
Table restrictions
The following restrictions apply to the source and target tables:
- If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
- If the source table contains a sequence, Vertica converts the sequence to an integer before copying it to the target table. If the target table contains IDENTITY or named sequence columns, Vertica cancels the copy and displays an error message.
- The following tables cannot be used as sources or targets:
-
Temporary tables
-
Virtual tables
-
System tables
-
External tables
-
Examples
If you call COPY_TABLE and the target table does not exist, the function creates the table automatically. In the following example, COPY_TABLE creates the target table public.newtable
. Vertica also copies all the constraints associated with the source table public.product_dimension
except foreign key constraints:
=> SELECT COPY_TABLE ( 'public.product_dimension', 'public.newtable');
-[ RECORD 1 ]--------------------------------------------------
copy_table | Created table public.newtable.
Copied table public.product_dimension to public.newtable
See also
Creating a table from other tables16.4 - DISABLE_DUPLICATE_KEY_ERROR
Disables error messaging when Vertica finds duplicate primary or unique key values at run time (for use with key constraints that are not automatically enabled). Queries execute as though no constraints are defined on the schema. Effects are session scoped.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DISABLE_DUPLICATE_KEY_ERROR();
Privileges
Superuser
Examples
When you call DISABLE_DUPLICATE_KEY_ERROR
, Vertica issues warnings letting you know that duplicate values will be ignored, and incorrect results are possible. DISABLE_DUPLICATE_KEY_ERROR
is for use only for key constraints that are not automatically enabled.
=> select DISABLE_DUPLICATE_KEY_ERROR();
WARNING 3152: Duplicate values in columns marked as UNIQUE will now be ignored for the remainder of your session or until reenable_duplicate_key_error() is called
WARNING 3539: Incorrect results are possible. Please contact Vertica Support if unsure
disable_duplicate_key_error
------------------------------
Duplicate key error disabled
(1 row)
See also
ANALYZE_CONSTRAINTS
16.5 - INFER_EXTERNAL_TABLE_DDL
Deprecated
This function is deprecated and will be removed in a future release. Instead, use INFER_TABLE_DDL.Inspects a file in Parquet, ORC, or Avro format and returns a CREATE EXTERNAL TABLE AS COPY statement that can be used to read the file. This statement might be incomplete. It could also contain more columns or columns with longer names than what Vertica supports; this function does not enforce Vertica system limits. Always inspect the output and address any issues before using it to create a table.
This function supports partition columns for the Parquet, ORC, and Avro formats, inferred from the input path. Because partitioning is done through the directory structure, there might not be enough information to infer the type of partition columns. In this case, this function shows these columns with a data type of UNKNOWN and emits a warning.
The function handles most data types, including complex types. If an input type is not supported in Vertica, the function emits a warning.
By default, the function uses strong typing for complex types. You can instead treat the column as a flexible complex type by setting the vertica_type_for_complex_type
parameter to LONG VARBINARY.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
INFER_EXTERNAL_TABLE_DDL( path USING PARAMETERS param=value[,...] )
Arguments
path
- Path to a file or directory. Any path that is valid for COPY and uses a file format supported by this function is valid.
Parameters
format
- Input format (string), one of 'Parquet', 'ORC', or 'Avro'. This parameter is required.
table_name
- The name of the external table to create. This parameter is required.
Do not include a schema name as part of the table name; use the
table_schema
parameter. table_schema
- The schema in which to create the external table. If omitted, the function does not include a schema in the output.
vertica_type_for_complex_type
- Type used to represent all columns of complex types, if you do not want to expand them fully. The only supported value is LONG VARBINARY. For more information, see Flexible complex types.
Privileges
Non-superuser: READ privileges on the USER-accessible storage location.
Examples
In the following example, the input file contains data for a table with two integer columns. The table definition can be fully inferred, and you can use the returned SQL statement as-is.
=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/orders/*.orc'
USING PARAMETERS format = 'orc', table_name = 'orders');
INFER_EXTERNAL_TABLE_DDL
--------------------------------------------------------------------------------------------------
create external table "orders" (
"id" int,
"quantity" int
) as copy from '/data/orders/*.orc' orc;
(1 row)
To create a table in a schema, use the table_schema
parameter. Do not add it to the table name; the function treats it as a name with a period in it, not a schema.
The following example shows output with complex types. You can use the definition as-is or modify the VARCHAR sizes:
=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/people/*.parquet'
USING PARAMETERS format = 'parquet', table_name = 'employees');
WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80
INFER_EXTERNAL_TABLE_DDL
-------------------------------------------------------------------------
create external table "employees"(
"employeeID" int,
"personal" Row(
"name" varchar,
"address" Row(
"street" varchar,
"city" varchar,
"zipcode" int
),
"taxID" int
),
"department" varchar
) as copy from '/data/people/*.parquet' parquet;
(1 row)
In the following example, the input file contains a map in the "prods" column. You can read a map as an array of rows:
=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/orders.parquet'
USING PARAMETERS format='parquet', table_name='orders');
WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80
INFER_EXTERNAL_TABLE_DDL
------------------------------------------------------------------------
create external table "orders"(
"orderkey" int,
"custkey" int,
"prods" Array[Row(
"key" varchar,
"value" numeric(12,2)
)],
"orderdate" date
) as copy from '/data/orders.parquet' parquet;
(1 row)
In the following example, the data is partitioned by region. The function was not able to infer the data type and reports UNKNOWN:
=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/sales/*/*
USING PARAMETERS format = 'parquet', table_name = 'sales');
WARNING 9262: This generated statement is incomplete because of one or more unknown column types.
Fix these data types before creating the table
INFER_EXTERNAL_TABLE_DDL
------------------------------------------------------------------------
create external table "sales"(
"tx_id" int,
"date" date,
"region" UNKNOWN
) as copy from '/data/sales/*/*' PARTITION COLUMNS region parquet;
(1 row)
For VARCHAR and VARBINARY columns, this function does not specify a length. The Vertica default length for these types is 80 bytes. If the data values are longer, using this table definition unmodified could cause data to be truncated. Always review VARCHAR and VARBINARY columns to determine if you need to specify a length. This function emits a warning if the input file contains columns of these types:
WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80
16.6 - INFER_TABLE_DDL
Inspects a file in Parquet, ORC, JSON, or Avro format and returns a CREATE TABLE or CREATE EXTERNAL TABLE statement based on its contents.
The returned statement might be incomplete if the input data contains ambiguous or unknown data types. It could also contain more columns or columns with longer names than what Vertica supports; this function does not enforce Vertica system limits. Always inspect the output and address any issues before using it to create a table.
This function supports partition columns, inferred from the input path. Because partitioning is done through the directory structure, there might not be enough information to infer the type of partition columns. In this case, this function shows these columns with a data type of UNKNOWN and emits a warning.
The function handles most data types, including complex types. If an input type is not supported in Vertica, the function emits a warning.
For VARCHAR and VARBINARY columns, this function does not specify a length. The Vertica default length for these types is 80 bytes. If the data values are longer, using the returned table definition unmodified could cause data to be truncated. Always review VARCHAR and VARBINARY columns to determine if you need to specify a length. This function emits a warning if the input file contains columns of these types:
WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
INFER_TABLE_DDL( path USING PARAMETERS param=value[,...] )
Arguments
path
- Path to a file or glob. Any path that is valid for COPY and uses a file format supported by this function is valid. For all formats except JSON, if a glob specifies more than one file, this function reads a single, arbitrarily-chosen file. For JSON, the function might read more than one file. See JSON.
Parameters
format
- Input format (string), one of 'Parquet', 'ORC', 'Avro', or 'JSON'. This parameter is required.
table_name
- The name of the table to create. This parameter is required.
Do not include a schema name as part of the table name; use the
table_schema
parameter. table_schema
- The schema in which to create the table. If omitted, the function does not include a schema in the output.
table_type
- The type of table to create, either 'native' or 'external'.
Default: 'native'
with_copy_statement
- For native tables, whether to include a COPY statement in addition to the CREATE TABLE statement.
Default: false
one_line_result
- Whether to return the DDL as a single line instead of pretty-printing. The single-line format might be easier to copy into SQL scripts.
Default: false (pretty-print)
max_files
- (JSON only.) Maximum number of files in
path
to inspect, ifpath
is a glob. Use this parameter to increase the amount of data the function considers, for example if you suspect variation among files. Files are chosen arbitrarily from the glob. For details, see JSON.Default: 1
max_candidates
- (JSON only.) Number of candidate table definitions to show. The function generates only one candidate per file, so if you increase
max_candidates
, also increasemax_files
. For details, see JSON.Default: 1
Privileges
Non-superuser: READ privileges on the USER-accessible storage location.
JSON
JSON, unlike the other supported formats, does not embed a schema in data files. This function infers JSON table DDL by instead inspecting the raw data. Because raw data can be ambiguous or inconsistent, the function takes a different approach for this format.
For each input file, the function iterates through records to develop a candidate table definition. A top-level field that appears in any record is included as a column, even if not all records use it. If the same field appears in the file with different types, the function chooses a type that is consistent with all observed occurrences.
Consider a file with data about restaurants:
{
"name" : "Pizza House",
"cuisine" : "Italian",
"location_city" : [],
"chain" : true,
"hours" : [],
"menu" : [{"item" : "cheese pizza", "price" : 7.99},
{"item" : "spinach pizza", "price" : 8.99},
{"item" : "garlic bread", "price" : 4.99}]
}
{
"name" : "Sushi World",
"cuisine" : "Asian",
"location_city" : ["Pittsburgh"],
"chain" : false,
"menu" : [{"item" : "maki platter", "price" : "21.95"},
{"item" : "tuna roll", "price" : "4.95"}]
}
The first record contains two empty arrays, so there is not enough information to determine the element types. The second record has a string value for one of them, so the function can infer a type of VARCHAR for it. The other array element type remains unknown.
In the first record menu prices are numbers, but in the second they are strings. Both FLOAT and the string can be coerced to NUMERIC, so the function returns NUMERIC:
=> SELECT INFER_TABLE_DDL ('/data/restaurants.json'
USING PARAMETERS table_name='restaurants', format='json');
WARNING 0: This generated statement contains one or more varchar/varbinary types which default to length 80
INFER_TABLE_DDL
------------------------------------------------------------------------
Candidate matched 1 out of 1 total files:
create table "restaurants"(
"chain" bool,
"cuisine" varchar,
"hours" Array[UNKNWON],
"location_city" Array[varchar],
"menu" Array[Row(
"item" varchar,
"price" numeric
)],
"name" varchar
);
(1 row)
All scalar types can be coerced to VARCHAR, so if a conflict cannot be resolved more specifically (as in the NUMERIC example), the function can still return a type. Complex types, however, cannot always be resolved in this way. In the following example, records in a file have conflicting definitions of the hours
field:
{
"name" : "Sushi World",
"cuisine" : "Asian",
"location_city" : ["Pittsburgh"],
"chain" : false,
"hours" : {"open" : "11:00", "close" : "22:00" }
}
{
"name" : "Greasy Spoon",
"cuisine" : "American",
"location_city" : [],
"chain" : "false",
"hours" : {"open" : ["11:00","12:00"], "close" : ["21:00","22:00"] },
}
In the first record the value is a ROW with two TIME fields. In the second record the value is a ROW with two ARRAY[TIME] fields (representing weekday and weekend hours). These types are incompatible, so the function suggests a flexible complex type by using LONG VARBINARY:
=> SELECT INFER_TABLE_DDL ('/data/restaurants.json'
USING PARAMETERS table_name='restaurants', format='json');
WARNING 0: This generated statement contains one or more varchar/varbinary types which default to length 80
INFER_TABLE_DDL
------------------------------------------------------------------------
Candidate matched 1 out of 1 total files:
create table "restaurants"(
"chain" bool,
"cuisine" varchar,
"hours" long varbinary,
"location_city" Array[varchar],
"name" varchar
);
(1 row)
If you call the function with a glob, by default it reads one file. Set max_files
to a higher number to inspect more data. The function calculates one candidate table definition per file and returns the definition that covers the largest number of files.
Increasing the number of files does not, by itself, increase the number of candidates the function returns. With more files the function can consider more candidates, but by default it returns the single candidate that represents the largest number of files. To see more than one possible table definition, also set max_candidates
. There is no benefit to setting max_candidates
to a larger number than max_files
.
In the following example, the glob contains two files that differ in the structure of the menu column. In the first file, the menu field has two fields:
{
"name" : "Bob's pizzeria",
"cuisine" : "Italian",
"location_city" : ["Cambridge", "Pittsburgh"],
"menu" : [{"item" : "cheese pizza", "price" : 8.25},
{"item" : "spinach pizza", "price" : 10.50}]
}
In the second file, the menu has different offerings at different times of day:
{
"name" : "Greasy Spoon",
"cuisine" : "American",
"location_city" : [],
"menu" : [{"time" : "breakfast",
"items" :
[{"item" : "scrambled eggs", "price" : "3.99"}]
},
{"time" : "lunch",
"items" :
[{"item" : "grilled cheese", "price" : "3.95"},
{"item" : "tuna melt", "price" : "5.95"},
{"item" : "french fries", "price" : "1.99"}]}]
}
To see both candidates, raise both max_files
and max_candidates
:
=> SELECT INFER_TABLE_DDL ('/data/*.json'
USING PARAMETERS table_name='restaurants', format='json',
max_files=3, max_candidates=3);
WARNING 0: This generated statement contains one or more float types which might lose precision
WARNING 0: This generated statement contains one or more varchar/varbinary types which default to length 80
INFER_TABLE_DDL
------------------------------------------------------------------------
Candidate matched 1 out of 2 total files:
create table "restaurants"(
"cuisine" varchar,
"location_city" Array[varchar],
"menu" Array[Row(
"item" varchar,
"price" float
)],
"name" varchar
);
Candidate matched 1 out of 2 total files:
create table "restaurants"(
"cuisine" varchar,
"location_city" Array[varchar],
"menu" Array[Row(
"items" Array[Row(
"item" varchar,
"price" numeric
)],
"time" varchar
)],
"name" varchar
);
(1 row)
Examples
In the following example, the input path contains data for a table with two integer columns. The external table definition can be fully inferred, and you can use the returned SQL statement as-is. The function reads one file from the input path:
=> SELECT INFER_TABLE_DDL('/data/orders/*.orc'
USING PARAMETERS format = 'orc', table_name = 'orders', table_type = 'external');
INFER_TABLE_DDL
------------------------------------------------------------------------
create external table "orders" (
"id" int,
"quantity" int
) as copy from '/data/orders/*.orc' orc;
(1 row)
To create a table in a schema, use the table_schema
parameter. Do not add it to the table name; the function treats it as a name with a period in it, not a schema.
The following example shows output with complex types. You can use the definition as-is or modify the VARCHAR sizes:
=> SELECT INFER_TABLE_DDL('/data/people/*.parquet'
USING PARAMETERS format = 'parquet', table_name = 'employees');
WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80
INFER_TABLE_DDL
------------------------------------------------------------------------
create table "employees"(
"employeeID" int,
"personal" Row(
"name" varchar,
"address" Row(
"street" varchar,
"city" varchar,
"zipcode" int
),
"taxID" int
),
"department" varchar
);
(1 row)
In the following example, the input file contains a map in the "prods" column. You can read a map as an array of rows:
=> SELECT INFER_TABLE_DDL('/data/orders.parquet'
USING PARAMETERS format='parquet', table_name='orders');
WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80
INFER_TABLE_DDL
------------------------------------------------------------------------
create table "orders"(
"orderkey" int,
"custkey" int,
"prods" Array[Row(
"key" varchar,
"value" numeric(12,2)
)],
"orderdate" date
);
(1 row)
The following example returns the definition of a native table and the COPY statement, putting the table definition on a single line to simplify cutting and pasting into a script:
=> SELECT INFER_TABLE_DDL('/data/orders/*.orc'
USING PARAMETERS format = 'orc', table_name = 'orders',
table_type = 'native', with_copy_statement = true, one_line_result=true);
INFER_TABLE_DDL
-----------------------------------------------------------------------
create table "orders" ("id" int, "quantity" int);
copy "orders" from '/data/orders/*.orc' orc;
(1 row)
In the following example, the data is partitioned by region. The function was not able to infer the data type and reports UNKNOWN:
=> SELECT INFER_TABLE_DDL('/data/sales/*/*
USING PARAMETERS format = 'orc', table_name = 'sales', table_type = 'external');
WARNING 9262: This generated statement is incomplete because of one or more unknown column types. Fix these data types before creating the table
WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80
INFER_TABLE_DDL
------------------------------------------------------------------------
create external table "sales"(
"orderkey" int,
"custkey" int,
"prodkey" Array[varchar],
"orderprices" Array[numeric(12,2)],
"orderdate" date,
"region" UNKNOWN
) as copy from '/data/sales/*/*' PARTITION COLUMNS region orc;
(1 row)
In the following example, the function reads multiple JSON files and they differ in how they represent the menu
column:
=> SELECT INFER_TABLE_DDL ('/data/*.json'
USING PARAMETERS table_name='restaurants', format='json',
max_files=3, max_candidates=3);
WARNING 0: This generated statement contains one or more float types which might lose precision
WARNING 0: This generated statement contains one or more varchar/varbinary types which default to length 80
INFER_TABLE_DDL
------------------------------------------------------------------------
Candidate matched 1 out of 2 total files:
create table "restaurants"(
"cuisine" varchar,
"location_city" Array[varchar],
"menu" Array[Row(
"item" varchar,
"price" float
)],
"name" varchar
);
Candidate matched 1 out of 2 total files:
create table "restaurants"(
"cuisine" varchar,
"location_city" Array[varchar],
"menu" Array[Row(
"items" Array[Row(
"item" varchar,
"price" numeric
)],
"time" varchar
)],
"name" varchar
);
(1 row)
16.7 - LAST_INSERT_ID
Returns the last value of an IDENTITY column. If multiple sessions concurrently load the same table with an IDENTITY column, the function returns the last value generated for that column.
Note
This function works only with IDENTITY columns. It does not work with named sequences.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
LAST_INSERT_ID()
Privileges
-
Table owner
-
USAGE privileges on the table schema
Examples
See IDENTITY sequences.
16.8 - PURGE_TABLE
Note
This function was formerly named PURGE_TABLE_PROJECTIONS(). Vertica still supports the former function name.Permanently removes deleted data from physical storage so disk space can be reused. You can purge historical data up to and including the Ancient History Mark epoch.
Purges all projections of the specified table. You cannot use this function to purge temporary tables.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
PURGE_TABLE ( '[[database.]schema.]table' )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
- The table to purge.
Privileges
-
Table owner
-
USAGE privilege on schema
Caution
PURGE_TABLE could temporarily take up significant disk space while the data is being purged.Examples
The following example purges all projections for the store sales fact table located in the Vmart schema:
=> SELECT PURGE_TABLE('store.store_sales_fact');
See also
16.9 - REBALANCE_TABLE
Synchronously rebalances data in the specified table.
A rebalance operation performs the following tasks:
-
Distributes data based on:
-
User-defined fault groups, if specified
-
Large cluster automatic fault groups
-
-
Redistributes database projection data across all nodes.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
REBALANCE_TABLE('[[database.]schema.]table-name')
Parameters
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table-name
- The table to rebalance.
Privileges
Superuser
When to rebalance
Rebalancing is useful or even necessary after you perform the following tasks:
-
Mark one or more nodes as ephemeral in preparation of removing them from the cluster.
-
Add one or more nodes to the cluster so that Vertica can populate the empty nodes with data.
-
Change the scaling factor of an elastic cluster, which determines the number of storage containers used to store a projection across the database.
-
Set the control node size or realign control nodes on a large cluster layout
-
Add nodes to or remove nodes from a fault group.
Tip
By default, before performing a rebalance, Vertica queries system tables to compute the size of all projections involved in the rebalance task. This query can add significant overhead to the rebalance operation. To disable this query, set projection configuration parameter RebalanceQueryStorageContainers to 0.Examples
The following command shows how to rebalance data on the specified table.
=> SELECT REBALANCE_TABLE('online_sales.online_sales_fact');
REBALANCE_TABLE
-------------------
REBALANCED
(1 row)
See also
16.10 - REENABLE_DUPLICATE_KEY_ERROR
Restores the default behavior of error reporting by reversing the effects of
DISABLE_DUPLICATE_KEY_ERROR
. Effects are session-scoped.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
REENABLE_DUPLICATE_KEY_ERROR();
Privileges
Superuser
Examples
=> SELECT REENABLE_DUPLICATE_KEY_ERROR();
REENABLE_DUPLICATE_KEY_ERROR
------------------------------
Duplicate key error enabled
(1 row)
See also
ANALYZE_CONSTRAINTS