This section contains storage management functions specific to Vertica.
This is the multi-page printable view of this section. Click here to print.
Storage functions
- 1: ALTER_LOCATION_LABEL
- 2: ALTER_LOCATION_USE
- 3: CLEAR_CACHES
- 4: CLEAR_OBJECT_STORAGE_POLICY
- 5: DO_TM_TASK
- 6: DROP_LOCATION
- 7: ENFORCE_OBJECT_STORAGE_POLICY
- 8: MEASURE_LOCATION_PERFORMANCE
- 9: MOVE_RETIRED_LOCATION_DATA
- 10: RESTORE_LOCATION
- 11: RETIRE_LOCATION
- 12: SET_LOCATION_PERFORMANCE
- 13: SET_OBJECT_STORAGE_POLICY
1 - ALTER_LOCATION_LABEL
Adds a label to a storage location, or changes or removes an existing label. You can change a location label if it is not specified by any storage policy.
Caution
If you label a storage location that contains data, Vertica moves the data to an unlabeled location, if one exists. To prevent data movement between storage locations, labels should be applied either to all storage locations or none.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ALTER_LOCATION_LABEL ( 'path' , '[node]' , '[location-label]' )
Parameters
path
- The storage location path.
node
- The node where the label change is applied. If you supply an empty string, Vertica applies the change across all cluster nodes.
location-label
- The label to assign to the specified storage location.
If you supply an empty string, Vertica removes that storage location's label.
You can remove a location label only if the following conditions are both true:
-
No database object has a storage policy that specifies this label.
-
The labeled location is not the last available storage for the objects associated with it.
-
Privileges
Superuser
Examples
The following ALTER_LOCATION_LABEL statement applies across all cluster nodes the label SSD
to the storage location /home/dbadmin/SSD/tables
:
=> SELECT ALTER_LOCATION_LABEL('/home/dbadmin/SSD/tables','', 'SSD');
ALTER_LOCATION_LABEL
---------------------------------------
/home/dbadmin/SSD/tables label changed.
(1 row)
See also
2 - ALTER_LOCATION_USE
Alters the type of data that a storage location holds.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ALTER_LOCATION_USE ( 'path' , '[node]' , 'usage' )
Arguments
path
- Where the storage location is mounted.
node
- The Vertica node on which to alter the storage location. To alter the location on all cluster nodes in a single transaction, use an empty string (
''
). If the usage is SHARED TEMP or SHARED USER, you must alter it on all nodes. usage
- One of the following:
-
DATA
: The storage location stores only data files. -
TEMP
: The location stores only temporary files that are created during loads or queries. -
DATA,TEMP
: The location can store both types of files.
-
Privileges
Superuser
Restrictions
You cannot change a storage location from a USER usage type if you created the location that way, or to a USER type if you did not. You can change a USER storage location to specify DATA (storing TEMP files is not supported). However, doing so does not affect the primary objective of a USER storage location, to be accessible by non-dbadmin users with assigned privileges.
You cannot change a storage location from SHARED TEMP or SHARED USER to SHARED DATA or the reverse.
Monitoring storage locations
For information about the disk storage used on each node, query the
DISK_STORAGE
system table.
Examples
The following example alters a storage location across all cluster nodes to store only data:
=> SELECT ALTER_LOCATION_USE ('/thirdSL/' , '' , 'DATA');
See also
3 - CLEAR_CACHES
Clears the Vertica internal cache files.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_CACHES ( )
Privileges
Superuser
Notes
If you want to run benchmark tests for your queries, in addition to clearing the internal Vertica cache files, clear the Linux file system cache. The kernel uses unallocated memory as a cache to hold clean disk blocks. If you are running version 2.6.16 or later of Linux and you have root access, you can clear the kernel file system cache as follows:
-
Make sure that all data in the cache is written to disk:
# sync
-
Writing to the
drop_caches
file causes the kernel to drop clean caches, entries, and inodes from memory, causing that memory to become free, as follows:-
To clear the page cache:
# echo 1 > /proc/sys/vm/drop_caches
-
To clear the entries and inodes:
# echo 2 > /proc/sys/vm/drop_caches
-
To clear the page cache, entries, and inodes:
# echo 3 > /proc/sys/vm/drop_caches
-
Examples
The following example clears the Vertica internal cache files:
=> SELECT CLEAR_CACHES();
CLEAR_CACHES
--------------
Cleared
(1 row)
4 - CLEAR_OBJECT_STORAGE_POLICY
Removes a user-defined storage policy from the specified database, schema or table. Storage containers at the previous policy's labeled location are moved to the default location. By default, this move occurs after all pending mergeout tasks return.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
CLEAR_OBJECT_STORAGE_POLICY ( 'object-name' [,'key-min', 'key-max'] [, 'enforce-storage-move' ] )
Parameters
object-name
- The object to clear, one of the following:
-
database
: Clearsdatabase
of its storage policy. -
[
database
.]
schema
: Clearsschema
of its storage policy. -
[[
database
.]
schema
.]
table
: Clearstable
of its storage policy. Iftable
is in any schema other thanpublic
, you must supply the schema name.
In all cases,
database
must be the name of the current database. -
key-min
key-max
- Valid only if
object-name
is a table, specifies the range of table partition key values stored at the labeled location. enforce-storage-move
- Specifies when the Tuple Mover moves all existing storage containers for the specified object to its default storage location:
-
false
(default): Move storage containers only after all pending mergeout tasks return. -
true
: Immediately move all storage containers to the new location.
Tip
You can also enforce all storage policies immediately by calling Vertica meta-functionENFORCE_OBJECT_STORAGE_POLICY
. -
Privileges
Superuser
Examples
This following statement clears the storage policy for table store.store_orders_fact
. The true
argument specifies to implement the move immediately:
=> SELECT CLEAR_OBJECT_STORAGE_POLICY ('store.store_orders_fact', 'true');
CLEAR_OBJECT_STORAGE_POLICY
-----------------------------------------------------------------------------
Object storage policy cleared.
Task: moving storages
(Table: store.store_orders_fact) (Projection: store.store_orders_fact_b0)
(Table: store.store_orders_fact) (Projection: store.store_orders_fact_b1)
(1 row)
See also
5 - DO_TM_TASK
Runs a Tuple Mover (TM) operation and commits current transactions. You can limit this operation to a specific table or projection. When started using this function, the TM uses the GENERAL resource pool instead of the TM resource pool.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DO_TM_TASK('task'[, '[[database.]schema.]{ table | projection}]' )
Parameters
task
- Specifies one of the following tuple mover operations:
-
mergeout
: Consolidates ROS containers and purges deleted records. For details, seeMergeout. -
reshardmergeout
: Realigns storage containers to the shard definitions created by a RESHARD_DATABASE call. Specify a table or projection and a range of partition values to limit the scope of thereshardmergeout
operations. -
analyze_row_count
: Collects a minimal set of statistics and aggregate row counts for the specified projections, and saves it in the database catalog. Collects the number of rows in the specified projection. If you specify a table name, DO_TM_TASK returns the row counts for all projections of that table. For details, see Analyzing row counts. -
update_storage_catalog
(recommended only for Eon Mode): Updates the catalog with metadata on bundled table data. For details, see Writing bundle metadata to the catalog.
-
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.table
|
projection
- Applies
task
to the specified table or projection. If you specify a projection and it is not found, DO_TM_TASK looks for a table with that name and, if found, applies the task to it and all projections associated with it.If you specify no table or projection, the task is applied to all database tables and their projections.
Privileges
-
Schema: USAGE
-
Table: One of INSERT, UPDATE, or DELETE
Examples
The following example performs a mergeout on all projections in a table:
=> SELECT DO_TM_TASK('mergeout', 't1');
You can perform a reshard mergeout task on a range of partitions of a table:
=> SELECT DO_TM_TASK('reshardmergeout', 'store_orders', '2001', '2005');
6 - DROP_LOCATION
Permanently removes a retired storage location. This operation cannot be undone. You must first retire a storage location with RETIRE_LOCATION before dropping it; you cannot drop a storage location that is in use.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
DROP_LOCATION ( 'path', 'node' )
Arguments
path
- Where the storage location to drop is mounted.
node
- The Vertica node on which to drop the location. To perform this operation on all nodes, use an empty string (
''
). If the storage location is SHARED, you must perform this operation on all nodes.
Privileges
Superuser
Storage locations with temp and data files
If you use a storage location to store data and then alter it to store only temp files, the location can still contain data files. Vertica lets you drop a storage location containing data files only if it is a communal storage location. You can use the MOVE_RETIRED_LOCATION_DATA function to manually merge out the data files from the storage location, or you can drop partitions. Deleting data files does not work. If the dropped location is a communal storage location, its storage containers, if any, are moved to the main communal storage location, which is the storage location set on database creation. You cannot drop the main communal storage location.
Examples
The following example shows how to drop a previously retired storage location on v_vmart_node0003
:
=> SELECT DROP_LOCATION('/data', 'v_vmart_node0003');
See also
7 - ENFORCE_OBJECT_STORAGE_POLICY
Enterprise Mode only
Applies storage policies of the specified object immediately. By default, the Tuple Mover enforces object storage policies after all pending mergeout operations are complete. Calling this function is equivalent to setting the enforce
argument when using RETIRE_LOCATION. You typically use this function as the last step before dropping a storage location.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
ENFORCE_OBJECT_STORAGE_POLICY ( 'object-name' [,'key-min', 'key-max'] )
Arguments
object-name
- The database object whose storage policies are to be applied, one of the following:
-
database
: Appliesdatabase
storage policies. -
[
database
.]
schema
: Appliesschema
storage policies. -
[[
database
.]
schema
.]
table
: Appliestable
storage policies. Iftable
is in any schema other thanpublic
, you must supply the schema name.
In all cases,
database
must be the name of the current database. -
key-min
,key-max
- Valid only if
object-name
is a table, specifies the range of table partition key values on which to perform the move.
Privileges
One of the following:
-
Superuser
-
Object owner and access to its storage location.
Examples
Apply storage policy updates to the test
table:
=> SELECT ENFORCE_OBJECT_STORAGE_POLICY ('test');
See also
8 - MEASURE_LOCATION_PERFORMANCE
Measures a storage location's disk performance.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
MEASURE_LOCATION_PERFORMANCE ( 'path', 'node' )
Parameters
path
- Specifies where the storage location to measure is mounted.
node
- The Vertica node where the location to be measured is available. To obtain a list of all node names on the cluster, query system table DISK_STORAGE.
Privileges
Superuser
Notes
-
If you intend to create a tiered disk architecture in which projections, columns, and partitions are stored on different disks based on predicted or measured access patterns, you need to measure storage location performance for each location in which data is stored. You do not need to measure storage location performance for temp data storage locations because temporary files are stored based on available space.
-
The method of measuring storage location performance applies only to configured clusters. If you want to measure a disk before configuring a cluster see Measuring storage performance.
-
Storage location performance equates to the amount of time it takes to read and write 1MB of data from the disk. This time equates to:
IO-time = (time-to-read-write-1MB + time-to-seek) = (1/throughput + 1/latency)
Throughput is the average throughput of sequential reads/writes (units in MB per second).
Latency is for random reads only in seeks (units in seeks per second)
Note
The IO time of a faster storage location is less than a slower storage location.
Examples
The following example measures the performance of a storage location on v_vmartdb_node0004:
=> SELECT MEASURE_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/' , 'v_vmartdb_node0004');
WARNING: measure_location_performance can take a long time. Please check logs for progress
measure_location_performance
--------------------------------------------------
Throughput : 122 MB/sec. Latency : 140 seeks/sec
See also
9 - MOVE_RETIRED_LOCATION_DATA
Moves all data from the specified retired storage location or from all retired storage locations in the database. MOVE_RETIRED_LOCATION_DATA
migrates the data to non-retired storage locations according to the storage policies of the objects whose data is stored in the location. This function returns only after it completes migration of all affected storage location data.
Note
The Tuple Mover migrates data of retired storage locations when it consolidates data into larger ROS containers.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
MOVE_RETIRED_LOCATION_DATA( ['location-path'] [, 'node'] )
Arguments
location-path
- The path of the storage location as specified in the
LOCATION_PATH
column of system tableSTORAGE_LOCATIONS
. This storage location must be marked as retired.If you omit this argument,
MOVE_RETIRED_LOCATION_DATA
moves data from all retired storage locations. node
- The node on which to move data of the retired storage location. If
location-path
is undefined onnode
, this function returns an error.If you omit this argument,
MOVE_RETIRED_LOCATION_DATA
moves data from*location-path
* on all nodes.
Privileges
Superuser
Examples
-
Query system table
STORAGE_LOCATIONS
to show which storage locations are retired:=> SELECT node_name, location_path, location_label, is_retired FROM STORAGE_LOCATIONS WHERE is_retired = 't'; node_name | location_path | location_label | is_retired ------------------+----------------------+----------------+------------ v_vmart_node0001 | /home/dbadmin/SSDLoc | ssd | t v_vmart_node0002 | /home/dbadmin/SSDLoc | ssd | t v_vmart_node0003 | /home/dbadmin/SSDLoc | ssd | t (3 rows)
-
Query system table
STORAGE_LOCATIONS
for the location of the messages table, which is currently stored in retired storage locationssd
:=> SELECT node_name, total_row_count, location_label FROM STORAGE_CONTAINERS WHERE projection_name ILIKE 'messages%'; node_name | total_row_count | location_label ------------------+-----------------+---------------- v_vmart_node0001 | 333514 | ssd v_vmart_node0001 | 333255 | ssd v_vmart_node0002 | 333255 | ssd v_vmart_node0002 | 333231 | ssd v_vmart_node0003 | 333231 | ssd v_vmart_node0003 | 333514 | ssd (6 rows)
-
Call
MOVE_RETIRED_LOCATION_DATA
to move the data off thessd
storage location.=> SELECT MOVE_RETIRED_LOCATION_DATA('/home/dbadmin/SSDLoc'); MOVE_RETIRED_LOCATION_DATA ----------------------------------------------- Move data off retired storage locations done (1 row)
-
Repeat the previous query to verify the storage location of the messages table:
=> SELECT node_name, total_row_count, storage_type, location_label FROM storage_containers WHERE projection_name ILIKE 'messages%'; node_name | total_row_count | location_label ------------------+-----------------+---------------- v_vmart_node0001 | 333255 | base v_vmart_node0001 | 333514 | base v_vmart_node0003 | 333514 | base v_vmart_node0003 | 333231 | base v_vmart_node0002 | 333231 | base v_vmart_node0002 | 333255 | base (6 rows)
See also
10 - RESTORE_LOCATION
Restores a storage location that was previously retired with RETIRE_LOCATION.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RESTORE_LOCATION ( 'path', 'node' )
Arguments
path
- Where to mount the retired storage location.
node
- The Vertica node on which to restore the location. To perform this operation on all nodes, use an empty string (
''
). If the storage location is SHARED, you must perform this operation on all nodes.The operation fails if you dropped any locations.
Privileges
Superuser
Effects of restoring a previously retired location
After restoring a storage location, Vertica re-ranks all of the cluster storage locations. It uses the newly restored location to process queries as determined by its rank.
Monitoring storage locations
For information about the disk storage used on each node, query the
DISK_STORAGE
system table.
Examples
Restore a retired storage location on node4
:
=> SELECT RESTORE_LOCATION ('/thirdSL/' , 'v_vmartdb_node0004');
See also
11 - RETIRE_LOCATION
Deactivates the specified storage location. To obtain a list of all existing storage locations, query the STORAGE_LOCATIONS system table.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
RETIRE_LOCATION ( 'path', 'node' [, enforce ] )
Arguments
path
- Where the storage location to retire is mounted.
node
- The Vertica node on which to retire the location. To perform this operation on all nodes, use an empty string (
''
). If the storage location is SHARED, you must perform this operation on all nodes. enforce
- If
true
, the location label is set to an empty string and the data is moved elsewhere. The location can then be dropped without errors or warnings. Use this argument to expedite dropping a location.
Privileges
Superuser
Effects of retiring a storage location
RETIRE_LOCATION checks that the location is not the only storage for data and temp files. At least one location must exist on each node to store data and temp files. However, you can store both sorts of files in either the same location or separate locations.
If a location is the last available storage for its associated objects, you can retire it only if you set enforce
to true
.
When you retire a storage location:
-
No new data is stored at the retired location, unless you first restore it using RESTORE_LOCATION.
-
By default, if the storage location being retired contains stored data, the data is not moved. Thus, you cannot drop the storage location. Instead, Vertica removes the stored data through one or more mergeouts. To drop the location immediately after retiring it, set
enforce
to true. -
If the storage location being retired is used only for temp files or you use
enforce
, you can drop the location. See Dropping storage locations and DROP_LOCATION.
Monitoring storage locations
For information about the disk storage used on each node, query the
DISK_STORAGE
system table.
Examples
The following examples show two approaches to retiring a storage location.
You can retire a storage location and its data will be moved out automatically at a future time:
=> SELECT RETIRE_LOCATION ('/data' , 'v_vmartdb_node0004');
You can specify that data in the storage location be moved immediately, so that you can then drop the location without waiting:
=> SELECT RETIRE_LOCATION ('/data' , 'v_vmartdb_node0004', true);
See also
12 - SET_LOCATION_PERFORMANCE
Sets disk performance for a storage location.
Note
Before calling this function, call MEASURE_LOCATION_PERFORMANCE to obtain the location's throughput and average latency .This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_LOCATION_PERFORMANCE ( 'path', 'node' , 'throughput', 'average-latency')
Parameters
path
- Specifies where the storage location to set is mounted.
node
- Specifies the Vertica node where the location to set is available.
throughput
- Specifies the throughput for the location, set to a value ≥1.
average-latency
- Specifies the average latency for the location, set to a value ≥1.
Privileges
Superuser
Examples
The following example sets the performance of a storage location on node2 to a throughput of 122 megabytes per second and a latency of 140 seeks per second.
=> SELECT SET_LOCATION_PERFORMANCE('/secondVerticaStorageLocation/','node2','122','140');
See also
13 - SET_OBJECT_STORAGE_POLICY
Creates or changes the storage policy of a database object by assigning it a labeled storage location. The Tuple Mover uses this location to store new and existing data for this object. If the object already has an active storage policy, calling SET_OBJECT_STORAGE_POLICY
sets this object's default storage to the new labeled location. Existing data for the object is moved to the new location.
Note
You cannot create a storage policy on a USER type storage location.This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
VolatileSyntax
SET_OBJECT_STORAGE_POLICY (
'[[database.]schema.]object-name', 'location-label'
[,'key-min', 'key-max'] [, 'enforce-storage-move' ] )
Parameters
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.object-name
- Identifies the database object assigned to a labeled storage location. The
object-name
can resolve to a database, schema, or table. location-label
- The label of
object-name
's storage location. key-min
key-max
- Valid only if
object-name
is a table, specifies the range of table partition key values to store at the labeled location. enforce-storage-move
- Specifies when the Tuple Mover moves all existing storage containers for
object-name
to the labeled storage location:-
false
(default): Move storage containers only after all pending mergeout tasks return. -
true
: Immediately move all storage containers to the new location.
Tip
You can also enforce all storage policies immediately by calling Vertica meta-functionENFORCE_OBJECT_STORAGE_POLICY
-
Privileges
One of the following:
-
Superuser
-
Object owner and access to its storage location.
Examples
The following example changes the storage policy of table t1
from the communal storage location with label main
to the communal storage location with the label s3
.
To see where the projections for the table t1
are stored before the policy change, you can query the STORAGE_CONTAINERS system table:
=> SELECT node_name, schema_name, projection_name, location_label, shard_name FROM STORAGE_CONTAINERS WHERE projection_name = 't1_super' ORDER BY shard_name, sal_storage_id;
node_name | schema_name | projection_name | location_label | shard_name
-----------+-------------+-----------------+----------------+-------------
e1 | public | t1_super | main | segment0003
initiator | public | t1_super | main | segment0003
e1 | public | t1_super | main | segment0003
initiator | public | t1_super | main | segment0003
e1 | public | t1_super | main | segment0003
initiator | public | t1_super | main | segment0003
e1 | public | t1_super | main | segment0003
initiator | public | t1_super | main | segment0003
(8 rows)
Call SET_OBJECT_STORAGE_POLICY
to change the storage policy for t1
to the communal storage location with the s3
label:
=> SELECT SET_OBJECT_STORAGE_POLICY('t1', 's3');
SET_OBJECT_STORAGE_POLICY
----------------------------
Object storage policy set.
(1 row)
Query the STORAGE_CONTAINERS system table to confirm that the object is now stored in the communal storage location with the label s3
:
=> SELECT node_name, schema_name, projection_name, location_label, shard_name FROM STORAGE_CONTAINERS WHERE projection_name = 't1_super' ORDER BY shard_name, sal_storage_id;
node_name | schema_name | projection_name | location_label | shard_name
-----------+-------------+-----------------+----------------+-------------
e1 | public | t1_super | s3 | segment0003
initiator | public | t1_super | s3 | segment0003
e1 | public | t1_super | s3 | segment0003
initiator | public | t1_super | s3 | segment0003
e1 | public | t1_super | s3 | segment0003
initiator | public | t1_super | s3 | segment0003
e1 | public | t1_super | s3 | segment0003
initiator | public | t1_super | s3 | segment0003
(8 rows)