SANDBOX_SUBCLUSTER

Creates a sandbox for a secondary subcluster.

Note

Vertica recommends using the admintools sandbox_subcluster command to create sandboxes. This command includes additional sanity checks and validates that the sandboxed nodes are UP after sandbox creation. For details, see Creating sandboxes.

At function runtime, the nodes in the specified subcluster create a checkpoint of the catalog. When these nodes auto-restart in the sandbox, they form a new primary cluster that uses the data and catalog checkpoint from the main cluster. After the nodes successfully restart, the sandboxed subcluster and the main cluster are mutually isolated and can diverge.

While the nodes in the main cluster sync their metadata to /path-to-communal-storage/`metadata`/db_name, the nodes in the sandbox sync to /path-to-communal-storage/`metadata`/sandbox_name.

You can perform standard database operations and queries, such as loading data or creating new tables, in either cluster without affecting the other cluster. For example, dropping a table in the sandboxed subcluster does not drop the table in the main cluster, and vice versa.

Because both clusters reference the same data files, neither cluster can delete files that existed at the time of sandbox creation. However, files that are created in the sandbox can be removed. Files in the main cluster can be queued for removal, but they are not processed until all active sandboxes are removed.

You cannot nest sandboxes, but you can have multiple individual sandboxes active at the same time.

This is a meta-function. You must call-meta-functions in a top-level SELECT statement. The function also requires a global catalog lock (GCLX) during runtime.

Behavior type

Volatile

Syntax

SANDBOX_SUBCLUSTER( 'sandbox-name', 'subcluster-name', 'options' )

Arguments

sandbox-name

Name of the sandbox. The name must conform to the following rules:

Consist of at most 30 characters, all of which must have an ASCII code between 36 and 126
Begin with a letter
Unique among all existing databases and sandboxes

subcluster-name

Name of the secondary subcluster to sandbox. Attempting to sandbox a primary subcluster or a subcluster that is already sandboxed results in an error. The nodes in the subcluster must all have a status of UP and provide full subscription coverage for all shards.

options

Currently, there are no options for this function.

Privileges

Superuser

Examples

The following example sandboxes the sc02 secondary subcluster into a sandbox named sand:

=> SELECT SANDBOX_SUBCLUSTER('sand', 'sc_02', '');
                                SANDBOX_SUBCLUSTER
-----------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been sandboxed to 'sand'. It is going to auto-restart and re-form.
(1 row)

If you query the NODES system table from the main cluster, you can see that the nodes of sc_02 have a status of UNKNOWN and are listed as member of the sand sandbox:

=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
      node_name       |  subcluster_name   | node_state | sandbox
----------------------+--------------------+------------+---------
 v_verticadb_node0001 | default_subcluster | UP         |
 v_verticadb_node0002 | default_subcluster | UP         |
 v_verticadb_node0003 | default_subcluster | UP         |
 v_verticadb_node0004 | sc_02              | UNKNOWN    | sand
 v_verticadb_node0005 | sc_02              | UNKNOWN    | sand
 v_verticadb_node0006 | sc_02              | UNKNOWN    | sand
(6 rows)

When you issue the same query on one of the sandboxed nodes, the table shows that the sandboxed nodes are UP and the nodes from the main cluster are UNKNOWN, confirming that the cluster is successfully sandboxed:

=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
      node_name       |  subcluster_name   | node_state | sandbox
----------------------+--------------------+------------+---------
 v_verticadb_node0001 | default_subcluster | UNKNOWN    |
 v_verticadb_node0002 | default_subcluster | UNKNOWN    |
 v_verticadb_node0003 | default_subcluster | UNKNOWN    |
 v_verticadb_node0004 | sc_02              | UP         | sand
 v_verticadb_node0005 | sc_02              | UP         | sand
 v_verticadb_node0006 | sc_02              | UP         | sand
(6 rows)

You can now perform standard database operations in either cluster without impacting the other cluster. For instance, if you create a machine learning dataset named train_data in the sandboxed subcluster, the new table does not propagate to the main cluster:

--In the sandboxed subcluster
=> CREATE TABLE train_data(time timestamp, Temperature float);
CREATE TABLE

=> COPY train_data FROM LOCAL 'daily-min-temperatures.csv' DELIMITER ',';
 Rows Loaded
-------------
    3650
(1 row)

=> SELECT * FROM train_data LIMIT 5;
        time         | Temperature
---------------------+-------------
 1981-01-27 00:00:00 |        19.4
 1981-02-20 00:00:00 |        15.7
 1981-02-27 00:00:00 |        17.5
 1981-03-04 00:00:00 |          16
 1981-04-24 00:00:00 |        11.5
(5 rows)

--In the main cluster
=> SELECT * FROM train_data LIMIT 5;
ERROR 4566:  Relation "train_data" does not exist

SANDBOX_SUBCLUSTER

Note

Behavior type

Syntax

Arguments

Privileges

Examples

See also