Subcluster sandboxing
Sandboxing enables you to spin off secondary subclusters from an existing cluster, resulting in two or more mutually isolated clusters that share the same data but do not interfere with each other. Sandboxed clusters inherit the state of the catalog and data of the main cluster at the time of sandbox creation. As soon as the sandbox is active, the catalog and data of the two clusters are independent and can diverge. Within each cluster, you can perform standard database operations and queries, such as creating new tables or loading libraries, without affecting the other cluster. For example, dropping a table in the sandboxed cluster does not drop the table in the main cluster, and vice versa.
Sandboxes support many use cases, including the following:
- Try out a version of Vertica without needing to spin up a new cluster and reload data. After creating a sandbox, you can upgrade sandboxed subclusters and test out the new Vertica features. To rejoin sandboxed subclusters to the main cluster, you just need to downgrade the Vertica version and perform the necessary unsandboxing tasks.
- Experiment with new features without compromising the consistency of the main cluster. For instance, you could spin off a sandbox and experiment with external tables using data stored by Apache Iceberg.
- Share data with another team by giving them access to a sandboxed cluster. This gives the other team the necessary data access, but keeps their changes separate from the main cluster. Anything the other team changes in the sandbox, such as dropping tables, does not propagate back to the main cluster.
After removing the sandbox and performing necessary cleanup tasks, subclusters can rejoin the main cluster.
1 - Creating sandboxes
To create a sandbox for a secondary subcluster, all nodes in the subcluster must have a status of UP and collectively provide full-subscription coverage for all shards.
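You can verify both conditions from the main cluster before sandboxing. The following is a minimal sketch, assuming a secondary subcluster named sc_01 and the NODES and SUBSCRIPTIONS system tables; the subcluster provides full coverage if every shard has at least one ACTIVE subscription from one of its nodes:
=> SELECT node_name, node_state FROM nodes WHERE subcluster_name = 'sc_01';

=> SELECT shard_name, node_name, subscription_state
   FROM subscriptions
   WHERE node_name IN (SELECT node_name FROM nodes WHERE subcluster_name = 'sc_01')
   ORDER BY shard_name;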
To sandbox the first subcluster in a sandbox, use the sandbox_subcluster admintools command:
$ adminTools -t sandbox_subcluster -h
Usage: sandbox_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database to be modified
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
-c SCNAME, --subcluster=SCNAME
Name of subcluster to be sandboxed
-b SBNAME, --sandbox=SBNAME
Name of the sandbox
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
At command runtime, under a global catalog lock (GCLX), the nodes in the specified subcluster create a checkpoint of the catalog. When these nodes auto-restart in the sandbox cluster, they form a primary subcluster that uses the data and catalog checkpoint from the main cluster. After the nodes successfully restart, the sandbox cluster and the main cluster are mutually isolated and can diverge.
While the nodes in the main cluster sync their metadata to /path-to-communal-storage/metadata/db_name, the nodes in the sandbox sync to /path-to-communal-storage/metadata/sandbox_name.
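For example, if the communal storage location is an S3 bucket, you can list both sets of metadata side by side. This is only a sketch; the bucket path is a placeholder, and the prefixes depend on your database and sandbox names:
$ aws s3 ls s3://path-to-communal-storage/metadata/
                           PRE verticadb/
                           PRE sand/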
Note
Though the sandbox cluster and main cluster are mutually isolated, the nodes in both clusters remain in the same Spread ring. However, if the main cluster and sandbox cluster partition from each other, which can be caused by events like Spread communication problems, the two clusters are not affected.
You can perform standard database operations and queries, such as loading data or creating new tables, in either cluster without affecting the other cluster. For example, dropping a table in the sandbox cluster does not drop the table in the main cluster, and vice versa.
Because both clusters reference the same data files, neither cluster can delete files that existed at the time of sandbox creation. However, the sandbox can remove files that it creates after spin-off from the main cluster. Files in the main cluster are queued for removal, but they are not processed until all active sandboxes are removed.
You cannot nest sandboxes, remove a sandboxed subcluster from the database, or add nodes to or remove nodes from an existing sandbox, but you can have multiple individual sandboxes active at the same time.
Adding subclusters to existing sandboxes
You can add additional secondary subclusters to existing sandbox clusters using the SANDBOX_SUBCLUSTER function. Added subclusters must be secondary, cannot be members of any other sandbox cluster, and all of their nodes must have a status of UP.
When sandboxing additional subclusters, you must first call the SANDBOX_SUBCLUSTER function in the sandbox cluster and then in the main cluster, providing the same sandbox cluster and subcluster names in both calls. This makes both clusters aware of the status of the sandboxed subcluster. After the function is called in both clusters, the subcluster automatically restarts, joins the sandbox cluster as a secondary subcluster, and subscribes to the sandbox cluster's shards.
Examples
The following command sandboxes the sc_01 secondary subcluster into a sandbox named sand:
$ admintools -t sandbox_subcluster -d verticadb -p password -c sc_01 -b sand
Validating sandboxing conditions
Sandboxing subcluster sc_01 as sand...
Subcluster 'sc_01' has been sandboxed to 'sand'. It is going to auto-restart and re-form.
Checking for sandboxed nodes to be UP...
Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0004: (UP) v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
Sandboxing complete. Subcluster sc_01 is ready for use
If you query the NODES system table from the main cluster, you can see that the sc_01 nodes have a status of UNKNOWN and are listed as members of the sand sandbox:
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UP |
v_verticadb_node0002 | default_subcluster | UP |
v_verticadb_node0003 | default_subcluster | UP |
v_verticadb_node0004 | sc_01 | UNKNOWN | sand
v_verticadb_node0005 | sc_01 | UNKNOWN | sand
v_verticadb_node0006 | sc_01 | UNKNOWN | sand
v_verticadb_node0007 | sc_02 | UP |
v_verticadb_node0008 | sc_02 | UP |
v_verticadb_node0009 | sc_02 | UP |
(9 rows)
Note
If the sandboxed and main clusters are sufficiently incompatible or are Spread-partitioned from each other, the status of the sandboxed nodes might appear as DOWN in the above query.
When you issue the same query on one of the sandboxed nodes, the table shows that the sandboxed nodes are UP and the nodes from the main cluster are UNKNOWN, confirming that the cluster is successfully sandboxed and isolated from the main cluster:
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UNKNOWN |
v_verticadb_node0002 | default_subcluster | UNKNOWN |
v_verticadb_node0003 | default_subcluster | UNKNOWN |
v_verticadb_node0004 | sc_01 | UP | sand
v_verticadb_node0005 | sc_01 | UP | sand
v_verticadb_node0006 | sc_01 | UP | sand
v_verticadb_node0007 | sc_02 | UNKNOWN |
v_verticadb_node0008 | sc_02 | UNKNOWN |
v_verticadb_node0009 | sc_02 | UNKNOWN |
(9 rows)
You can now perform standard database operations in either cluster without impacting the other cluster. For instance, if you create a machine learning dataset named train_data in the sandboxed subcluster, the new table does not propagate to the main cluster:
--In the sandboxed subcluster
=> CREATE TABLE train_data(time timestamp, Temperature float);
CREATE TABLE
=> COPY train_data FROM LOCAL 'daily-min-temperatures.csv' DELIMITER ',';
Rows Loaded
-------------
3650
(1 row)
=> SELECT * FROM train_data LIMIT 5;
time | Temperature
---------------------+-------------
1981-01-27 00:00:00 | 19.4
1981-02-20 00:00:00 | 15.7
1981-02-27 00:00:00 | 17.5
1981-03-04 00:00:00 | 16
1981-04-24 00:00:00 | 11.5
(5 rows)
--In the main cluster
=> SELECT * FROM train_data LIMIT 5;
ERROR 4566: Relation "train_data" does not exist
Similarly, if you drop a table in the main cluster, the table is not subsequently dropped in the sandboxed cluster:
--In the main cluster
=> SELECT * FROM transaction_data LIMIT 5;
first_name | last_name | store | cost | fraud
------------+-----------+-----------+---------+-------
Adam | Rice | Gembucket | 8757.35 | FALSE
Alan | Gordon | Wrapsafe | 3874.48 | FALSE
Albert | Harvey | Treeflex | 1558.27 | FALSE
Andrea | Bryant | Greenlam | 1148.2 | FALSE
Andrew | Simmons | Home Ing | 8400.03 | FALSE
(5 rows)
=> DROP TABLE transaction_data;
DROP TABLE
--In the sandboxed subcluster
=> SELECT * FROM transaction_data LIMIT 5;
first_name | last_name | store | cost | fraud
------------+-----------+-----------+---------+-------
Adam | Rice | Gembucket | 8757.35 | FALSE
Alan | Gordon | Wrapsafe | 3874.48 | FALSE
Albert | Harvey | Treeflex | 1558.27 | FALSE
Andrea | Bryant | Greenlam | 1148.2 | FALSE
Andrew | Simmons | Home Ing | 8400.03 | FALSE
(5 rows)
To add an additional secondary subcluster to the sandbox cluster, call SANDBOX_SUBCLUSTER in the sandbox cluster and then in the main cluster:
--In the sandbox cluster
=> SELECT SANDBOX_SUBCLUSTER('sand', 'sc_02', '');
sandbox_subcluster
------------------------------------------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been added to 'sand'. When the same command is executed in the main cluster, it can join the sandbox.
(1 row)
--In the main cluster
=> SELECT SANDBOX_SUBCLUSTER('sand', 'sc_02', '');
sandbox_subcluster
-------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been sandboxed to 'sand'. It is going to auto-restart and re-form.
(1 row)
--In the sandbox cluster
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UNKNOWN |
v_verticadb_node0002 | default_subcluster | UNKNOWN |
v_verticadb_node0003 | default_subcluster | UNKNOWN |
v_verticadb_node0004 | sc_01 | UP | sand
v_verticadb_node0005 | sc_01 | UP | sand
v_verticadb_node0006 | sc_01 | UP | sand
v_verticadb_node0007 | sc_02 | UP | sand
v_verticadb_node0008 | sc_02 | UP | sand
v_verticadb_node0009 | sc_02 | UP | sand
(9 rows)
If you decide to upgrade your sandboxed cluster, you can confirm that the main cluster and sandboxed cluster are running two different Vertica versions by comparing their build_info values in the NODES system table:
=> SELECT node_name, subcluster_name, node_state, sandbox, build_info FROM NODES;
node_name | subcluster_name | node_state | sandbox | build_info
----------------------+--------------------+------------+---------+------------
v_verticadb_node0001 | default_subcluster | UP | | v12.0.4-0
v_verticadb_node0002 | default_subcluster | UP | | v12.0.4-0
v_verticadb_node0003 | default_subcluster | UP | | v12.0.4-0
v_verticadb_node0004 | sc_01 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0005 | sc_01 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0006 | sc_01 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0007 | sc_02 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0008 | sc_02 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0009 | sc_02 | UNKNOWN | sand | v12.0.4-1
(9 rows)
2 - Removing sandboxes
To remove a sandbox's primary subcluster from the sandbox and return it to the main cluster, you can run the unsandbox_subcluster admintools command:
$ adminTools -t unsandbox_subcluster -h
Usage: unsandbox_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database to be modified
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
-c SCNAME, --subcluster=SCNAME
Name of subcluster to be un-sandboxed
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
Note
If you upgraded the Vertica version of the sandboxed subcluster, you must downgrade the version of the subcluster before rejoining it to the main cluster.
The unsandbox_subcluster command stops the nodes in the sandboxed subcluster, changes the metadata in the main cluster that designates the specified subcluster as sandboxed, wipes the nodes' local catalogs, and then restarts the nodes. After the nodes restart, they rejoin the main cluster and inherit the current state of the main cluster's catalog. The nodes should then be back to their normal state and can be used as expected.
Because the sandbox synced its metadata to the same communal storage location as the main cluster, you must remove the metadata files that were created in the sandbox. Those files can be found by replacing the name of the database in the path to the metadata with the name of the sandbox (for instance, /path-to-communal-storage/metadata/sandbox_name instead of /path-to-communal-storage/metadata/db_name). Removing these files helps avoid problems that might arise from reusing the same sandbox name.
If there are no more active sandboxes, the main cluster can resume the processing of data queued for deletion. To remove any data created in the sandbox, you can run the CLEAN_COMMUNAL_STORAGE function.
You can also unsandbox a subcluster using the UNSANDBOX_SUBCLUSTER meta-function, but you must manually stop the nodes, wipe their catalog subdirectories, run the function, and restart the nodes.
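The manual sequence might look like the following sketch. It assumes a sandboxed subcluster named sc_01 in the sand sandbox, assumes that UNSANDBOX_SUBCLUSTER takes the same sandbox, subcluster, and options arguments as the SANDBOX_SUBCLUSTER calls shown earlier, and uses a hypothetical catalog path; confirm the catalog location on your own nodes before wiping anything:
$ admintools -t stop_subcluster -d verticadb -p password -c sc_01

# On each stopped node, clear the local catalog contents (path is hypothetical)
$ rm -rf /vertica/data/verticadb/v_verticadb_node0004_catalog/Catalog/*

--In the main cluster
=> SELECT UNSANDBOX_SUBCLUSTER('sand', 'sc_01', '');

$ admintools -t restart_node -d verticadb -p password -s 192.168.111.34,192.168.111.35,192.168.111.36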
Removing a sandbox's secondary subclusters
If your sandbox has additional secondary subclusters, you can remove them from the sandbox using the UNSANDBOX_SUBCLUSTER function. As with the unsandbox_subcluster admintools command, the function requires that the nodes in the specified subcluster are down. Any remaining subclusters in the sandbox cluster continue to operate as normal.
After you call the function in the main cluster and wipe the nodes' catalogs, you can restart the nodes to rejoin them to the main cluster. Vertica recommends that you also call the UNSANDBOX_SUBCLUSTER function in the sandbox cluster. This makes sure that both clusters are aware of the status of the subcluster and that relevant system tables accurately reflect the subcluster's state.
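For instance, to remove the sc_02 secondary subcluster from the sand sandbox after stopping its nodes, the calls might look like the following sketch, which assumes that UNSANDBOX_SUBCLUSTER accepts the same sandbox, subcluster, and options arguments as SANDBOX_SUBCLUSTER:
--In the main cluster
=> SELECT UNSANDBOX_SUBCLUSTER('sand', 'sc_02', '');

--In the sandbox cluster
=> SELECT UNSANDBOX_SUBCLUSTER('sand', 'sc_02', '');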
Examples
The following command unsandboxes the sc_02 secondary subcluster from the sand sandbox. The command stops the nodes in sc_02, wipes the nodes' catalogs, and then restarts the nodes. After the nodes restart, they should rejoin the main cluster and be ready for normal use:
$ admintools -t unsandbox_subcluster -d verticadb -p vertica -c analytics
Stopping subcluster nodes for unsandboxing...
Sending signal 'TERM' to ['192.168.111.34', '192.168.111.35', '192.168.111.36']
Successfully sent signal 'TERM' to hosts ['192.168.111.34', '192.168.111.35', '192.168.111.36'].
Details:
Host: 192.168.111.34 - Success - PID: 267860 Signal TERM
Host: 192.168.111.35 - Success - PID: 285917 Signal TERM
Host: 192.168.111.36 - Success - PID: 245272 Signal TERM
Checking for processes to be down
All processes are down.
Details:
Host 192.168.111.34 Success process 267860 is down
Host 192.168.111.35 Success process 285917 is down
Host 192.168.111.36 Success process 245272 is down
Unsandboxing Subcluster analytics...
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2023-03-01 13:23:37. See /opt/vertica/log/adminTools.log for full details.
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2023-03-01 13:23:47. See /opt/vertica/log/adminTools.log for full details.
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2023-03-01 13:23:57. See /opt/vertica/log/adminTools.log for full details.
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2023-03-01 13:24:07. See /opt/vertica/log/adminTools.log for full details.
Subcluster 'analytics' has been unsandboxed. If wiped out and restarted, it should be able to rejoin the cluster.
Removing Catalog directory contents from subcluster nodes...
Catalog cleanup complete!
Restarting unsandboxed nodes to re-join the main cluster...
Restarting host [192.168.111.34] with catalog [v_verticadb_node0004_catalog]
Restarting host [192.168.111.35] with catalog [v_verticadb_node0005_catalog]
Restarting host [192.168.111.36] with catalog [v_verticadb_node0006_catalog]
Issuing multi-node restart
Starting nodes:
v_verticadb_node0004 (192.168.111.34)
v_verticadb_node0005 (192.168.111.35)
v_verticadb_node0006 (192.168.111.36)
Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0004: (UP) v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
Syncing catalog on verticadb with 2000 attempts.
Unsandboxed nodes have restarted successfully and joined the main cluster and are ready to use
When the admintools command completes, you can query the NODES system table to confirm that the previously sandboxed nodes are UP and are no longer members of sand:
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UP |
v_verticadb_node0002 | default_subcluster | UP |
v_verticadb_node0003 | default_subcluster | UP |
v_verticadb_node0004 | sc_02 | UP |
v_verticadb_node0005 | sc_02 | UP |
v_verticadb_node0006 | sc_02 | UP |
(6 rows)
If there are no more active sandboxes, you can run the CLEAN_COMMUNAL_STORAGE function to remove any data created in the sandbox. You should also remove the sandbox's metadata from the shared communal storage location, which can be found at /path-to-communal-storage/metadata/sandbox_name. The following example removes the sandbox's metadata from an S3 bucket and then calls CLEAN_COMMUNAL_STORAGE to clean up any data left over from the sandbox:
$ aws s3 rm s3://path-to-communal/metadata/sandbox_name --recursive
SELECT CLEAN_COMMUNAL_STORAGE('true');
CLEAN_COMMUNAL_STORAGE
-----------------------------------------------------------------
CLEAN COMMUNAL STORAGE
Total leaked files: 143
Files have been queued for deletion.
Check communal_cleanup_records for more information.
(1 row)