Managing partitions
You can manage partitions with the following operations:
- 1: Dropping partitions
- 2: Archiving partitions
- 3: Swapping partitions
- 4: Minimizing partitions
- 5: Viewing partition storage data
1 - Dropping partitions
Use the DROP_PARTITIONS function to drop one or more partition keys for a given table. You can specify a single partition key or a range of partition keys.
For example, the table shown in Partitioning a new table is partitioned by column order_date:
=> CREATE TABLE public.store_orders
(
order_no int,
order_date timestamp NOT NULL,
shipper varchar(20),
ship_date date
)
PARTITION BY YEAR(order_date);
Given this table definition, Vertica creates a partition key for each unique order_date year—in this case, 2017, 2016, 2015, and 2014—and divides the data into separate ROS containers accordingly.
The following DROP_PARTITIONS statement drops from table store_orders all order records associated with partition key 2014:
=> SELECT DROP_PARTITIONS ('store_orders', 2014, 2014);
Partition dropped
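You can also pass a range of keys. The following call is shown for illustration only (the remaining examples assume that just the 2014 partition was dropped); it would roll out both 2014 and 2015 in one operation:
=> SELECT DROP_PARTITIONS ('store_orders', 2014, 2015);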
Splitting partition groups
If a table partition clause includes a GROUP BY clause, partitions are consolidated in the ROS by their partition group keys. DROP_PARTITIONS can then specify a range of partition keys within a given partition group, or across multiple partition groups. In either case, the drop operation requires Vertica to split the ROS containers that store these partitions. To do so, the function's force_split parameter must be set to true.
For example, the store_orders table shown above can be repartitioned with a GROUP BY clause as follows:
=> ALTER TABLE store_orders
PARTITION BY order_date::DATE GROUP BY DATE_TRUNC('year', (order_date)::DATE) REORGANIZE;
With all 2014 order records having been dropped earlier, order_date values now span three years—2017, 2016, and 2015. Accordingly, the Tuple Mover creates three partition group keys, one for each year, and designates one or more ROS containers for each group. It then merges store_orders partitions into the appropriate groups.
The following DROP_PARTITIONS statement specifies to drop order dates that span two years, 2015 and 2016:
=> SELECT DROP_PARTITIONS('store_orders', '2015-05-30', '2016-01-16', 'true');
Partition dropped
The drop operation requires Vertica to drop partitions from two partition groups—2015 and 2016. These groups span at least two ROS containers, which must be split in order to remove the target partitions. Accordingly, the function's force_split parameter is set to true.
Scheduling partition drops
If your hardware has fixed disk space, you might need to configure a regular process to roll out old data by dropping partitions.
For example, if you have only enough space to store data for a fixed number of days, configure Vertica to drop the oldest partition keys. To do so, create a time-based job scheduler such as cron to schedule dropping the partition keys during low-load periods.
If the ingest rate for data has peaks and valleys, you can use two techniques to manage how you drop partition keys (a sketch follows this list):
- Set up a process to check the disk space on a regular (daily) basis. If the percentage of used disk space exceeds a certain threshold—for example, 80%—drop the oldest partition keys.
- Add an artificial column in a partition that increments based on a metric like row count. For example, that column might increment each time the row count increases by 100 rows. Set up a process that queries this column on a regular (daily) basis. If the value in the new column exceeds a certain threshold—for example, 100—drop the oldest partition keys, and set the column value back to 0.
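For example, a cron-driven script might run statements like the following through vsql during a low-load period. This is a minimal sketch, assuming the original store_orders table partitioned by year; the PARTITIONS system table stores partition keys as strings, so MIN compares them lexically, which orders four-digit years correctly:
=> SELECT MIN(partition_key) FROM PARTITIONS
   WHERE projection_name ILIKE 'store_orders%';
-- Suppose the query returns 2014; the script then drops that key:
=> SELECT DROP_PARTITIONS ('store_orders', 2014, 2014);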
Table locking
DROP_PARTITIONS acquires an exclusive O lock on the target table to block any DML operation (DELETE, UPDATE, INSERT, or COPY) that might affect table data. The lock also blocks SELECT statements that are issued at SERIALIZABLE isolation level.
If the operation cannot obtain an O lock on the target table, Vertica tries to close any internal Tuple Mover sessions that are running on that table. If successful, the operation can proceed. Explicit Tuple Mover operations that are running in user sessions do not close. If an explicit Tuple Mover operation is running on the table, the drop operation proceeds only when the Tuple Mover operation is complete.
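While a drop is in progress, you can observe the lock from another session by querying the LOCKS system table. A minimal sketch, assuming the store_orders table from the earlier examples:
=> SELECT object_name, lock_mode FROM LOCKS
   WHERE object_name ILIKE '%store_orders%';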
2 - Archiving partitions
You can move partitions from one table to another with the Vertica function MOVE_PARTITIONS_TO_TABLE. This function is useful for archiving old partitions, as part of the following procedure:
- Identify the partitions to archive, and move them to a temporary staging table with MOVE_PARTITIONS_TO_TABLE.
- Back up the staging table.
- Drop the staging table.
You can restore archived partitions at any time.
Move partitions to staging tables
You archive historical data by identifying the partitions you wish to remove from a table. You then move each partition (or group of partitions) to a temporary staging table.
Before calling MOVE_PARTITIONS_TO_TABLE:
- Refresh all out-of-date projections.
The following recommendations apply to staging tables:
- To facilitate the backup process, create a unique schema for the staging table of each archiving operation.
- Specify new names for staging tables. This ensures that they do not contain partitions from previous move operations. If the table does not exist, Vertica creates a table from the source table's definition, by calling CREATE TABLE with the LIKE and INCLUDING PROJECTIONS clauses. The new table inherits ownership from the source table. For details, see Replicating a table.
- Use staging names that enable other users to easily identify partition contents. For example, if a table is partitioned by dates, use a name that specifies a date or date range.
In the following example, MOVE_PARTITIONS_TO_TABLE specifies to move a single partition to the staging table partn_backup.trades_200801.
=> SELECT MOVE_PARTITIONS_TO_TABLE (
'prod_trades',
'200801',
'200801',
'partn_backup.trades_200801');
MOVE_PARTITIONS_TO_TABLE
-------------------------------------------------
1 distinct partition values moved at epoch 15.
(1 row)
Back up the staging table
After you create a staging table, you archive it through an object-level backup using a vbr configuration file. For detailed information, see Backing up and restoring the database.
Important
Vertica recommends performing a full database backup before the object-level backup, as a precaution against data loss. You can only restore object-level backups to the original database.
Drop the staging tables
After the backup is complete, you can drop the staging table as described in Dropping tables.
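For example, to drop the staging table from the earlier move example once its backup has completed:
=> DROP TABLE partn_backup.trades_200801;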
Restoring archived partitions
You can restore partitions that you previously moved to an intermediate table, archived as an object-level backup, and then dropped.
Note
Restoring an archived partition requires that the original table definition is unchanged since the partition was archived and dropped. If the table definition changed, you can restore an archived partition with INSERT...SELECT statements, which are not described here.
These are the steps to restore archived partitions:
- Restore the backup of the intermediate table you saved when you moved one or more partitions to archive (see Archiving partitions).
- Move the restored partitions from the intermediate table to the original table (see the sketch after these steps).
- Drop the intermediate table.
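A minimal sketch of the last two steps, assuming the prod_trades example from earlier and that partn_backup.trades_200801 has already been restored from its object-level backup:
=> SELECT MOVE_PARTITIONS_TO_TABLE (
    'partn_backup.trades_200801',
    '200801',
    '200801',
    'prod_trades');
=> DROP TABLE partn_backup.trades_200801;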
3 - Swapping partitions
SWAP_PARTITIONS_BETWEEN_TABLES combines the operations of DROP_PARTITIONS and MOVE_PARTITIONS_TO_TABLE as a single transaction. SWAP_PARTITIONS_BETWEEN_TABLES is useful if you regularly load partitioned data from one table into another and need to refresh partitions in the second table.
For example, you might have a table of revenue that is partitioned by date, and you routinely move data into it from a staging table. Occasionally, the staging table contains data for dates that are already in the target table. In this case, you must first remove partitions from the target table for those dates, then replace them with the corresponding partitions from the staging table. You can accomplish both tasks with a single call to SWAP_PARTITIONS_BETWEEN_TABLES.
By wrapping the drop and move operations within a single transaction, SWAP_PARTITIONS_BETWEEN_TABLES maintains integrity of the swapped data. If any task in the swap operation fails, the entire operation fails and is rolled back.
Example
The following example creates two partitioned tables and then swaps certain partitions between them.
Both tables have the same definition and have partitions for various year values. You swap the partitions where year = 2008 and year = 2009. Both tables have at least two rows to swap.
- Create the customer_info table:
=> CREATE TABLE customer_info (
    customer_id INT NOT NULL,
    first_name VARCHAR(25),
    last_name VARCHAR(35),
    city VARCHAR(25),
    year INT NOT NULL)
    ORDER BY last_name
    PARTITION BY year;
- Insert data into the customer_info table:
=> COPY customer_info FROM STDIN;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1|Joe|Smith|Denver|2008
>> 2|Bob|Jones|Boston|2008
>> 3|Silke|Muller|Frankfurt|2007
>> 4|Simone|Bernard|Paris|2014
>> 5|Vijay|Kumar|New Delhi|2010
>> \.
- View the table data:
=> SELECT * FROM customer_info ORDER BY year DESC;
 customer_id | first_name | last_name |   city    | year
-------------+------------+-----------+-----------+------
           4 | Simone     | Bernard   | Paris     | 2014
           5 | Vijay      | Kumar     | New Delhi | 2010
           1 | Joe        | Smith     | Denver    | 2008
           2 | Bob        | Jones     | Boston    | 2008
           3 | Silke      | Muller    | Frankfurt | 2007
(5 rows)
- Create a second table, member_info, that has the same definition as customer_info:
=> CREATE TABLE member_info LIKE customer_info INCLUDING PROJECTIONS;
CREATE TABLE
- Insert data into the member_info table:
=> COPY member_info FROM STDIN;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1|Jane|Doe|Miami|2001
>> 2|Mike|Brown|Chicago|2014
>> 3|Patrick|OMalley|Dublin|2008
>> 4|Ana|Lopez|Madrid|2009
>> 5|Mike|Green|New York|2008
>> \.
- View the data in the member_info table:
=> SELECT * FROM member_info ORDER BY year DESC;
 customer_id | first_name | last_name |   city   | year
-------------+------------+-----------+----------+------
           2 | Mike       | Brown     | Chicago  | 2014
           4 | Ana        | Lopez     | Madrid   | 2009
           3 | Patrick    | OMalley   | Dublin   | 2008
           5 | Mike       | Green     | New York | 2008
           1 | Jane       | Doe       | Miami    | 2001
(5 rows)
- To swap the partitions, run the SWAP_PARTITIONS_BETWEEN_TABLES function:
=> SELECT SWAP_PARTITIONS_BETWEEN_TABLES('customer_info', 2008, 2009, 'member_info');
                                 SWAP_PARTITIONS_BETWEEN_TABLES
----------------------------------------------------------------------------------------------
 1 partition values from table customer_info and 2 partition values from table member_info are swapped at epoch 1045.
(1 row)
- Query both tables to confirm that they swapped their respective 2008 and 2009 records:
=> SELECT * FROM customer_info ORDER BY year DESC;
 customer_id | first_name | last_name |   city    | year
-------------+------------+-----------+-----------+------
           4 | Simone     | Bernard   | Paris     | 2014
           5 | Vijay      | Kumar     | New Delhi | 2010
           4 | Ana        | Lopez     | Madrid    | 2009
           3 | Patrick    | OMalley   | Dublin    | 2008
           5 | Mike       | Green     | New York  | 2008
           3 | Silke      | Muller    | Frankfurt | 2007
(6 rows)

=> SELECT * FROM member_info ORDER BY year DESC;
 customer_id | first_name | last_name |  city   | year
-------------+------------+-----------+---------+------
           2 | Mike       | Brown     | Chicago | 2014
           2 | Bob        | Jones     | Boston  | 2008
           1 | Joe        | Smith     | Denver  | 2008
           1 | Jane       | Doe       | Miami   | 2001
(4 rows)
4 - Minimizing partitions
By default, Vertica supports up to 1024 ROS containers to store partitions for a given projection (see Projection parameters). A ROS container contains data that share the same partition key, or the same partition group key. Depending on the amount of data per partition, a partition or partition group can span multiple ROS containers.
Given this limit, it is inadvisable to partition a table on highly granular data—for example, on a TIMESTAMP column. Doing so can generate a very high number of partitions. If the number of partitions requires more than 1024 ROS containers, Vertica issues a ROS pushback warning and refuses to load more table data. A large number of ROS containers also can adversely affect DML operations such as DELETE, which requires Vertica to open all ROS containers.
In practice, it is unlikely you will approach this maximum. For optimal performance, Vertica recommends that the number of ungrouped partitions range between 10 and 20, and not exceed 50. This range is typically compatible with most business requirements.
You can also reduce the number of ROS containers by grouping partitions. For more information, see Partition grouping and Hierarchical partitioning.
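For example, hierarchical partitioning uses the CALENDAR_HIERARCHY_DAY function to keep recent partitions at day granularity while grouping older ones by month and then year. A minimal sketch, assuming the store_orders table from the earlier examples (the trailing arguments set how many months and years of partitions remain at finer granularity, here 2 and 2):
=> ALTER TABLE store_orders
   PARTITION BY order_date::DATE
   GROUP BY CALENDAR_HIERARCHY_DAY(order_date::DATE, 2, 2) REORGANIZE;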
5 - Viewing partition storage data
Vertica provides various ways to view how your table partitions are organized and stored:
- Query the PARTITIONS system table.
- Dump partition keys.
Querying PARTITIONS table
The following table and projection definitions partition store_orders data on order dates, and group together partitions of the same year:
=> CREATE TABLE public.store_orders
(order_no int, order_date timestamp NOT NULL, shipper varchar(20), ship_date date)
PARTITION BY ((order_date)::date) GROUP BY (date_trunc('year', (order_date)::date));
=> CREATE PROJECTION public.store_orders_super
AS SELECT order_no, order_date, shipper, ship_date FROM store_orders
ORDER BY order_no, order_date, shipper, ship_date UNSEGMENTED ALL NODES;
=> COPY store_orders FROM '/home/dbadmin/export_store_orders_data.txt';
After loading data into this table, you can query the PARTITIONS table to determine how many ROS containers store the grouped partitions for projection store_orders_super, across all nodes. Each node has four ROS containers, each container storing partitions of one partition group:
=> SELECT COUNT (partition_key) NumPartitions, ros_id, node_name FROM PARTITIONS
WHERE projection_name ilike 'store_orders%' GROUP BY ros_id, node_name ORDER BY node_name, NumPartitions;
NumPartitions | ros_id | node_name
---------------+-------------------+------------------
173 | 45035996274562779 | v_vmart_node0001
211 | 45035996274562791 | v_vmart_node0001
212 | 45035996274562783 | v_vmart_node0001
213 | 45035996274562787 | v_vmart_node0001
173 | 49539595901916471 | v_vmart_node0002
211 | 49539595901916483 | v_vmart_node0002
212 | 49539595901916475 | v_vmart_node0002
213 | 49539595901916479 | v_vmart_node0002
173 | 54043195529286985 | v_vmart_node0003
211 | 54043195529286997 | v_vmart_node0003
212 | 54043195529286989 | v_vmart_node0003
213 | 54043195529286993 | v_vmart_node0003
(12 rows)
Dumping partition keys
Vertica provides several functions that let you inspect how individual partitions are stored on the cluster, at several levels:
- DUMP_PARTITION_KEYS dumps partition keys of all projections in the system.
- DUMP_TABLE_PARTITION_KEYS dumps partition keys of all projections for the specified table.
- DUMP_PROJECTION_PARTITION_KEYS dumps partition keys of the specified projection.
Given the previous table and projection, DUMP_PROJECTION_PARTITION_KEYS shows the contents of four ROS containers on each node:
=> SELECT DUMP_PROJECTION_PARTITION_KEYS('store_orders_super');
...
Partition keys on node v_vmart_node0001
Projection 'store_orders_super'
Storage [ROS container]
No of partition keys: 173
Partition keys: 2017-01-02 2017-01-03 2017-01-04 2017-01-05 2017-01-06 2017-01-09 2017-01-10
2017-01-11 2017-01-12 2017-01-13 2017-01-16 2017-01-17 2017-01-18 2017-01-19 2017-01-20 2017-01-23
2017-01-24 2017-01-25 2017-01-26 2017-01-27 2017-02-01 2017-02-02 2017-02-03 2017-02-06 2017-02-07
2017-02-08 2017-02-09 2017-02-10 2017-02-13 2017-02-14 2017-02-15 2017-02-16 2017-02-17 2017-02-20
...
2017-09-01 2017-09-04 2017-09-05 2017-09-06 2017-09-07 2017-09-08 2017-09-11 2017-09-12 2017-09-13
2017-09-14 2017-09-15 2017-09-18 2017-09-19 2017-09-20 2017-09-21 2017-09-22 2017-09-25 2017-09-26 2017-09-27
Storage [ROS container]
No of partition keys: 212
Partition keys: 2016-01-01 2016-01-04 2016-01-05 2016-01-06 2016-01-07 2016-01-08 2016-01-11
2016-01-12 2016-01-13 2016-01-14 2016-01-15 2016-01-18 2016-01-19 2016-01-20 2016-01-21 2016-01-22
2016-01-25 2016-01-26 2016-01-27 2016-02-01 2016-02-02 2016-02-03 2016-02-04 2016-02-05 2016-02-08
2016-02-09 2016-02-10 2016-02-11 2016-02-12 2016-02-15 2016-02-16 2016-02-17 2016-02-18 2016-02-19
...
2016-11-01 2016-11-02 2016-11-03 2016-11-04 2016-11-07 2016-11-08 2016-11-09 2016-11-10 2016-11-11
2016-11-14 2016-11-15 2016-11-16 2016-11-17 2016-11-18 2016-11-21 2016-11-22 2016-11-23 2016-11-24 2016-11-25
Storage [ROS container]
No of partition keys: 213
Partition keys: 2015-01-01 2015-01-02 2015-01-05 2015-01-06 2015-01-07 2015-01-08 2015-01-09
2015-01-12 2015-01-13 2015-01-14 2015-01-15 2015-01-16 2015-01-19 2015-01-20 2015-01-21 2015-01-22
2015-01-23 2015-01-26 2015-01-27 2015-02-02 2015-02-03 2015-02-04 2015-02-05 2015-02-06 2015-02-09
2015-02-10 2015-02-11 2015-02-12 2015-02-13 2015-02-16 2015-02-17 2015-02-18 2015-02-19 2015-02-20
...
2015-11-02 2015-11-03 2015-11-04 2015-11-05 2015-11-06 2015-11-09 2015-11-10 2015-11-11 2015-11-12
2015-11-13 2015-11-16 2015-11-17 2015-11-18 2015-11-19 2015-11-20 2015-11-23 2015-11-24 2015-11-25
2015-11-26 2015-11-27
Storage [ROS container]
No of partition keys: 211
Partition keys: 2014-01-01 2014-01-02 2014-01-03 2014-01-06 2014-01-07 2014-01-08 2014-01-09
2014-01-10 2014-01-13 2014-01-14 2014-01-15 2014-01-16 2014-01-17 2014-01-20 2014-01-21 2014-01-22
2014-01-23 2014-01-24 2014-01-27 2014-02-03 2014-02-04 2014-02-05 2014-02-06 2014-02-07 2014-02-10
2014-02-11 2014-02-12 2014-02-13 2014-02-14 2014-02-17 2014-02-18 2014-02-19 2014-02-20 2014-02-21
...
2014-11-04 2014-11-05 2014-11-06 2014-11-07 2014-11-10 2014-11-11 2014-11-12 2014-11-13 2014-11-14
2014-11-17 2014-11-18 2014-11-19 2014-11-20 2014-11-21 2014-11-24 2014-11-25 2014-11-26 2014-11-27
Storage [ROS container]
No of partition keys: 173
...