Dropping partitions
Use the DROP_PARTITIONS function to drop one or more partition keys for a given table. You can specify a single partition key or a range of partition keys.
For example, the table shown in Partitioning a new table is partitioned by column order_date
:
=> CREATE TABLE public.store_orders
(
order_no int,
order_date timestamp NOT NULL,
shipper varchar(20),
ship_date date
)
PARTITION BY YEAR(order_date);
Given this table definition, Vertica creates a partition key for each unique order_date
year—in this case, 2017, 2016, 2015, and 2014—and divides the data into separate ROS containers accordingly.
The following DROP_PARTITIONS statement drops from table store_orders
all order records associated with partition key 2014:
=> SELECT DROP_PARTITIONS ('store_orders', 2014, 2014);
Partition dropped
Splitting partition groups
If a table partition clause includes a GROUP BY clause, partitions are consolidated in the ROS by their partition group keys. DROP_PARTITIONS can then specify a range of partition keys within a given partition group, or across multiple partition groups. In either case, the drop operation requires Vertica to split the ROS containers that store these partitions. To do so, the function's force_split
parameter must be set to true.
For example, the store_orders
table shown above can be repartitioned with a GROUP BY clause as follows:
=> ALTER TABLE store_orders
PARTITION BY order_date::DATE GROUP BY DATE_TRUNC('year', (order_date)::DATE) REORGANIZE;
With all 2014 order records having been dropped earlier, order_date
values now span three years—2017, 2016, and 2015. Accordingly, the Tuple Mover creates three partition group keys for each year, and designates one or more ROS containers for each group. It then merges store_orders
partitions into the appropriate groups.
The following DROP_PARTITIONS statement specifies to drop order dates that span two years, 2014 and 2015:
=> SELECT DROP_PARTITIONS('store_orders', '2015-05-30', '2016-01-16', 'true');
Partition dropped
The drop operation requires Vertica to drop partitions from two partition groups—2015 and 2016. These groups span at least two ROS containers, which must be split in order to remove the target partitions. Accordingly, the function's force_split
parameter is set to true.
Scheduling partition drops
If your hardware has fixed disk space, you might need to configure a regular process to roll out old data by dropping partitions.
For example, if you have only enough space to store data for a fixed number of days, configure Vertica to drop the oldest partition keys. To do so, create a time-based job scheduler such as cron
to schedule dropping the partition keys during low-load periods.
If the ingest rate for data has peaks and valleys, you can use two techniques to manage how you drop partition keys:
- Set up a process to check the disk space on a regular (daily) basis. If the percentage of used disk space exceeds a certain threshold—for example, 80%—drop the oldest partition keys.
- Add an artificial column in a partition that increments based on a metric like row count. For example, that column might increment each time the row count increases by 100 rows. Set up a process that queries this column on a regular (daily) basis. If the value in the new column exceeds a certain threshold—for example, 100—drop the oldest partition keys, and set the column value back to 0.
Table locking
DROP_PARTITIONS requires an exclusive D lock on the target table. This lock is only compatible with I-lock operations, so only table load operations such as INSERT and COPY are allowed during drop partition operations.