1 - Partitioning a new table
Use CREATE TABLE to partition a new table, as specified by the PARTITION BY clause:.
Use CREATE TABLE to partition a new table, as specified by the PARTITION BY clause:
CREATE TABLE table-name... PARTITION BY partition-expression [ GROUP BY group-expression ] [ REORGANIZE ];
The following statements create the store_orders
table and load data into it. The CREATE TABLE statement includes a simple partition clause that specifies to partition data by year:
=> CREATE TABLE public.store_orders
(
order_no int,
order_date timestamp NOT NULL,
shipper varchar(20),
ship_date date
)
UNSEGMENTED ALL NODES
PARTITION BY YEAR(order_date);
CREATE TABLE
=> COPY store_orders FROM '/home/dbadmin/export_store_orders_data.txt';
41834
As COPY loads the new table data into ROS storage, the Tuple Mover executes the table's partition clause by dividing orders for each year into separate partitions, and consolidating these partitions in ROS containers.
In this case, the Tuple Mover creates four partition keys for the loaded data—2017, 2016, 2015, and 2014—and divides the data into separate ROS containers accordingly:
=> SELECT dump_table_partition_keys('store_orders');
... Partition keys on node v_vmart_node0001
Projection 'store_orders_super'
Storage [ROS container]
No of partition keys: 1
Partition keys: 2017
Storage [ROS container]
No of partition keys: 1
Partition keys: 2016
Storage [ROS container]
No of partition keys: 1
Partition keys: 2015
Storage [ROS container]
No of partition keys: 1
Partition keys: 2014
Partition keys on node v_vmart_node0002
Projection 'store_orders_super'
Storage [ROS container]
No of partition keys: 1
Partition keys: 2017
...
(1 row)
As new data is loaded into store_orders
, the Tuple Mover merges it into the appropriate partitions, creating partition keys as needed for new years.
2 - Partitioning existing table data
Use ALTER TABLE to partition or repartition an existing table, as specified by the PARTITION BY clause:.
Use ALTER TABLE to partition or repartition an existing table, as specified by the PARTITION BY
clause:
ALTER TABLE table-name PARTITION BY partition-expression [ GROUP BY group-expression ] [ REORGANIZE ];
For example, you might repartition the store_orders
table, defined earlier. The following ALTER TABLE divides all store_orders
data into monthly partitions for each year, each partition key identifying the order date year and month:
=> ALTER TABLE store_orders
PARTITION BY EXTRACT(YEAR FROM order_date)*100 + EXTRACT(MONTH FROM order_date)
GROUP BY EXTRACT(YEAR from order_date)*100 + EXTRACT(MONTH FROM order_date);
NOTICE 8364: The new partitioning scheme will produce partitions in 42 physical storage containers per projection
WARNING 6100: Using PARTITION expression that returns a Numeric value
HINT: This PARTITION expression may cause too many data partitions. Use of an expression that returns a more accurate value, such as a regular VARCHAR or INT, is encouraged
WARNING 4493: Queries using table "store_orders" may not perform optimally since the data may not be repartitioned in accordance with the new partition expression
HINT: Use "ALTER TABLE public.store_orders REORGANIZE;" to repartition the data
After executing this statement, Vertica drops existing partition keys. However, the partition clause omits REORGANIZE, so existing data remains stored according to the previous partition clause. This can put table partitioning in an inconsistent state and adversely affect query performance, DROP_PARTITIONS, and node recovery. In this case, you must explicitly request Vertica to reorganize existing data into new partitions, in one of the following ways:
-
Issue ALTER TABLE...REORGANIZE:
ALTER TABLE table-name REORGANIZE;
-
Call the Vertica meta-function PARTITION_TABLE.
For example:
=> ALTER TABLE store_orders REORGANIZE;
NOTICE 4785: Started background repartition table task
ALTER TABLE
ALTER TABLE...REORGANIZE and PARTITION_TABLE operate identically: both split any ROS containers where partition keys do not conform with the new partition clause. On executing its next mergeout, the Tuple Mover merges partitions into the appropriate ROS containers.
3 - Partition grouping
Partition groups consolidate partitions into logical subsets that minimize use of ROS storage.
Partition groups consolidate partitions into logical subsets that minimize use of ROS storage. Reducing the number of ROS containers to store partitioned data helps facilitate DML operations such as DELETE
and UPDATE
, and avoid ROS pushback. For example, you can group date partitions by year. By doing so, the Tuple Mover allocates ROS containers for each year group, and merges individual partitions into these ROS containers accordingly.
Creating partition groups
You create partition groups by qualifying the PARTITION BY
clause with a GROUP BY
clause:
ALTER TABLE table-name PARTITION BY partition-expression [ GROUP BY group-expression ]
The GROUP BY
clause specifies how to consolidate partition keys into groups, where each group is identified by a unique partition group key. For example, the following
ALTER TABLE
statement specifies to repartition the store_orders
table (shown in Partitioning a new table) by order dates, grouping partition keys by year. The group expression—DATE_TRUNC('year', (order_date)::DATE)
—uses the partition expression order_date::DATE
to generate partition group keys:
=> ALTER TABLE store_orders
PARTITION BY order_date::DATE GROUP BY DATE_TRUNC('year', (order_date)::DATE) REORGANIZE;
NOTICE 8364: The new partitioning scheme will produce partitions in 4 physical storage containers per projection
NOTICE 4785: Started background repartition table task
In this case, the order_date
column dates span four years. The Tuple Mover creates four partition group keys, and merges store_orders
partitions into group-specific ROS storage containers accordingly:
=> SELECT DUMP_TABLE_PARTITION_KEYS('store_orders');
...
Partition keys on node v_vmart_node0001
Projection 'store_orders_super'
Storage [ROS container]
No of partition keys: 173
Partition keys: 2017-01-02 2017-01-03 2017-01-04 ... 2017-09-25 2017-09-26 2017-09-27
Storage [ROS container]
No of partition keys: 212
Partition keys: 2016-01-01 2016-01-04 2016-01-05 ... 2016-11-23 2016-11-24 2016-11-25
Storage [ROS container]
No of partition keys: 213
Partition keys: 2015-01-01 2015-01-02 2015-01-05 ... 2015-11-23 2015-11-24 2015-11-25
2015-11-26 2015-11-27
Storage [ROS container]
No of partition keys: 211
Partition keys: 2014-01-01 2014-01-02 2014-01-03 ... 2014-11-25 2014-11-26 2014-11-27
Projection 'store_orders_super'
Storage [ROS container]
No of partition keys: 173
...
Caution
This example demonstrates how partition grouping can facilitate more efficient use of ROS storage. However, grouping all partitions into several large and static ROS containers can adversely affect performance, especially for a table that is subject to frequent DML operations. Frequent load operations in particular can incur considerable merge overhead, which, in turn, reduces performance.
Vertica recommends that you use
CALENDAR_HIERARCHY_DAY
, as a partition clause's group expression. This function automatically groups DATE
partition keys into a dynamic hierarchy of years, months, and days. Doing so helps minimize merge-related issues. For details, see Hierarchical partitioning.
Managing partitions within groups
You can use various partition management functions, such as
DROP_PARTITIONS
or
MOVE_PARTITIONS_TO_TABLE
, to target a range of order dates within a given partition group, or across multiple partition groups. In the previous example, each group contains partition keys of different dates within a given year. You can use DROP_PARTITIONS
to drop order dates that span two years, 2014 and 2015:
=> SELECT DROP_PARTITIONS('store_orders', '2014-05-30', '2015-01-15', 'true');