<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Vertica Documentation – Partitioning tables</title>
    <link>/en/admin/partitioning-tables/</link>
    <description>Recent content in Partitioning tables on Vertica Documentation</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/admin/partitioning-tables/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Admin: Defining partitions</title>
      <link>/en/admin/partitioning-tables/defining-partitions/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/partitioning-tables/defining-partitions/</guid>
      <description>
        
        
        &lt;p&gt;You can specify partitioning for new and existing tables:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/admin/partitioning-tables/defining-partitions/partitioning-new-table/&#34;&gt;Define partitioning for a table&lt;/a&gt; with &lt;a href=&#34;../../../en/sql-reference/statements/create-statements/create-table/&#34;&gt;CREATE TABLE&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/admin/partitioning-tables/defining-partitions/partitioning-existing-table-data/&#34;&gt;Specify partitioning for an existing table&lt;/a&gt; by modifying its definition with &lt;a href=&#34;../../../en/sql-reference/statements/alter-statements/alter-table/&#34;&gt;ALTER TABLE&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/admin/partitioning-tables/defining-partitions/partition-grouping/&#34;&gt;Create partition groups to consolidate partitions into logical subsets&lt;/a&gt;, minimizing the use of ROS storage.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Admin: Hierarchical partitioning</title>
      <link>/en/admin/partitioning-tables/hierarchical-partitioning/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/partitioning-tables/hierarchical-partitioning/</guid>
      <description>
        
        
        &lt;p&gt;The meta-function &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/partition-functions/calendar-hierarchy-day/&#34;&gt;CALENDAR_HIERARCHY_DAY&lt;/a&gt; leverages &lt;a href=&#34;../../../en/admin/partitioning-tables/defining-partitions/partition-grouping/&#34;&gt;partition grouping&lt;/a&gt;. You specify this function as the partitioning &lt;code&gt;GROUP BY&lt;/code&gt; expression. CALENDAR_HIERARCHY_DAY organizes a table&#39;s date partitions into a hierarchy of groups: the oldest date partitions are grouped by year, more recent partitions are grouped by month, and the most recent date partitions remain ungrouped. Grouping is &lt;a href=&#34;#Dynamic&#34;&gt;dynamic&lt;/a&gt;: as recent data ages, the Tuple Mover merges its partitions into month groups, and eventually into year groups.&lt;/p&gt;
&lt;h2 id=&#34;managing-timestamped-data&#34;&gt;Managing timestamped data&lt;/h2&gt;
&lt;p&gt;Partition consolidation strategies are especially important for managing timestamped data, where the number of partitions can quickly escalate and risk ROS pushback. For example, the following statements create the &lt;code&gt;store_orders&lt;/code&gt; table and load data into it. The CREATE TABLE statement includes a simple &lt;a href=&#34;../../../en/sql-reference/statements/create-statements/create-table/partition-clause/&#34;&gt;partition clause&lt;/a&gt; that partitions data by date:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; DROP TABLE IF EXISTS public.store_orders CASCADE;
=&amp;gt; CREATE TABLE public.store_orders
(
    order_no int,
    order_date timestamp NOT NULL,
    shipper varchar(20),
    ship_date date
)
UNSEGMENTED ALL NODES PARTITION BY order_date::DATE;
CREATE TABLE
=&amp;gt; COPY store_orders FROM &amp;#39;/home/dbadmin/export_store_orders_data.txt&amp;#39;;
41834
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As COPY loads the new table data into ROS storage, it executes this table&#39;s partition clause by dividing daily orders into separate partitions—in this case, 809 partitions, where each partition requires its own ROS container:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT COUNT (DISTINCT ros_id) NumROS, node_name FROM PARTITIONS
    WHERE projection_name ilike &amp;#39;%store_orders_super%&amp;#39; GROUP BY node_name ORDER BY node_name;
 NumROS |    node_name
--------+------------------
    809 | v_vmart_node0001
    809 | v_vmart_node0002
    809 | v_vmart_node0003
(3 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;img src=&#34;../../../images/partitioning/partiton-by-date-ungrouped.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;This is far above the recommended maximum of 50 partitions per projection. This number is also close to the default system limit of 1024 ROS containers per projection, risking ROS pushback in the near future.&lt;/p&gt;
&lt;p&gt;You can approach this problem in several ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Consider consolidating table data into larger partitions—for example, partition by month instead of day. However, partitioning data at this level might limit effective use of &lt;a href=&#34;../../../en/admin/partitioning-tables/managing-partitions/&#34;&gt;partition management functions&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Regularly &lt;a href=&#34;../../../en/admin/partitioning-tables/managing-partitions/archiving-partitions/&#34;&gt;archive older partitions&lt;/a&gt;, and thereby minimize the number of accumulated partitions. However, this requires an extra layer of data management, and also inhibits access to historical data.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Alternatively, you can use CALENDAR_HIERARCHY_DAY to automatically merge partitions into a date-based hierarchy of partition groups. Each partition group is stored in its own set of ROS containers, apart from other groups. You specify this function in the table partition clause as follows:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;PARTITION BY &lt;span class=&#34;code-variable&#34;&gt;partition‑expression&lt;/span&gt;
  GROUP BY CALENDAR_HIERARCHY_DAY( &lt;span class=&#34;code-variable&#34;&gt;partition‑expression&lt;/span&gt; [, &lt;span class=&#34;code-variable&#34;&gt;active‑months&lt;/span&gt;[, &lt;span class=&#34;code-variable&#34;&gt;active‑years&lt;/span&gt;] ] )
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;admonition important&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Important&lt;/h4&gt;
&lt;p&gt;Two requirements apply to using CALENDAR_HIERARCHY_DAY in a partition clause:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;&lt;code&gt;partition‑expression&lt;/code&gt;&lt;/em&gt; must be a &lt;a href=&#34;../../../en/sql-reference/data-types/datetime-data-types/date/&#34;&gt;DATE&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The partition expressions specified by the &lt;code&gt;PARTITION BY&lt;/code&gt; clause and CALENDAR_HIERARCHY_DAY must be identical.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;/div&gt;
&lt;p&gt;For example, given the previous table, you can repartition it as follows:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; ALTER TABLE public.store_orders
      PARTITION BY order_date::DATE
      GROUP BY CALENDAR_HIERARCHY_DAY(order_date::DATE, 2, 2) REORGANIZE;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a name=&#34;GroupingDateDataHierarchically&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;grouping-date-data-hierarchically&#34;&gt;Grouping DATE data hierarchically&lt;/h2&gt;
&lt;p&gt;CALENDAR_HIERARCHY_DAY creates hierarchies of partition groups, and merges partitions into the appropriate groups. It does so by evaluating the partition expression of each table row with the following algorithm, to determine its partition group key:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;GROUP BY (
CASE WHEN DATEDIFF(&amp;#39;YEAR&amp;#39;, &lt;span class=&#34;code-variable&#34;&gt;partition-expression&lt;/span&gt;, NOW()::TIMESTAMPTZ(6)) &amp;gt;= &lt;span class=&#34;code-variable&#34;&gt;active-years&lt;/span&gt;
       THEN DATE_TRUNC(&amp;#39;YEAR&amp;#39;, &lt;span class=&#34;code-variable&#34;&gt;partition-expression&lt;/span&gt;::DATE)
     WHEN DATEDIFF(&amp;#39;MONTH&amp;#39;, &lt;span class=&#34;code-variable&#34;&gt;partition-expression&lt;/span&gt;, NOW()::TIMESTAMPTZ(6)) &amp;gt;= &lt;span class=&#34;code-variable&#34;&gt;active-months&lt;/span&gt;
       THEN DATE_TRUNC(&amp;#39;MONTH&amp;#39;, &lt;span class=&#34;code-variable&#34;&gt;partition-expression&lt;/span&gt;::DATE)
     ELSE DATE_TRUNC(&amp;#39;DAY&amp;#39;, &lt;span class=&#34;code-variable&#34;&gt;partition-expression&lt;/span&gt;::DATE) END);
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In this example, the algorithm compares &lt;code&gt;order_date&lt;/code&gt; in each &lt;code&gt;store_orders&lt;/code&gt; row to the current date as follows:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Determines whether &lt;code&gt;order_date&lt;/code&gt; is in an inactive year.&lt;/p&gt;
&lt;p&gt;If &lt;code&gt;order_date&lt;/code&gt; is in an inactive year, the row&#39;s partition group key resolves to that year. The row is merged into a ROS container for that year.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If &lt;code&gt;order_date&lt;/code&gt; is in an active year, CALENDAR_HIERARCHY_DAY evaluates &lt;code&gt;order_date&lt;/code&gt; to determine whether it is in an inactive month.&lt;/p&gt;
&lt;p&gt;If &lt;code&gt;order_date&lt;/code&gt; is in an inactive month, the row&#39;s partition group key resolves to that month. The row is merged into a ROS container for that month.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If &lt;code&gt;order_date&lt;/code&gt; is in an active month, the row&#39;s partition group key resolves to the &lt;code&gt;order_date&lt;/code&gt; day. This row is merged into a ROS container for that day. Any row where &lt;code&gt;order_date&lt;/code&gt; is a future date is treated in the same way.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
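&lt;p&gt;The steps above can be previewed with an ordinary query. The following is a minimal sketch, not the function&#39;s internal implementation: it assumes the &lt;code&gt;store_orders&lt;/code&gt; table from the earlier example and a value of 2 for both &lt;em&gt;&lt;code&gt;active‑years&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code&gt;active‑months&lt;/code&gt;&lt;/em&gt;. The column aliases &lt;code&gt;partition_key&lt;/code&gt; and &lt;code&gt;group_key&lt;/code&gt; are illustrative only:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT DISTINCT order_date::DATE AS partition_key,
       CASE WHEN DATEDIFF(&amp;#39;YEAR&amp;#39;, order_date::DATE, NOW()::DATE) &amp;gt;= 2
              THEN DATE_TRUNC(&amp;#39;YEAR&amp;#39;, order_date::DATE)::DATE   -- step 1: inactive year
            WHEN DATEDIFF(&amp;#39;MONTH&amp;#39;, order_date::DATE, NOW()::DATE) &amp;gt;= 2
              THEN DATE_TRUNC(&amp;#39;MONTH&amp;#39;, order_date::DATE)::DATE  -- step 2: inactive month
            ELSE order_date::DATE                                -- step 3: active month or future date
       END AS group_key
     FROM public.store_orders
     ORDER BY 1;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Each distinct &lt;code&gt;group_key&lt;/code&gt; value in the result corresponds to one partition group, so counting the distinct keys gives a rough estimate of how many groups the table would have after it is reorganized.&lt;/p&gt;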

&lt;div class=&#34;admonition important&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Important&lt;/h4&gt;
&lt;p&gt;The CALENDAR_HIERARCHY_DAY &lt;a href=&#34;../../../en/admin/partitioning-tables/hierarchical-partitioning/#GroupingDateDataHierarchically&#34;&gt;algorithm&lt;/a&gt; assumes that most table activity is focused on recent dates. Setting &lt;em&gt;&lt;code&gt;active‑years&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code&gt;active‑months&lt;/code&gt;&lt;/em&gt; to a low number ≥ 2 serves to isolate most merge activity to date-specific containers, and incurs minimal overhead. Vertica recommends that you use the default setting of 2 for &lt;em&gt;&lt;code&gt;active‑years&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code&gt;active‑months&lt;/code&gt;&lt;/em&gt;. For most users, these settings achieve an optimal balance between ROS storage and performance.&lt;/p&gt;
&lt;p&gt;As a best practice, never set &lt;em&gt;&lt;code&gt;active‑years&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code&gt;active‑months&lt;/code&gt;&lt;/em&gt; to 0.&lt;/p&gt;

&lt;/div&gt;

&lt;p&gt;For example, if the current date is 2017-09-26, CALENDAR_HIERARCHY_DAY resolves &lt;em&gt;&lt;code&gt;active‑years&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code&gt;active‑months&lt;/code&gt;&lt;/em&gt; to the following date spans:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;&lt;code&gt;active‑years&lt;/code&gt;&lt;/em&gt;: 2016-01-01 to 2017-12-31. Partitions in active years are grouped into monthly ROS containers or are merged into daily ROS containers. Partitions from earlier years are regarded as inactive and merged into yearly ROS containers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;&lt;code&gt;active‑months&lt;/code&gt;&lt;/em&gt;: 2017-08-01 to 2017-09-30. Partitions in active months are merged into daily ROS containers.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
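&lt;p&gt;The boundaries of these spans can be checked by substituting a fixed date for &lt;code&gt;NOW()&lt;/code&gt; in the DATEDIFF comparisons. This is a sketch for illustration: the probe dates and column aliases are assumptions, chosen to sit on either side of each boundary when the current date is 2017-09-26:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT DATEDIFF(&amp;#39;YEAR&amp;#39;,  &amp;#39;2015-12-31&amp;#39;::DATE, &amp;#39;2017-09-26&amp;#39;::DATE) AS last_inactive_year_day,
          DATEDIFF(&amp;#39;YEAR&amp;#39;,  &amp;#39;2016-01-01&amp;#39;::DATE, &amp;#39;2017-09-26&amp;#39;::DATE) AS first_active_year_day,
          DATEDIFF(&amp;#39;MONTH&amp;#39;, &amp;#39;2017-07-31&amp;#39;::DATE, &amp;#39;2017-09-26&amp;#39;::DATE) AS last_inactive_month_day,
          DATEDIFF(&amp;#39;MONTH&amp;#39;, &amp;#39;2017-08-01&amp;#39;::DATE, &amp;#39;2017-09-26&amp;#39;::DATE) AS first_active_month_day;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Given the spans listed above, the first and third values should be at least 2, which marks those dates as inactive, and the second and fourth values should be less than 2, which marks them as active.&lt;/p&gt;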
&lt;p&gt;&lt;img src=&#34;../../../images/partitioning/partiton-by-date-grouped.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Now, the total number of ROS containers is reduced to 40 per projection:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT COUNT (DISTINCT ros_id) NumROS, node_name FROM PARTITIONS
    WHERE projection_name ilike &amp;#39;%store_orders_super%&amp;#39; GROUP BY node_name ORDER BY node_name;
 NumROS |    node_name
--------+------------------
     40 | v_vmart_node0001
     40 | v_vmart_node0002
     40 | v_vmart_node0003
(3 rows)
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

Regardless of how the Tuple Mover groups and merges partitions, it always identifies one or more partitions or partition groups as active. For details, see &lt;a href=&#34;../../../en/admin/partitioning-tables/active-and-inactive-partitions/&#34;&gt;Active and inactive partitions&lt;/a&gt;.

&lt;/div&gt;
&lt;p&gt;&lt;a name=&#34;Dynamic&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;dynamic-regrouping&#34;&gt;Dynamic regrouping&lt;/h2&gt;
&lt;p&gt;As shown earlier, CALENDAR_HIERARCHY_DAY references the current date when it creates partition group keys and merges partitions. As the calendar advances, the Tuple Mover reevaluates the partition group keys of tables that are partitioned with this function, and moves partitions as needed to different ROS containers.&lt;/p&gt;
&lt;p&gt;Thus, given the previous example, on 2017-10-01 the Tuple Mover creates a monthly ROS container for August partitions. All partition keys between 2017-08-01 and 2017-08-31 are merged into the new ROS container 2017-08:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;../../../images/partitioning/partiton-by-date-grouped-new-month.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Likewise, on 2018-01-01, the Tuple Mover creates a ROS container for 2016 partitions. All partition keys between 2016-01-01 and 2016-12-31 that were previously grouped by month are merged into the new yearly ROS container:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;../../../images/partitioning/partiton-by-date-grouped-new-year.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;

&lt;div class=&#34;admonition caution&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Caution&lt;/h4&gt;

After older partitions are grouped into months and years, any partition operation that acts on a subset of older partition groups (for example, &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/partition-functions/move-partitions-to-table/&#34;&gt;MOVE_PARTITIONS_TO_TABLE&lt;/a&gt; with &lt;em&gt;&lt;code&gt;force‑split&lt;/code&gt;&lt;/em&gt; set to true) is liable to split ROS containers into smaller ROS containers for each partition. These operations can lead to ROS pushback. If you anticipate frequent partition operations on hierarchically grouped partitions, &lt;a href=&#34;#CustomizingHierarchicalExpressions&#34;&gt;consider modifying the partition expression&lt;/a&gt; so partitions are grouped no higher than months.

&lt;/div&gt;
&lt;p&gt;&lt;a name=&#34;CustomizingHierarchicalExpressions&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;customizing-partition-group-hierarchies&#34;&gt;Customizing partition group hierarchies&lt;/h2&gt;
&lt;p&gt;Vertica provides a single function, CALENDAR_HIERARCHY_DAY, to facilitate hierarchical partitioning. Vertica stores the &lt;code&gt;GROUP BY&lt;/code&gt; clause as a CASE statement that you can edit to suit your own requirements.&lt;/p&gt;
&lt;p&gt;For example, Vertica stores the &lt;code&gt;store_orders&lt;/code&gt; partition clause as follows:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; ALTER TABLE public.store_orders
      PARTITION BY order_date::DATE
      GROUP BY CALENDAR_HIERARCHY_DAY(order_date::DATE, 2, 2);
=&amp;gt; select export_tables(&amp;#39;&amp;#39;,&amp;#39;store_orders&amp;#39;);
...
CREATE TABLE public.store_orders ( ... )

PARTITION BY ((store_orders.order_date)::date)
GROUP BY (
CASE WHEN (&amp;#34;datediff&amp;#34;(&amp;#39;year&amp;#39;, (store_orders.order_date)::date, ((now())::timestamptz(6))::date) &amp;gt;= 2)
       THEN (date_trunc(&amp;#39;year&amp;#39;, (store_orders.order_date)::date))::date
     WHEN (&amp;#34;datediff&amp;#34;(&amp;#39;month&amp;#39;, (store_orders.order_date)::date, ((now())::timestamptz(6))::date) &amp;gt;= 2)
       THEN (date_trunc(&amp;#39;month&amp;#39;, (store_orders.order_date)::date))::date
     ELSE (store_orders.order_date)::date END);
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You can modify the CASE statement to customize the hierarchy of partition groups. For example, the following CASE statement creates a hierarchy of months, days, and hours:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; ALTER TABLE store_orders
PARTITION BY (store_orders.order_date)
GROUP BY (
CASE WHEN DATEDIFF(&amp;#39;MONTH&amp;#39;, store_orders.order_date, NOW()::TIMESTAMPTZ(6)) &amp;gt;= 2
       THEN DATE_TRUNC(&amp;#39;MONTH&amp;#39;, store_orders.order_date::DATE)
     WHEN DATEDIFF(&amp;#39;DAY&amp;#39;, store_orders.order_date, NOW()::TIMESTAMPTZ(6)) &amp;gt;= 2
       THEN DATE_TRUNC(&amp;#39;DAY&amp;#39;, store_orders.order_date::DATE)
     ELSE DATE_TRUNC(&amp;#39;HOUR&amp;#39;, store_orders.order_date) END);
&lt;/code&gt;&lt;/pre&gt;
      </description>
    </item>
    
    <item>
      <title>Admin: Partitioning and segmentation</title>
      <link>/en/admin/partitioning-tables/partitioning-and-segmentation/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/partitioning-tables/partitioning-and-segmentation/</guid>
      <description>
        
        
        &lt;p&gt;In Vertica, partitioning and segmentation are separate concepts and achieve different goals to localize data:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Segmentation&lt;/strong&gt; refers to organizing and distributing data across cluster nodes for distributed computing and query performance. Segmentation aims to distribute data evenly across multiple database nodes so all nodes participate in query execution. You specify segmentation with the 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/statements/create-statements/create-projection/&#34;&gt;CREATE PROJECTION&lt;/a&gt;&lt;/code&gt; statement&#39;s &lt;a href=&#34;../../../en/sql-reference/statements/create-statements/create-projection/hash-segmentation-clause/&#34;&gt;hash segmentation clause&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Partitioning&lt;/strong&gt; specifies how to organize data within individual nodes for fast data purges and query performance. Node partitions let you easily identify data you wish to drop and help reclaim disk space. You specify partitioning with the 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/statements/create-statements/create-table/&#34;&gt;CREATE TABLE&lt;/a&gt;&lt;/code&gt; statement&#39;s &lt;code&gt;PARTITION BY&lt;/code&gt; clause.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
For example, partitioning data by year makes sense for retaining and dropping annual data. However, segmenting the same data by year would be inefficient, because the node holding data for the current year would likely answer far more queries than the other nodes.&lt;/p&gt;
&lt;p&gt;The following diagram illustrates the flow of segmentation and partitioning on a four-node database cluster:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Example table data&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Data segmented by &lt;code&gt;HASH(order_id)&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Data segmented by hash across four nodes&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Data partitioned by year on a single node&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;While partitioning occurs on all four nodes, the illustration shows partitioned data on one node for simplicity.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;../../../images/partitioning/datapart5.png&#34; alt=&#34;data partition vs segmentation&#34;&gt;&lt;/p&gt;
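&lt;p&gt;A single table definition can combine both techniques. The following sketch mirrors the diagram by segmenting on &lt;code&gt;HASH(order_id)&lt;/code&gt; and partitioning by year. The table name &lt;code&gt;order_history&lt;/code&gt; and its columns are hypothetical and do not appear elsewhere in this documentation:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE TABLE public.order_history
(
    order_id int NOT NULL,
    order_date date NOT NULL,
    amount numeric(10,2)
)
SEGMENTED BY HASH(order_id) ALL NODES         -- distributes rows evenly across nodes
PARTITION BY EXTRACT(year FROM order_date);   -- organizes rows by year within each node
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Segmenting on a high-cardinality column such as &lt;code&gt;order_id&lt;/code&gt; spreads query work across nodes, while the yearly partitions keep each year&#39;s data easy to drop or archive.&lt;/p&gt;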
&lt;h2 id=&#34;see-also&#34;&gt;See also&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/admin/managing-db/managing-disk-space/reclaiming-disk-space-from-deleted-table-data/&#34;&gt;Reclaiming disk space from deleted table data&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/data-analysis/query-optimization/join-queries/identical-segmentation/&#34;&gt;Identical segmentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/admin/projections/segmented-projections/&#34;&gt;Segmented projections&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/admin/projections/unsegmented-projections/&#34;&gt;Unsegmented projections&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/statements/create-statements/create-projection/&#34;&gt;CREATE PROJECTION&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/statements/create-statements/create-table/&#34;&gt;CREATE TABLE&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Admin: Managing partitions</title>
      <link>/en/admin/partitioning-tables/managing-partitions/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/partitioning-tables/managing-partitions/</guid>
      <description>
        
        
        &lt;p&gt;You can manage partitions with the following operations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/admin/partitioning-tables/managing-partitions/dropping-partitions/&#34;&gt;Drop partitions&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/admin/partitioning-tables/managing-partitions/archiving-partitions/&#34;&gt;Archive partitions&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/admin/partitioning-tables/managing-partitions/swapping-partitions/&#34;&gt;Swap partitions&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/admin/partitioning-tables/managing-partitions/minimizing-partitions/&#34;&gt;Minimize partitions&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/admin/partitioning-tables/managing-partitions/viewing-partition-storage-data/&#34;&gt;View partition storage&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Admin: Active and inactive partitions</title>
      <link>/en/admin/partitioning-tables/active-and-inactive-partitions/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/partitioning-tables/active-and-inactive-partitions/</guid>
      <description>
        
        
        &lt;p&gt;The Tuple Mover assumes that all loads and updates to a partitioned table are targeted to one or more partitions that it identifies as &lt;em&gt;active&lt;/em&gt;. In general, the partitions with the largest partition keys (typically, the most recently created partitions) are regarded as active. As a partition ages, its workload typically shrinks and becomes mostly read-only.&lt;/p&gt;

&lt;h2 id=&#34;setting-active-partition-count&#34;&gt;Setting active partition count&lt;/h2&gt;
&lt;p&gt;You can specify how many partitions are active for partitioned tables at two levels, in ascending order of precedence:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Configuration parameter &lt;a href=&#34;../../../en/sql-reference/config-parameters/tuple-mover-parameters/&#34;&gt;ActivePartitionCount&lt;/a&gt; determines how many partitions are active for partitioned tables in the database. By default, ActivePartitionCount is set to 1. The Tuple Mover applies this setting to all tables that do not set their own active partition count.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Individual tables can supersede ActivePartitionCount by setting their own active partition count with &lt;a href=&#34;../../../en/sql-reference/statements/create-statements/create-table/&#34;&gt;CREATE TABLE&lt;/a&gt; and &lt;a href=&#34;../../../en/sql-reference/statements/alter-statements/alter-table/&#34;&gt;ALTER TABLE&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Partitioned tables in the same database can be subject to different distributions of update and load activity. When these differences are significant, it might make sense for some tables to set their own active partition counts.&lt;/p&gt;
&lt;p&gt;For example, table &lt;code&gt;store_orders&lt;/code&gt; is partitioned by month and gets its active partition count from configuration parameter &lt;code&gt;ActivePartitionCount&lt;/code&gt;. If the parameter is set to 1, the Tuple Mover identifies the latest month (typically, the current one) as the table&#39;s active partition. If &lt;code&gt;store_orders&lt;/code&gt; is subject to frequent activity on data for the current month and the one before it, you might want the table to supersede the configuration parameter, and set its active partition count to 2:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;ALTER TABLE public.store_orders SET ACTIVEPARTITIONCOUNT 2;
&lt;/code&gt;&lt;/pre&gt;
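&lt;p&gt;The same count can also be set at the database level or when a table is created. The following is a sketch under stated assumptions: it assumes that &lt;code&gt;ActivePartitionCount&lt;/code&gt; can be set with ALTER DATABASE like other configuration parameters, and that the partition clause accepts an &lt;code&gt;ACTIVEPARTITIONCOUNT&lt;/code&gt; option. The table &lt;code&gt;web_events&lt;/code&gt; is hypothetical. Check the CREATE TABLE and configuration parameter references before relying on this syntax:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; ALTER DATABASE DEFAULT SET ActivePartitionCount = 2;  -- database level: applies to tables without their own count
=&amp;gt; CREATE TABLE public.web_events
(
    event_id int,
    event_date date NOT NULL
)
PARTITION BY event_date ACTIVEPARTITIONCOUNT 2;          -- table level: supersedes the database setting
&lt;/code&gt;&lt;/pre&gt;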
&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

For a table partitioned by non-temporal attributes, set its active partition count to reflect the number of partitions that are subject to a high level of activity, such as frequent loads or queries.

&lt;/div&gt;
&lt;p&gt;&lt;a name=&#34;Identify&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;identifying-the-active-partition&#34;&gt;Identifying the active partition&lt;/h2&gt;
&lt;p&gt;The Tuple Mover typically identifies the active partition as the one most recently created. Vertica uses the following algorithm to determine which partitions are older than others:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;If partition X was created before partition Y, partition X is older.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If partitions X and Y were created at the same time, but partition X was last updated before partition Y, partition X is older.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If partitions X and Y were created and last updated at the same time, the partition with the smaller key is older.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;You can obtain the active partitions for a table by joining system tables 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/system-tables/v-monitor-schema/partitions/&#34;&gt;PARTITIONS&lt;/a&gt;&lt;/code&gt; and 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/system-tables/v-monitor-schema/strata/&#34;&gt;STRATA&lt;/a&gt;&lt;/code&gt; and querying on its projections. For example, the following query gets the active partition for projection &lt;code&gt;store_orders_super&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT p.node_name, p.partition_key, p.ros_id, p.ros_size_bytes, p.ros_row_count, ROS_container_count
     FROM partitions p JOIN strata s ON p.partition_key = s.stratum_key AND p.node_name=s.node_name
     WHERE p.projection_name = &amp;#39;store_orders_super&amp;#39; ORDER BY p.node_name, p.partition_key;
    node_name     | partition_key |      ros_id       | ros_size_bytes | ros_row_count | ROS_container_count
------------------+---------------+-------------------+----------------+---------------+---------------------
 v_vmart_node0001 | 2017-09-01    | 45035996279322851 |           6905 |           960 |                   1
 v_vmart_node0002 | 2017-09-01    | 49539595906590663 |           6905 |           960 |                   1
 v_vmart_node0003 | 2017-09-01    | 54043195533961159 |           6905 |           960 |                   1
(3 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;active-partition-groups&#34;&gt;Active partition groups&lt;/h2&gt;
&lt;p&gt;If a table&#39;s partition clause includes a &lt;code&gt;GROUP BY&lt;/code&gt; expression, Vertica applies the table&#39;s active partition count to its largest partition group key, and regards all the partitions in that group as active. If you group partitions with Vertica meta-function 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/functions/management-functions/partition-functions/calendar-hierarchy-day/&#34;&gt;CALENDAR_HIERARCHY_DAY&lt;/a&gt;&lt;/code&gt;, the most recent date partitions are also grouped by day. Thus, the largest partition group key and largest partition key are identical. In effect, this means that only the most recent partitions are active.&lt;/p&gt;
&lt;p&gt;For more information about partition grouping, see &lt;a href=&#34;../../../en/admin/partitioning-tables/defining-partitions/partition-grouping/&#34;&gt;Partition grouping&lt;/a&gt; and &lt;a href=&#34;../../../en/admin/partitioning-tables/hierarchical-partitioning/&#34;&gt;Hierarchical partitioning&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Admin: Partition pruning</title>
      <link>/en/admin/partitioning-tables/partition-pruning/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/partitioning-tables/partition-pruning/</guid>
      <description>
        
        
        &lt;p&gt;If a query predicate specifies a partitioning expression, the query optimizer evaluates the predicate against the &lt;a class=&#34;glosslink&#34; href=&#34;../../../en/glossary/ros-read-optimized-store/&#34; title=&#34;Read Optimized Store (ROS) is a highly optimized, read-oriented, disk storage structure, organized by projection.&#34;&gt;ROS&lt;/a&gt; containers of the partitioned data. Each ROS container maintains the minimum and maximum values of its partition key data. The query optimizer uses this metadata to determine which ROS containers it needs to execute the query, and omits, or &lt;em&gt;prunes&lt;/em&gt;, the remaining containers from the query plan. By minimizing the number of ROS containers that it must scan, the query optimizer enables faster execution of the query.&lt;/p&gt;
&lt;p&gt;For example, a table might be partitioned by year as follows:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE TABLE ... PARTITION BY EXTRACT(year FROM date);
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Given this table definition, its projection data is partitioned into ROS containers according to year, one for each year—in this case, 2007, 2008, 2009.&lt;/p&gt;
&lt;p&gt;The following query specifies the partition expression &lt;code&gt;date&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ... WHERE date = &amp;#39;12-2-2009&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Given this query, the ROS containers that contain data for 2007 and 2008 fall outside the boundaries of the requested year (2009). The query optimizer prunes these containers from the query plan before the query executes.&lt;/p&gt;


&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;Assume a table that is partitioned by time and will use queries that restrict data on time.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE TABLE time ( tdate DATE NOT NULL, tnum INTEGER)
     PARTITION BY EXTRACT(year FROM tdate);
=&amp;gt; CREATE PROJECTION time_p (tdate, tnum) AS
     SELECT * FROM time ORDER BY tdate, tnum UNSEGMENTED ALL NODES;
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

Projection sort order has no effect on partition pruning.

&lt;/div&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; INSERT INTO time VALUES (&amp;#39;03/15/04&amp;#39; , 1);
=&amp;gt; INSERT INTO time VALUES (&amp;#39;03/15/05&amp;#39; , 2);
=&amp;gt; INSERT INTO time VALUES (&amp;#39;03/15/06&amp;#39; , 3);
=&amp;gt; INSERT INTO time VALUES (&amp;#39;03/15/06&amp;#39; , 4);
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The data inserted by the previous series of commands is loaded into three ROS containers, one per year, because the table is partitioned by year:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT * FROM time ORDER BY tnum;
   tdate    | tnum
------------+------
 2004-03-15 |    1  --ROS1 (min 03/15/04, max 03/15/04)
 2005-03-15 |    2  --ROS2 (min 03/15/05, max 03/15/05)
 2006-03-15 |    3  --ROS3 (min 03/15/06, max 03/15/06)
 2006-03-15 |    4  --ROS3 (min 03/15/06, max 03/15/06)
(4 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Here&#39;s what happens when you query the &lt;code&gt;time&lt;/code&gt; table:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;In this query, Vertica can omit containers ROS2 and ROS3 because it is only looking for year 2004:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT COUNT(*) FROM time WHERE tdate = &amp;#39;05/07/2004&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the next query, Vertica can omit two containers, ROS1 and ROS3:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT COUNT(*) FROM time WHERE tdate = &amp;#39;10/07/2005&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The following query has an additional predicate on the &lt;code&gt;tnum&lt;/code&gt; column, for which no minimum/maximum values are maintained. In addition, partition pruning does not support the logical operator OR, so no ROS containers are eliminated:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT COUNT(*) FROM time WHERE tdate = &amp;#39;05/07/2004&amp;#39; OR tnum = 7;
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ul&gt;
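&lt;p&gt;To see which ROS containers hold each year&#39;s data before reasoning about pruning, you can query the PARTITIONS system table. This is a sketch that assumes the &lt;code&gt;time&lt;/code&gt; table and &lt;code&gt;time_p&lt;/code&gt; projection from the example above. The ILIKE pattern is an assumption that allows for any suffix added to the projection name:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT node_name, partition_key, ros_id, ros_row_count
     FROM partitions
     WHERE projection_name ILIKE &amp;#39;%time_p%&amp;#39;
     ORDER BY node_name, partition_key;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the rows were loaded as described, each year should appear as its own partition key with its own &lt;code&gt;ros_id&lt;/code&gt; on every node, corresponding to the ROS1 through ROS3 labels above.&lt;/p&gt;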

      </description>
    </item>
    
  </channel>
</rss>
