<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OpenText Analytics Database 26.2.x – Data aggregation</title>
    <link>/en/data-analysis/data-aggregation/</link>
    <description>Recent content in Data aggregation on OpenText Analytics Database 26.2.x</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/data-analysis/data-aggregation/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Data-Analysis: Single-level aggregation</title>
      <link>/en/data-analysis/data-aggregation/single-level-aggregation/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-analysis/data-aggregation/single-level-aggregation/</guid>
      <description>
        
        
        &lt;p&gt;The simplest &lt;code&gt;GROUP BY&lt;/code&gt; queries aggregate data at a single level. For example, a table might contain the following information about family expenses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Category&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Amount spent on that category during the year&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Year&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Table data might look like this:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT * FROM expenses ORDER BY Category;
 Year |  Category  | Amount
------+------------+--------
 2005 | Books      |  39.98
 2007 | Books      |  29.99
 2008 | Books      |  29.99
 2006 | Electrical | 109.99
 2005 | Electrical | 109.99
 2007 | Electrical | 229.98
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You can use aggregate functions to get the total expenses per category or per year:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT SUM(Amount), Category FROM expenses GROUP BY Category;
 SUM     | Category
---------+------------
  99.96  | Books
 449.96  | Electrical
=&amp;gt; SELECT SUM(Amount), Year FROM expenses GROUP BY Year;
 SUM    | Year
--------+------
 149.97 | 2005
 109.99 | 2006
  29.99 | 2008
 259.97 | 2007
&lt;/code&gt;&lt;/pre&gt;
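&lt;p&gt;To try the queries above, a table along the following lines could be used. This is only a sketch: the column types are assumptions (the sample shows just column names and values), and the rows are copied from the sample output. The final query adds &lt;code&gt;ORDER BY&lt;/code&gt;, because a plain &lt;code&gt;GROUP BY&lt;/code&gt; does not guarantee row order.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Assumed column types; only the names and values come from the sample data.
=&amp;gt; CREATE TABLE expenses (Year INT, Category VARCHAR(20), Amount NUMERIC(10,2));
=&amp;gt; INSERT INTO expenses VALUES (2005, &#39;Books&#39;, 39.98);
=&amp;gt; INSERT INTO expenses VALUES (2007, &#39;Books&#39;, 29.99);
=&amp;gt; INSERT INTO expenses VALUES (2008, &#39;Books&#39;, 29.99);
=&amp;gt; INSERT INTO expenses VALUES (2005, &#39;Electrical&#39;, 109.99);
=&amp;gt; INSERT INTO expenses VALUES (2006, &#39;Electrical&#39;, 109.99);
=&amp;gt; INSERT INTO expenses VALUES (2007, &#39;Electrical&#39;, 229.98);
=&amp;gt; COMMIT;
-- Same per-year totals as above, but sorted by year.
=&amp;gt; SELECT Year, SUM(Amount) FROM expenses GROUP BY Year ORDER BY Year;
&lt;/code&gt;&lt;/pre&gt;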
      </description>
    </item>
    
    <item>
      <title>Data-Analysis: Multi-level aggregation</title>
      <link>/en/data-analysis/data-aggregation/multi-level-aggregation/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-analysis/data-aggregation/multi-level-aggregation/</guid>
      <description>
        
        
        &lt;p&gt;Over time, tables that are updated frequently can contain large amounts of data. Using the simple table shown earlier, suppose you want a multilevel query, such as the total expenses per category per year.&lt;/p&gt;
&lt;p&gt;The following query uses the &lt;code&gt;ROLLUP&lt;/code&gt; aggregation with the SUM function to calculate the total expenses by category and the overall expenses total. The NULL fields indicate subtotal values in the aggregation.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;When only the &lt;code&gt;Year&lt;/code&gt; column is &lt;code&gt;NULL&lt;/code&gt;, the row is the subtotal for one &lt;code&gt;Category&lt;/code&gt; value across all years.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;When both the &lt;code&gt;Year&lt;/code&gt; and &lt;code&gt;Category&lt;/code&gt; columns are &lt;code&gt;NULL&lt;/code&gt;, the row is the grand total of all &lt;code&gt;Amount&lt;/code&gt; values, across every category and year.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;ORDER BY&lt;/code&gt; clause sorts the results by expense category, by the year the expenses took place, and by the &lt;code&gt;GROUP BY&lt;/code&gt; level that the &lt;code&gt;GROUPING_ID&lt;/code&gt; function identifies:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT Category, Year, SUM(Amount) FROM expenses
   GROUP BY ROLLUP(Category, Year) ORDER BY Category, Year, GROUPING_ID();
 Category   | Year |  SUM
------------+------+--------
 Books      | 2005 |  39.98
 Books      | 2007 |  29.99
 Books      | 2008 |  29.99
 Books      |      |  99.96
 Electrical | 2005 | 109.99
 Electrical | 2006 | 109.99
 Electrical | 2007 | 229.98
 Electrical |      | 449.96
            |      | 549.92
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Similarly, the following query calculates the total expenses by year and the overall expenses total, and then uses the &lt;code&gt;ORDER BY&lt;/code&gt; clause to sort the results:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT Category, Year, SUM(Amount) FROM expenses
   GROUP BY ROLLUP(Year, Category) ORDER BY 2, 1, GROUPING_ID();
 Category   | Year |  SUM
------------+------+--------
 Books      | 2005 |  39.98
 Electrical | 2005 | 109.99
            | 2005 | 149.97
 Electrical | 2006 | 109.99
            | 2006 | 109.99
 Books      | 2007 |  29.99
 Electrical | 2007 | 229.98
            | 2007 | 259.97
 Books      | 2008 |  29.99
            | 2008 |  29.99
            |      | 549.92
(11 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You can use the &lt;code&gt;CUBE&lt;/code&gt; aggregate to perform all possible groupings of the category and year expenses. The following query returns all possible groupings, ordered by grouping:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT Category, Year, SUM(Amount) FROM expenses
   GROUP BY CUBE(Category, Year) ORDER BY 1, 2, GROUPING_ID();
 Category   | Year |  SUM
------------+------+--------
 Books      | 2005 |  39.98
 Books      | 2007 |  29.99
 Books      | 2008 |  29.99
 Books      |      |  99.96
 Electrical | 2005 | 109.99
 Electrical | 2006 | 109.99
 Electrical | 2007 | 229.98
 Electrical |      | 449.96
            | 2005 | 149.97
            | 2006 | 109.99
            | 2007 | 259.97
            | 2008 |  29.99
            |      | 549.92
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The results include subtotals for each category and each year and a total ($549.92) for all transactions, regardless of year or category.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;ROLLUP&lt;/code&gt;, &lt;code&gt;CUBE&lt;/code&gt;, and &lt;code&gt;GROUPING SETS&lt;/code&gt; generate &lt;code&gt;NULL&lt;/code&gt; values in grouping columns to identify subtotals. If table data includes &lt;code&gt;NULL&lt;/code&gt; values, differentiating these from &lt;code&gt;NULL&lt;/code&gt; values in subtotals can sometimes be challenging.&lt;/p&gt;
&lt;p&gt;In the preceding output, a &lt;code&gt;NULL&lt;/code&gt; value in the &lt;code&gt;Year&lt;/code&gt; column of a row that has a &lt;code&gt;Category&lt;/code&gt; value indicates that the row was grouped on the &lt;code&gt;Category&lt;/code&gt; column alone, rather than on both columns. In this case, &lt;code&gt;CUBE&lt;/code&gt; added the &lt;code&gt;NULL&lt;/code&gt; value to indicate the subtotal row; &lt;code&gt;ROLLUP&lt;/code&gt; marks its subtotal rows the same way.&lt;/p&gt;
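&lt;p&gt;One way to tell the two kinds of &lt;code&gt;NULL&lt;/code&gt; apart is the &lt;code&gt;GROUPING&lt;/code&gt; function, which returns 1 when a &lt;code&gt;NULL&lt;/code&gt; in the given column stands for a subtotal and 0 otherwise. The following is a minimal sketch against the same &lt;code&gt;expenses&lt;/code&gt; table; the aliases &lt;code&gt;category_is_subtotal&lt;/code&gt; and &lt;code&gt;year_is_subtotal&lt;/code&gt; are illustrative names chosen here:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Flag columns (illustrative aliases) show which NULLs were generated by ROLLUP.
=&amp;gt; SELECT Category, Year, SUM(Amount),
          GROUPING(Category) AS category_is_subtotal,
          GROUPING(Year) AS year_is_subtotal
   FROM expenses
   GROUP BY ROLLUP(Category, Year)
   ORDER BY Category, Year, GROUPING_ID();
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In the per-category subtotal rows, &lt;code&gt;year_is_subtotal&lt;/code&gt; should be 1 while &lt;code&gt;category_is_subtotal&lt;/code&gt; stays 0. In the grand-total row, both flags should be 1.&lt;/p&gt;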

      </description>
    </item>
    
    <item>
      <title>Data-Analysis: Aggregates and functions for multilevel grouping</title>
      <link>/en/data-analysis/data-aggregation/aggregates-and-functions-multilevel-grouping/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-analysis/data-aggregation/aggregates-and-functions-multilevel-grouping/</guid>
      <description>
        
        
        &lt;p&gt;OpenText™ Analytics Database provides several aggregates and functions that group the results of a GROUP BY query at multiple levels.&lt;/p&gt;
&lt;h2 id=&#34;aggregates-for-multilevel-grouping&#34;&gt;Aggregates for multilevel grouping&lt;/h2&gt;
&lt;p&gt;Use the following aggregates for multilevel grouping:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/statements/select/group-by-clause/rollup-aggregate/#&#34;&gt;ROLLUP&lt;/a&gt;&lt;/code&gt; automatically performs subtotal aggregations. ROLLUP performs one or more aggregations across multiple dimensions, at different levels.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/statements/select/group-by-clause/cube-aggregate/#&#34;&gt;CUBE&lt;/a&gt;&lt;/code&gt; performs the aggregation for all possible combinations of the columns in the CUBE expression that you specify.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/statements/select/group-by-clause/grouping-sets-aggregate/#&#34;&gt;GROUPING SETS&lt;/a&gt;&lt;/code&gt; lets you specify which groupings of aggregations you need.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can use CUBE or ROLLUP expressions inside GROUPING SETS expressions. Otherwise, you cannot nest multilevel aggregate expressions.&lt;/p&gt;
&lt;h2 id=&#34;grouping-functions&#34;&gt;Grouping functions&lt;/h2&gt;
&lt;p&gt;You can use the following three grouping functions with ROLLUP, CUBE, and GROUPING SETS:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/sql-reference/functions/aggregate-functions/group-id/&#34;&gt;GROUP_ID&lt;/a&gt; returns one or more numbers, starting with zero (0), to uniquely identify duplicate sets.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/sql-reference/functions/aggregate-functions/grouping-id/&#34;&gt;GROUPING_ID&lt;/a&gt; produces a unique ID for each grouping combination.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/sql-reference/functions/aggregate-functions/grouping/&#34;&gt;GROUPING&lt;/a&gt; identifies for each grouping combination whether a column is a part of this grouping. This function also differentiates NULL values in the data from NULL grouping subtotals.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These functions are typically used with multilevel aggregates.&lt;/p&gt;
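&lt;p&gt;As an illustration, the following sketch labels each row of a &lt;code&gt;CUBE&lt;/code&gt; result with its grouping level. It assumes the &lt;code&gt;expenses&lt;/code&gt; table (columns &lt;code&gt;Category&lt;/code&gt;, &lt;code&gt;Year&lt;/code&gt;, &lt;code&gt;Amount&lt;/code&gt;) used elsewhere in this section; the alias &lt;code&gt;grouping_level&lt;/code&gt; is an illustrative name:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- GROUPING_ID combines the per-column GROUPING results into one number.
=&amp;gt; SELECT Category, Year, SUM(Amount),
          GROUPING_ID(Category, Year) AS grouping_level
   FROM expenses
   GROUP BY CUBE(Category, Year)
   ORDER BY grouping_level, Category, Year;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Because &lt;code&gt;GROUPING_ID&lt;/code&gt; treats the &lt;code&gt;GROUPING&lt;/code&gt; results as bits, with &lt;code&gt;Category&lt;/code&gt; as the high bit, the expected values here are 0 for category-and-year rows, 1 for per-category subtotals, 2 for per-year subtotals, and 3 for the grand total.&lt;/p&gt;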

      </description>
    </item>
    
    <item>
      <title>Data-Analysis: Aggregate expressions for GROUP BY</title>
      <link>/en/data-analysis/data-aggregation/aggregate-expressions-group-by/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-analysis/data-aggregation/aggregate-expressions-group-by/</guid>
      <description>
        
        
        &lt;p&gt;You can include CUBE and ROLLUP aggregates within a GROUPING SETS aggregate. Be aware that the CUBE and ROLLUP aggregates can result in a large amount of output. However, you can avoid large outputs by using GROUPING SETS to return only specified results.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;...GROUP BY a,b,c,d,ROLLUP(a,b)...
...GROUP BY a,b,c,d,CUBE((a,b),c,d)...
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You cannot include any aggregates in a CUBE or ROLLUP aggregate expression.&lt;/p&gt;
&lt;p&gt;You can append multiple GROUPING SETS, CUBE, or ROLLUP aggregates in the same query.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;...GROUP BY a,b,c,d,CUBE(a,b),ROLLUP (c,d)...
...GROUP BY a,b,c,d,GROUPING SETS ((a,d),(b,c),CUBE(a,b))...
...GROUP BY a,b,c,d,GROUPING SETS ((a,d),(b,c),(a,b),(a),(b),())...
&lt;/code&gt;&lt;/pre&gt;
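&lt;p&gt;As a concrete sketch of limiting output with &lt;code&gt;GROUPING SETS&lt;/code&gt;, the following query assumes an &lt;code&gt;expenses&lt;/code&gt; table with &lt;code&gt;Category&lt;/code&gt;, &lt;code&gt;Year&lt;/code&gt;, and &lt;code&gt;Amount&lt;/code&gt; columns, as used in the other topics of this section. It requests only the per-category subtotals, the per-year subtotals, and the grand total, and skips the individual category-and-year rows that &lt;code&gt;CUBE(Category, Year)&lt;/code&gt; would also return:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Three grouping sets: by category, by year, and the empty set (grand total).
=&amp;gt; SELECT Category, Year, SUM(Amount) FROM expenses
   GROUP BY GROUPING SETS ((Category), (Year), ())
   ORDER BY Category, Year;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;With the six sample rows shown in those topics, this should return seven rows (two categories, four years, one grand total) instead of the 13 rows that the equivalent &lt;code&gt;CUBE&lt;/code&gt; query returns.&lt;/p&gt;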
      </description>
    </item>
    
    <item>
      <title>Data-Analysis: Pre-aggregating data in projections</title>
      <link>/en/data-analysis/data-aggregation/pre-aggregating-data-projections/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-analysis/data-aggregation/pre-aggregating-data-projections/</guid>
      <description>
        
        
        &lt;p&gt;Queries that use aggregate functions such as &lt;a href=&#34;../../../en/sql-reference/functions/aggregate-functions/sum-aggregate/&#34;&gt;SUM&lt;/a&gt; and &lt;a href=&#34;../../../en/sql-reference/functions/aggregate-functions/count-aggregate/&#34;&gt;COUNT&lt;/a&gt; can perform more efficiently when they use projections that already contain the aggregated data. This improved efficiency is especially true for queries on large quantities of data.&lt;/p&gt;
&lt;p&gt;For example, a power grid company reads 30 million smart meters that provide data at five-minute intervals. The company records each reading in a database table. Over a given year, three trillion records are added to this table.&lt;/p&gt;
&lt;p&gt;The power grid company can analyze these records with queries that include aggregate functions to perform the following tasks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Establish usage patterns.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Detect fraud.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Measure correlation to external events such as weather patterns or pricing changes.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To optimize query response time, you can create an aggregate projection, which stores the data after it is aggregated.&lt;/p&gt;
&lt;h2 id=&#34;aggregate-projections&#34;&gt;Aggregate projections&lt;/h2&gt;
&lt;p&gt;OpenText™ Analytics Database provides several types of projections for storing data that is returned from aggregate functions or expressions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/data-analysis/data-aggregation/pre-aggregating-data-projections/live-aggregate-projections/&#34;&gt;Live aggregate projection&lt;/a&gt;: Projection that contains columns with values that are aggregated from columns in its anchor table. You can also define live aggregate projections that include &lt;a href=&#34;../../../en/data-analysis/data-aggregation/pre-aggregating-data-projections/pre-aggregating-udtf-results/&#34;&gt;user-defined transform functions&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/data-analysis/data-aggregation/pre-aggregating-data-projections/top-k-projections/&#34;&gt;Top-K projection&lt;/a&gt;: Type of live aggregate projection that returns the top &lt;em&gt;&lt;code&gt;k&lt;/code&gt;&lt;/em&gt; rows from a partition of selected rows. Create a Top-K projection that satisfies the criteria for a Top-K query.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/data-analysis/data-aggregation/pre-aggregating-data-projections/pre-aggregating-udtf-results/&#34;&gt;Projection that pre-aggregates UDTF results&lt;/a&gt;: Live aggregate projection that invokes user-defined transform functions (UDTFs). To minimize overhead when you query projections of this type, the database processes the UDTFs in the background and stores their results on disk.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/data-analysis/data-aggregation/pre-aggregating-data-projections/aggregating-data-through-expressions/&#34;&gt;Projection that contains expressions&lt;/a&gt;: Projection with columns whose values are calculated from anchor table columns.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
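&lt;p&gt;For the smart-meter scenario described earlier, a live aggregate projection might look roughly like the following sketch. The table &lt;code&gt;meter_readings&lt;/code&gt;, its columns, and the projection name &lt;code&gt;meter_daily_usage&lt;/code&gt; are hypothetical names introduced for illustration; consult the live aggregate projection topic for the exact requirements that apply to your schema.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Hypothetical anchor table; names and types are assumptions for illustration.
=&amp;gt; CREATE TABLE meter_readings (
       meter_id INT,
       reading_time TIMESTAMP,
       kwh NUMERIC(10,3)
   );
-- Live aggregate projection: the GROUP BY columns come first in the SELECT list,
-- followed by the aggregate functions.
=&amp;gt; CREATE PROJECTION meter_daily_usage AS
   SELECT meter_id,
          reading_time::DATE AS reading_date,
          SUM(kwh) AS total_kwh,
          COUNT(*) AS num_readings
   FROM meter_readings
   GROUP BY meter_id, reading_time::DATE;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;A query that sums usage per meter per day can then read the pre-aggregated rows instead of scanning every five-minute reading.&lt;/p&gt;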

&lt;h2 id=&#34;recommended-use&#34;&gt;Recommended use&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Aggregate projections are most useful for queries against large sets of data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;For optimal query performance, a live aggregate projection (LAP) should be a small subset of the anchor table: ideally between 1 and 10 percent of the size of the anchor table, or smaller if possible.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;restrictions&#34;&gt;Restrictions&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;MERGE operations must be &lt;a href=&#34;../../../en/admin/working-with-native-tables/merging-table-data/merge-optimization/&#34;&gt;optimized&lt;/a&gt; if they are performed on target tables that have live aggregate projections.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You cannot update or delete data in temporary tables with live aggregate projections.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;requirements&#34;&gt;Requirements&lt;/h2&gt;
&lt;p&gt;In the event of manual recovery from an unclean database shutdown, live aggregate projections might require some time to refresh.&lt;/p&gt;

      </description>
    </item>
    
  </channel>
</rss>
