Analytic function path

Vertica attempts to optimize multiple SQL-99 Analytic Functions from the same query by grouping them together in Analytic Group areas.

Vertica attempts to optimize multiple SQL-99 Analytic functions from the same query by grouping them together in Analytic Group areas.

For each analytical group, Vertica performs a distributed sort and resegment of the data, if necessary.

You can tell how many sorts and resegments are required based on the query plan.

For example, the following query plan shows that the FIRST_VALUE and LAST_VALUE functions are in the same analytic group because their OVER clause is the same. In contrast, ROW_NUMBER() has a different ORDER BY clause, so it is in a different analytic group. Because both groups share the same PARTITION BY deal_stage clause, the data does not need to be resegmented between groups :

EXPLAIN SELECT
  first_value(deal_size) OVER (PARTITION BY deal_stage
  ORDER BY deal_size),
  last_value(deal_size) OVER (PARTITION BY deal_stage
  ORDER BY deal_size),
  row_number() OVER (PARTITION BY deal_stage
  ORDER BY largest_bill_amount)
  FROM customer_dimension;

 Access Path:
 +-ANALYTICAL [Cost: 1K, Rows: 50K] (PATH ID: 1)
 |  Analytic Group
 |   Functions: row_number()
 |   Group Sort: customer_dimension.deal_stage ASC, customer_dimension.largest_bill_amount ASC NULLS LAST
 |  Analytic Group
 |   Functions: first_value(), last_value()
 |   Group Filter: customer_dimension.deal_stage
 |   Group Sort: customer_dimension.deal_stage ASC, customer_dimension.deal_size ASC NULL LAST
 |  Execute on: All Nodes
 | +---> STORAGE ACCESS for customer_dimension [Cost: 263, Rows: 50K]
         (PATH ID: 2)
 | |      Projection: public.customer_dimension_DBD_1_rep_vmart_vmart_node0001
 | |      Materialize: customer_dimension.largest_bill_amount,
          customer_dimension.deal_stage, customer_dimension.deal_size
 | |      Execute on: All Nodes

See also

Invoking analytic functions