WITH clause

A WITH clause defines one or more named common table expressions (CTEs), where each CTE encapsulates a result set that can be referenced by another CTE in the same WITH clause, or by the primary query. Vertica can evaluate WITH clauses in two ways:

  • Inline expansion (default): Vertica evaluates each WITH clause every time it is referenced by the primary query.
  • Materialization: Vertica evaluates each WITH clause once, stores results in a temporary table, and references this table as often as the query requires.

In both cases, WITH clauses can help simplify complicated queries and avoid statement repetition.

Syntax

WITH [ /*+ENABLE_WITH_CLAUSE_MATERIALIZATION */ ] [ RECURSIVE ] {
   cte-identifier [ ( column-aliases ) ] AS (
   [ subordinate-WITH-clause ]
   query-expression )
} [,...]

Arguments

/*+ENABLE_WITH_CLAUSE_MATERIALIZATION*/
Enables materialization of all queries in the current WITH clause. Otherwise, materialization is set by configuration parameter WithClauseMaterialization, by default set to 0 (disabled). If WithClauseMaterialization is disabled, materialization is automatically cleared when the primary query of the WITH clause returns. For details, see Materialization of WITH clause.
RECURSIVE
Specifies to iterate over the WITH clause's own result set, through repeated execution of an embedded UNION or UNION ALL statement. For details, see WITH clause recursion.
cte-identifier
Identifies a common table expression (CTE) within a WITH clause. This identifier is available to CTEs of the same WITH clause, and of parent and child WITH clauses (if any). CTE identifiers of the outermost (primary) WITH clause are also available to the primary query.

All CTE identifiers of the same WITH clause must be unique. For example, the following WITH clause defines two CTEs, so they require unique identifiers: regional_sales and top_regions:

  
WITH
-- query sale amounts for each region
   regional_sales AS (SELECT ... ),
   top_regions AS ( SELECT ... )
column-aliases
A comma-delimited list of result set column aliases. The list of aliases must map to all column expressions in the CTE query. If omitted, result set columns can only be referenced by the names used in the query.

In the following example, the revenue CTE specifies two column aliases: vkey and total_revenue. These map to column vendor_key and aggregate expression SUM(total_order_cost), respectively. The primary query references these aliases:

  
WITH revenue ( vkey, total_revenue ) AS (
   SELECT vendor_key, SUM(total_order_cost)
   FROM store.store_orders_fact
   GROUP BY vendor_key ORDER BY vendor_key)
  
SELECT v.vendor_name, v.vendor_address, v.vendor_city, r.total_revenue
FROM vendor_dimension v JOIN revenue r ON v.vendor_key = r.vkey
WHERE r.total_revenue = (SELECT MAX(total_revenue) FROM revenue )
ORDER BY vendor_name;
subordinate-WITH-clause
A WITH clause that is nested within the current one. CTEs of this WITH clause can only reference CTEs of the same clause, and of parent and child WITH clauses.
query-expression
The query of a given CTE.
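
As a minimal sketch of a subordinate WITH clause, the following query nests one WITH clause inside the CTE top_vendors. The CTE names top_vendors and vendor_revenue and the LIMIT value are illustrative assumptions; the table is the same store.store_orders_fact used in the other examples:

```sql
-- Hypothetical example: CTE names and LIMIT value are illustrative.
WITH top_vendors AS (
   -- subordinate WITH clause, nested within the top_vendors CTE
   WITH vendor_revenue ( vkey, total_revenue ) AS (
      SELECT vendor_key, SUM(total_order_cost)
      FROM store.store_orders_fact
      GROUP BY vendor_key)
   -- query expression of top_vendors, which references the nested CTE
   SELECT vkey, total_revenue
   FROM vendor_revenue
   ORDER BY total_revenue DESC LIMIT 5)
-- primary query
SELECT vkey, total_revenue FROM top_vendors ORDER BY total_revenue DESC;
```

Here the primary query references only top_vendors, the CTE of the outermost WITH clause.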

Restrictions

WITH clauses only support SELECT and INSERT statements. They do not support UPDATE or DELETE statements.

Examples

Single WITH clause with single CTE

The following SQL defines a WITH clause with one CTE, revenue, which aggregates data in table store.store_orders_fact. The primary query references the WITH clause result set twice: in its JOIN clause and predicate:

-- define WITH clause
WITH revenue ( vkey, total_revenue ) AS (
      SELECT vendor_key, SUM(total_order_cost)
      FROM store.store_orders_fact
      GROUP BY vendor_key ORDER BY 1)
-- End WITH clause

-- primary query
SELECT v.vendor_name, v.vendor_address, v.vendor_city, r.total_revenue
FROM vendor_dimension v JOIN revenue r ON v.vendor_key = r.vkey
WHERE r.total_revenue = (SELECT MAX(total_revenue) FROM revenue )
ORDER BY vendor_name;
   vendor_name    | vendor_address | vendor_city | total_revenue
------------------+----------------+-------------+---------------
 Frozen Suppliers | 471 Mission St | Peoria      |      49877044
(1 row)

Single WITH clause and multiple CTEs

In the following example, the WITH clause contains two CTEs:

  • regional_sales totals sales for each region

  • top_regions uses the result set from regional_sales to identify the three regions with the highest sales

The primary query aggregates sales by region and departments in the top_regions result set:


WITH
-- query sale amounts for each region
   regional_sales (region, total_sales) AS (
        SELECT sd.store_region, SUM(of.total_order_cost) AS total_sales
        FROM store.store_dimension sd JOIN store.store_orders_fact of ON sd.store_key = of.store_key
        GROUP BY store_region ),
-- query previous result set
   top_regions AS (
        SELECT region, total_sales
        FROM regional_sales ORDER BY total_sales DESC LIMIT 3
     )

-- primary query
-- aggregate sales in top_regions result set
SELECT sd.store_region AS region, pd.department_description AS department, SUM(of.total_order_cost) AS product_sales
FROM store.store_orders_fact of
JOIN store.store_dimension sd ON sd.store_key = of.store_key
JOIN public.product_dimension pd ON of.product_key = pd.product_key
WHERE sd.store_region IN (SELECT region FROM top_regions)
GROUP BY ROLLUP (region, department) ORDER BY region, product_sales DESC, GROUPING_ID();

 region  |            department            | product_sales
---------+----------------------------------+---------------
 East    |                                  |    1716917786
 East    | Meat                             |     189837962
 East    | Produce                          |     170607880
 East    | Photography                      |     162271618
 East    | Frozen Goods                     |     141077867
 East    | Gifts                            |     137604397
 East    | Bakery                           |     136497842
 East    | Liquor                           |     130410463
 East    | Canned Goods                     |     128683257
 East    | Cleaning supplies                |     118996326
 East    | Dairy                            |     118866901
 East    | Seafood                          |     109986665
 East    | Medical                          |     100404891
 East    | Pharmacy                         |      71671717
 MidWest |                                  |    1287550770
 MidWest | Meat                             |     141446607
 MidWest | Produce                          |     125156100
 MidWest | Photography                      |     122666753
 MidWest | Frozen Goods                     |     105893534
 MidWest | Gifts                            |     103088595
 MidWest | Bakery                           |     102844467
 MidWest | Canned Goods                     |      97647270
 MidWest | Liquor                           |      97306898
 MidWest | Cleaning supplies                |      90775242
 MidWest | Dairy                            |      89065443
 MidWest | Seafood                          |      82541528
 MidWest | Medical                          |      76674814
 MidWest | Pharmacy                         |      52443519
 West    |                                  |    2159765937
 West    | Meat                             |     235841506
 West    | Produce                          |     215277204
 West    | Photography                      |     205949467
 West    | Frozen Goods                     |     178311593
 West    | Bakery                           |     172824555
 West    | Gifts                            |     172134780
 West    | Liquor                           |     164798022
 West    | Canned Goods                     |     163330813
 West    | Cleaning supplies                |     148776443
 West    | Dairy                            |     145244575
 West    | Seafood                          |     139464407
 West    | Medical                          |     126184049
 West    | Pharmacy                         |      91628523
         |                                  |    5164234493
(43 rows)

INSERT statement that includes WITH clause

The following SQL uses a WITH clause to insert data from a JOIN query into table total_store_sales:

CREATE TABLE total_store_sales (store_key int, region VARCHAR(20), store_sales numeric (12,2));

INSERT INTO total_store_sales
WITH store_sales AS (
        SELECT sd.store_key, sd.store_region::VARCHAR(20), SUM (of.total_order_cost)
        FROM store.store_dimension sd JOIN store.store_orders_fact of ON sd.store_key = of.store_key
        GROUP BY sd.store_region, sd.store_key ORDER BY sd.store_region, sd.store_key)
SELECT * FROM store_sales;

=> SELECT * FROM total_store_sales ORDER BY region, store_key;
 store_key |  region   | store_sales
-----------+-----------+-------------
         2 | East      | 47668303.00
         6 | East      | 48136354.00
        12 | East      | 46673113.00
        22 | East      | 48711211.00
        24 | East      | 48603836.00
        31 | East      | 46836469.00
        36 | East      | 48461449.00
        37 | East      | 48018279.00
        41 | East      | 48713084.00
        44 | East      | 47808362.00
        49 | East      | 46990023.00
        50 | East      | 47643329.00
         9 | MidWest   | 46851087.00
        15 | MidWest   | 48787354.00
        27 | MidWest   | 48497620.00
        29 | MidWest   | 47639234.00
        30 | MidWest   | 49013483.00
        38 | MidWest   | 48856012.00
        42 | MidWest   | 47297912.00
        45 | MidWest   | 48544521.00
        46 | MidWest   | 48887255.00
         4 | NorthWest | 47580215.00
        39 | NorthWest | 47136892.00
        47 | NorthWest | 48477574.00
         8 | South     | 48131455.00
        13 | South     | 47605422.00
        17 | South     | 46054367.00
...
(50 rows)

1 - Inline expansion of WITH clause

By default, Vertica uses inline expansion to evaluate WITH clauses. Vertica evaluates each WITH clause every time it is referenced by the primary query. Inline expansion often works best if the query does not reference the same WITH clause multiple times, or if some local optimizations are possible after inline expansion.

Example

The following example shows a WITH clause that is a good candidate for inline expansion. The WITH clause is used in a query that obtains order information for all orders shipped between December 1 and December 7, 2007:


-- Begin WITH
WITH store_orders_fact_new AS(
   SELECT * FROM store.store_orders_fact WHERE date_shipped between '2007-12-01' and '2007-12-07')
-- End WITH
-- Begin primary query
SELECT store_key, product_key, product_version, SUM(quantity_ordered*unit_price) AS total_price
FROM store_orders_fact_new
GROUP BY store_key, product_key, product_version
ORDER BY total_price DESC;

 store_key | product_key | product_version | total_price
-----------+-------------+-----------------+-------------
       232 |        1855 |               2 |       29008
       125 |        8500 |               4 |       28812
       139 |        3707 |               2 |       28812
       212 |        3203 |               1 |       28000
       236 |        8023 |               4 |       27548
       123 |       10598 |               2 |       27146
        34 |        8888 |               4 |       27100
       203 |        2243 |               1 |       27027
       117 |       13932 |               2 |       27000
        84 |         768 |               1 |       26936
       123 |        1038 |               1 |       26885
       106 |       18932 |               1 |       26864
        93 |       10395 |               3 |       26790
       162 |       13073 |               1 |       26754
        15 |        3679 |               1 |       26675
        52 |        5957 |               5 |       26656
       190 |        8114 |               3 |       26611
         5 |        7772 |               1 |       26588
       139 |        6953 |               3 |       26572
       202 |       14735 |               1 |       26404
       133 |        2740 |               1 |       26312
       198 |        8545 |               3 |       26287
       221 |        7582 |               2 |       26280
       127 |        9468 |               3 |       26224
        63 |        8115 |               4 |       25960
       171 |        2088 |               1 |       25650
       250 |       11210 |               3 |       25608
...

Vertica processes the query as follows:

  1. Expands the WITH clause reference to store_orders_fact_new within the primary query.

  2. After expanding the WITH clause, evaluates the primary query.
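
Conceptually, the expansion in step 1 resembles replacing the CTE reference with its defining query as a subquery. The following is only a sketch of that idea for the example above; the plan that Vertica actually generates may differ:

```sql
-- Sketch only: the CTE reference is replaced by its defining query.
SELECT store_key, product_key, product_version, SUM(quantity_ordered*unit_price) AS total_price
FROM (
   SELECT * FROM store.store_orders_fact
   WHERE date_shipped BETWEEN '2007-12-01' AND '2007-12-07'
) AS store_orders_fact_new
GROUP BY store_key, product_key, product_version
ORDER BY total_price DESC;
```

Because the CTE is referenced only once, expanding it inline does not evaluate the underlying query more than once.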

2 - Materialization of WITH clause

When materialization is enabled, Vertica evaluates each WITH clause once, stores results in a temporary table, and references this table as often as the query requires. Vertica drops the temporary table after primary query execution completes.

Materialization can facilitate better performance when WITH clauses are complex—for example, when the WITH clauses contain JOIN and GROUP BY clauses, and are referenced multiple times in the primary query.

If materialization is enabled, WITH statements perform an auto-commit of the user transaction. This occurs even when using EXPLAIN with the WITH statement.

Enabling WITH clause materialization

WITH materialization is set by configuration parameter WithClauseMaterialization, by default set to 0 (disabled). You can enable and disable materialization by setting WithClauseMaterialization at database and session levels, with ALTER DATABASE and ALTER SESSION, respectively:

  • Database:

    => ALTER DATABASE db-spec SET PARAMETER WithClauseMaterialization={ 0 | 1 };
    => ALTER DATABASE db-spec CLEAR PARAMETER WithClauseMaterialization;
    
  • Session: Parameter setting remains in effect until you explicitly clear it, or the session ends.

    => ALTER SESSION SET PARAMETER WithClauseMaterialization={ 0 | 1 };
    => ALTER SESSION CLEAR PARAMETER WithClauseMaterialization;
    

You can also enable WITH materialization for individual queries with the hint ENABLE_WITH_CLAUSE_MATERIALIZATION. Materialization is automatically cleared when the query returns. For example:


=> WITH /*+ENABLE_WITH_CLAUSE_MATERIALIZATION */ revenue AS (
      SELECT vendor_key, SUM(total_order_cost) AS total_revenue
      FROM store.store_orders_fact
      GROUP BY vendor_key ORDER BY 1)
     ...

Processing WITH clauses using EE5 temp relations

By default, when WITH clause queries are reused, Vertica saves those WITH clause query outputs in EE5 temp relations. However, this option can be changed. EE5 temp relation support is set by configuration parameter EnableWITHTempRelReuseLimit, which can be set in the following ways:

  • 0: Disables this feature.

  • 1: Force-saves all WITH clause queries into EE5 temp relations, whether or not they are reused.

  • 2 (default): Saves only reused WITH clause queries into EE5 temp relations.

  • 3 or more: Saves WITH clause queries into EE5 temp relations only when they are used at least this number of times.

EnableWITHTempRelReuseLimit can be set at database and session levels, with ALTER DATABASE and ALTER SESSION, respectively. When WithClauseMaterialization is set to 1, that setting overrides any EnableWITHTempRelReuseLimit settings.

Note that for WITH queries with complex types, temp relations are disabled.
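
For example, the following statements sketch how this parameter might be set and cleared at the session level. They assume that EnableWITHTempRelReuseLimit follows the same ALTER SESSION pattern shown earlier for WithClauseMaterialization; the value 3 is an arbitrary illustration:

```sql
-- Save WITH clause query output to temp relations only when the
-- WITH clause query is used at least three times (illustrative value).
ALTER SESSION SET PARAMETER EnableWITHTempRelReuseLimit=3;

-- Clear the session-level setting.
ALTER SESSION CLEAR PARAMETER EnableWITHTempRelReuseLimit;
```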

Example

The following example shows a WITH clause that is a good candidate for materialization. The query obtains data for the vendor who has the highest combined order cost for all orders:

-- Enable materialization
=> ALTER SESSION SET PARAMETER WithClauseMaterialization=1;

-- Define WITH clause
=> WITH revenue AS (
      SELECT vendor_key, SUM(total_order_cost) AS total_revenue
      FROM store.store_orders_fact
      GROUP BY vendor_key ORDER BY 1)
-- End WITH clause

-- Primary query
SELECT vendor_name, vendor_address, vendor_city, total_revenue
FROM vendor_dimension v, revenue r
WHERE v.vendor_key = r.vendor_key AND total_revenue = (SELECT MAX(total_revenue) FROM revenue )
ORDER BY vendor_name;
   vendor_name    | vendor_address | vendor_city | total_revenue
------------------+----------------+-------------+---------------
 Frozen Suppliers | 471 Mission St | Peoria      |      49877044
(1 row)

Vertica processes this query as follows:

  1. WITH clause revenue evaluates its SELECT statement from table store.store_orders_fact.

  2. Results of the revenue clause are stored in a local temporary table.

  3. Whenever the revenue clause statement is referenced, the results stored in the table are used.

  4. The temporary table is dropped when query execution is complete.
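
The four steps above are loosely comparable to staging the result set yourself in a local temporary table. The following sketch is only an analogy, not what Vertica executes internally; the table name revenue_tmp is hypothetical:

```sql
-- Steps 1-2: evaluate the CTE query once and store its results.
CREATE LOCAL TEMPORARY TABLE revenue_tmp ON COMMIT PRESERVE ROWS AS
   SELECT vendor_key, SUM(total_order_cost) AS total_revenue
   FROM store.store_orders_fact
   GROUP BY vendor_key;

-- Step 3: every reference reads the stored results.
SELECT vendor_name, vendor_address, vendor_city, total_revenue
FROM vendor_dimension v JOIN revenue_tmp r ON v.vendor_key = r.vendor_key
WHERE total_revenue = (SELECT MAX(total_revenue) FROM revenue_tmp)
ORDER BY vendor_name;

-- Step 4: drop the stored results.
DROP TABLE revenue_tmp;
```

With materialization enabled, Vertica handles the equivalent of these steps automatically, so the query itself does not need to change.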

3 - WITH clause recursion

A WITH clause that includes the RECURSIVE option iterates over its own output through repeated execution of a UNION or UNION ALL query. Recursive queries are useful when working with self-referential data—hierarchies such as manager-subordinate relationships, or tree-structured data such as taxonomies.

The configuration parameter WithClauseRecursionLimit—by default set to 8—sets the maximum depth of recursion. You can set this parameter at database and session scopes with ALTER DATABASE and ALTER SESSION, respectively. Recursion continues until it reaches the configured maximum depth, or until the last iteration returns with no data.

You specify a recursive WITH clause as follows:

WITH [ /*+ENABLE_WITH_CLAUSE_MATERIALIZATION*/ ] RECURSIVE
   cte-identifier [ ( column-aliases ) ] AS (
     non-recursive-term
     UNION [ ALL ]
     recursive-term
   )

Non-recursive and recursive terms are separated by UNION or UNION ALL:

  • The non-recursive-term query sets its result set in cte-identifier, which is subject to recursion in recursive-term.

  • The UNION statement's recursive-term recursively iterates over its own output. When recursion is complete, the results of all iterations are compiled and set in cte-identifier.

For example:


=> ALTER SESSION SET PARAMETER WithClauseRecursionLimit=4; -- maximum recursion depth = 4
=> WITH RECURSIVE nums (n) AS (
   SELECT 1 -- non-recursive (base) term
   UNION ALL
     SELECT n+1 FROM nums -- recursive term
  )
SELECT n FROM nums; -- primary query

This simple query executes as follows:

  1. Executes the WITH RECURSIVE clause:

    • Evaluates the non-recursive term SELECT 1, and places the result set—1—in nums.

    • Iterates over the UNION ALL query (SELECT n+1) until the number of iterations is greater than the configuration parameter WithClauseRecursionLimit.

    • Combines the results of all UNION queries and sets the result set in nums, and then exits to the primary query.

  2. Executes the primary query SELECT n FROM nums:

    
     n
    ---
     1
     2
     3
     4
     5
    (5 rows)
    

In this case, the WITH RECURSIVE clause exits after four iterations, as set by WithClauseRecursionLimit. If you restore WithClauseRecursionLimit to its default value of 8, the clause exits after eight iterations:


=> ALTER SESSION CLEAR PARAMETER WithClauseRecursionLimit;
=> WITH RECURSIVE nums (n) AS (
   SELECT 1
   UNION ALL
     SELECT n+1 FROM nums
  )
SELECT n FROM nums;
 n
---
 1
 2
 3
 4
 5
 6
 7
 8
 9
(9 rows)
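
Recursion can also end before the limit is reached, when an iteration returns no data. The following sketch adds a WHERE predicate (an illustrative assumption, not part of the examples above) so that the recursive term stops producing rows once n reaches 3:

```sql
WITH RECURSIVE nums (n) AS (
   SELECT 1                            -- non-recursive (base) term
   UNION ALL
   SELECT n+1 FROM nums WHERE n < 3    -- recursive term returns no rows once n = 3
  )
SELECT n FROM nums;                    -- primary query
```

With the default limit of 8, this query is expected to return the values 1 through 3, because the third iteration finds no row that satisfies n < 3.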

Restrictions

The following restrictions apply:

  • The SELECT list of a non-recursive term cannot include the wildcard * (asterisk) or the function MATCH_COLUMNS.

  • A recursive term can reference the target CTE only once.

  • Recursive reference cannot appear within an outer join.

  • Recursive reference cannot appear within a subquery.

  • WITH clauses do not support UNION options ORDER BY, LIMIT, and OFFSET.
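
Because the UNION in a recursive WITH clause cannot take ORDER BY, LIMIT, or OFFSET, one way to sort or trim recursive output is to apply these options in the primary query instead. A minimal sketch, which assumes that LIMIT is accepted in the primary query in the same way as ORDER BY:

```sql
WITH RECURSIVE nums (n) AS (
   SELECT 1
   UNION ALL
   SELECT n+1 FROM nums       -- no ORDER BY, LIMIT, or OFFSET here
  )
SELECT n FROM nums
ORDER BY n DESC LIMIT 3;      -- sort and trim in the primary query
```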

Examples

A small software company maintains the following data on employees and their managers:

=> SELECT * FROM personnel.employees ORDER BY emp_id;
 emp_id |   fname   |   lname   | section_id |    section_name     |  section_leader  | leader_id
--------+-----------+-----------+------------+---------------------+------------------+-----------
      0 | Stephen   | Mulligan  |          0 |                     |                  |
      1 | Michael   | North     |        201 | Development         | Zoe Black        |         3
      2 | Megan     | Berry     |        202 | QA                  | Richard Chan     |        18
      3 | Zoe       | Black     |        101 | Product Development | Renuka Patil     |        24
      4 | Tim       | James     |        203 | IT                  | Ebuka Udechukwu  |        17
      5 | Bella     | Tucker    |        201 | Development         | Zoe Black        |         3
      6 | Alexandra | Climo     |        202 | QA                  | Richard Chan     |        18
      7 | Leonard   | Gray      |        203 | IT                  | Ebuka Udechukwu  |        17
      8 | Carolyn   | Henderson |        201 | Development         | Zoe Black        |         3
      9 | Ryan      | Henderson |        201 | Development         | Zoe Black        |         3
     10 | Frank     | Tucker    |        205 | Sales               | Benjamin Glover  |        29
     11 | Nathan    | Ferguson  |        102 | Sales Marketing     | Eric Redfield    |        28
     12 | Kevin     | Rampling  |        101 | Product Development | Renuka Patil     |        24
     13 | Tuy Kim   | Duong     |        201 | Development         | Zoe Black        |         3
     14 | Dwipendra | Sing      |        204 | Tech Support        | Sarah Feldman    |        26
     15 | Dylan     | Wijman    |        206 | Documentation       | Kevin Rampling   |        12
     16 | Tamar     | Sasson    |        207 | Marketing           | Nathan Ferguson  |        11
     17 | Ebuka     | Udechukwu |        101 | Product Development | Renuka Patil     |        24
     18 | Richard   | Chan      |        101 | Product Development | Renuka Patil     |        24
     19 | Maria     | del Rio   |        201 | Development         | Zoe Black        |         3
     20 | Hua       | Song      |        204 | Tech Support        | Sarah Feldman    |        26
     21 | Carmen    | Lopez     |        204 | Tech Support        | Sarah Feldman    |        26
     22 | Edgar     | Mejia     |        206 | Documentation       | Kevin Rampling   |        12
     23 | Riad      | Salim     |        201 | Development         | Zoe Black        |         3
     24 | Renuka    | Patil     |        100 | Executive Office    | Stephen Mulligan |         0
     25 | Rina      | Dsouza    |        202 | QA                  | Richard Chan     |        18
     26 | Sarah     | Feldman   |        101 | Product Development | Renuka Patil     |        24
     27 | Max       | Mills     |        102 | Sales Marketing     | Eric Redfield    |        28
     28 | Eric      | Redfield  |        100 | Executive Office    | Stephen Mulligan |         0
     29 | Benjamin  | Glover    |        102 | Sales Marketing     | Eric Redfield    |        28
     30 | Dominic   | King      |        205 | Sales               | Benjamin Glover  |        29
     32 | Ryan      | Metcalfe  |        206 | Documentation       | Kevin Rampling   |        12
     33 | Piers     | Paige     |        201 | Development         | Zoe Black        |         3
     34 | Nicola    | Kelly     |        207 | Marketing           | Nathan Ferguson  |        11
(34 rows)

You can query this data for employee-manager relationships through WITH RECURSIVE. For example, the following query's WITH RECURSIVE clause gets employee-manager relationships for employee Eric Redfield, including all employees who report directly and indirectly to him:

WITH RECURSIVE managers (employeeID, employeeName, sectionID, section, lead, leadID)
 AS (SELECT emp_id, fname||' '||lname, section_id, section_name, section_leader, leader_id
      FROM personnel.employees WHERE fname||' '||lname = 'Eric Redfield'
 UNION
    SELECT emp_id, fname||' '||lname AS employee_name, section_id, section_name, section_leader, leader_id FROM personnel.employees e
      JOIN managers m ON m.employeeID = e.leader_id)
 SELECT employeeID, employeeName, lead AS 'Reports to', section, leadID from managers ORDER BY sectionID, employeeName;

The WITH RECURSIVE clause defines the CTE managers, and then executes in two phases:

  1. The non-recursive term populates managers with data that it queries from personnel.employees.

  2. The recursive term's UNION query iterates over its own output until, on the fourth cycle, it finds no more data. The results of all iterations are then compiled and set in managers, and the WITH clause exits to the primary query.

The primary query returns three levels of data from managers, one for each recursive iteration.

Similarly, the following query iterates over the same data to get all employee-manager relationships for employee Richard Chan, who is one level lower in the company chain of command:

WITH RECURSIVE managers (employeeID, employeeName, sectionID, section, lead, leadID)
 AS (SELECT emp_id, fname||' '||lname, section_id, section_name, section_leader, leader_id
      FROM personnel.employees WHERE fname||' '||lname = 'Richard Chan'
 UNION
    SELECT emp_id, fname||' '||lname AS employee_name, section_id, section_name, section_leader, leader_id FROM personnel.employees e
      JOIN managers m ON m.employeeID = e.leader_id)
 SELECT employeeID, employeeName, lead AS 'Reports to', section, leadID from managers ORDER BY sectionID, employeeName;

The WITH RECURSIVE clause executes as before, except this time it finds no more data after two iterations and exits. Accordingly, the primary query returns two levels of data from managers.
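
To make the level of each employee explicit, a recursive query can carry a counter through each iteration. The following variant of the queries above is a sketch; the depth column is an addition for illustration and does not appear in the original examples:

```sql
WITH RECURSIVE managers (employeeID, employeeName, leadID, depth)
 AS (SELECT emp_id, fname||' '||lname, leader_id, 0     -- starting employee is level 0
      FROM personnel.employees WHERE fname||' '||lname = 'Eric Redfield'
 UNION
    SELECT e.emp_id, e.fname||' '||e.lname, e.leader_id, m.depth + 1   -- one level below the manager
      FROM personnel.employees e
      JOIN managers m ON m.employeeID = e.leader_id)
 SELECT depth, employeeID, employeeName, leadID FROM managers ORDER BY depth, employeeName;
```

Sorting by depth in the primary query groups the rows by their distance from the starting employee.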

WITH RECURSIVE materialization

By default, materialization is disabled. In this case, Vertica rewrites the WITH RECURSIVE query into subqueries, as many as necessary for the required level of recursion.

If recursion is very deep, the high number of query rewrites is liable to incur considerable overhead that adversely affects performance and exhausts system resources. In this case, consider enabling materialization, either with the configuration parameter WithClauseMaterialization, or the hint ENABLE_WITH_CLAUSE_MATERIALIZATION. In either case, intermediate result sets from all recursion levels are written to local temporary tables. When recursion is complete, the intermediate results in all temporary tables are compiled and passed on to the primary query.
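
For example, following the syntax shown earlier for recursive WITH clauses, the hint is placed between WITH and RECURSIVE. This sketch reuses the nums query from above:

```sql
WITH /*+ENABLE_WITH_CLAUSE_MATERIALIZATION*/ RECURSIVE nums (n) AS (
   SELECT 1                 -- non-recursive (base) term
   UNION ALL
   SELECT n+1 FROM nums     -- recursive term
  )
SELECT n FROM nums;         -- primary query
```

The hint applies only to this query; other queries in the session continue to follow the WithClauseMaterialization setting.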