UNION clause

Combines the results of multiple SELECT statements.

Combines the results of multiple SELECT statements. You can include UNION in FROM, WHERE, and HAVING clauses.

Syntax

select { UNION [ ALL | DISTINCT ] select }[...]
    [ order-by-clause  [ offset-clause ]]
    [ limit-clause ]

Arguments

select
A SELECT statement that returns one or more rows, depending on whether you specify keywords DISTINCT or ALL.

The first SELECT statement can include the LABEL hint. Vertica ignores LABEL hints in subsequent SELECT statements.

Each SELECT statement can specify its own ORDER BY, LIMIT, and OFFSET clauses. A SELECT statement with one or more of these clauses must be enclosed by parentheses. See also: ORDER BY, LIMIT, and OFFSET Clauses in UNION.

DISTINCT, ALL
How to return duplicate rows:
  • DISTINCT (default) returns only unique rows.

  • ALL concatenates all rows, including duplicates. For best performance, use UNION ALL.

UNION ALL supports columns of complex types; UNION DISTINCT does not.

Requirements

  • Each row of the UNION result set must be in the result set of at least one of its SELECT statements.

  • Each SELECT statement must specify the same number of columns.

  • Data types of corresponding SELECT statement columns must be compatible, otherwise Vertica returns an error.

ORDER BY, LIMIT, and OFFSET clauses in UNION

A UNION statement can specify its own ORDER BY, LIMIT, and OFFSET clauses, as in the following example:

=> SELECT id, emp_name FROM company_a UNION ALL SELECT id, emp_name FROM company_b ORDER BY emp_name LIMIT 2;
  id  | emp_name
------+----------
 5678 | Alice
 8765 | Bob
(2 rows)

Each SELECT statement in a UNION clause can specify its own ORDER BY, LIMIT, and OFFSET clauses. Vertica processes the SELECT statement clauses before it processes the UNION clauses. In the following example, Vertica processes the individual queries and then concatenates the two result sets:

=> (SELECT id, emp_name FROM company_a ORDER BY emp_name LIMIT 2)
   UNION ALL
   (SELECT id, emp_name FROM company_b ORDER BY emp_name LIMIT 2);
  id  | emp_name
------+-----------
 5678 | Alice
 9012 | Katherine
 8765 | Bob
 9012 | Katherine
(4 rows)

The following requirements and restrictions determine how Vertica processes a UNION clause that contains ORDER BY, LIMIT, and OFFSET clauses:

  • A UNION's ORDER BY clause must specify columns from the first (leftmost) SELECT statement.

  • ORDER BY must precede LIMIT and OFFSET.

  • When a SELECT or UNION statement specifies both LIMIT and OFFSET, Vertica first processes the OFFSET statement, and then applies the LIMIT statement to the remaining rows.

UNION in non-correlated subqueries

Vertica supports UNION in noncorrelated subquery predicates. For example:

=> SELECT DISTINCT customer_key, customer_name FROM public.customer_dimension
   WHERE customer_key IN
     (SELECT customer_key FROM store.store_sales_fact WHERE sales_dollar_amount > 500
      UNION ALL
      SELECT customer_key FROM online_sales.online_sales_fact WHERE sales_dollar_amount > 500)
   AND customer_state = 'CT';
 customer_key |     customer_name
--------------+------------------------
         7021 | Luigi T. Dobisz
         1971 | Betty V. Dobisz
        46284 | Ben C. Gauthier
        33885 | Tanya Y. Taylor
         5449 | Sarah O. Robinson
        29059 | Sally Z. Fortin
        11200 | Foodhope
        15582 | John J. McNulty
        24638 | Alexandra F. Jones
 ...

UNION ALL with complex types

You can use UNION ALL with complex types. Consider a table with the following definition:

=> CREATE TABLE restaurants(
    name VARCHAR, cuisine VARCHAR,
    locations ARRAY[ROW(city VARCHAR(50), state VARCHAR(2)),50],
    menu ARRAY[ROW(item VARCHAR(50), price FLOAT),100] );

Suppose you are in a new city looking for a place to eat. The database has information about the following restaurants:

=> SELECT name, cuisine FROM restaurants
   WHERE CONTAINS(locations,ROW('Pittsburgh', 'PA'));
       name        | cuisine
-------------------+----------
 Bakersfield Tacos | Mexican
 Bob's pizzeria    | Italian
 Succulent Steaks  | American
 Sushi House       | Asian
 Villa Milano      | Italian
(5 rows)

Suppose you are hungry for Italian food. If you cannot have Italian, you want something inexpensive. The following query uses two SELECT clauses from the same table, one finding menu items for Italian restaurants and one finding menu items under $10:

=> WITH menu_entries AS
    (SELECT name, cuisine,
            EXPLODE(menu USING PARAMETERS skip_partitioning=true) AS (idx, menu_entry)
     FROM restaurants WHERE CONTAINS(locations,ROW('Pittsburgh', 'PA')))
   SELECT name, cuisine, menu_entry FROM menu_entries WHERE cuisine = 'Italian'
   UNION ALL
   SELECT name, cuisine, menu_entry FROM menu_entries WHERE menu_entry.price <= 10;
       name        | cuisine |                 menu_entry
-------------------+---------+--------------------------------------------
 Bob's pizzeria    | Italian | {"item":"cheese pizza","price":8.25}
 Bob's pizzeria    | Italian | {"item":"spinach pizza","price":10.5}
 Villa Milano      | Italian | {"item":"pasta carbonara","price":24.99}
 Villa Milano      | Italian | {"item":"eggplant parmesan","price":23.49}
 Villa Milano      | Italian | {"item":"herbed salmon","price":28.99}
 Bakersfield Tacos | Mexican | {"item":"veggie taco","price":9.95}
 Bob's pizzeria    | Italian | {"item":"cheese pizza","price":8.25}
(7 rows)

You cannot use LIMIT OVER with UNION ALL if the selected columns are of complex types. In this case, the statement returns an error like "Multi-value expressions are not supported in this context". You can still use LIMIT OVER in a single SELECT statement by using parentheses to make the scoping explicit.

Examples

The examples that follow use these two tables:

=> SELECT * FROM company_a;
  ID    emp_name       dept       sales
------+------------+-------------+-------
1234  | Stephen    | auto parts  | 1000
5678  | Alice      | auto parts  | 2500
9012  | Katherine  | floral      |  500

=> SELECT * FROM company_b;
  ID    emp_name       dept       sales
------+------------+-------------+-------
4321  | Marvin     | home goods  |   250
9012  | Katherine  | home goods  |   500
8765  | Bob        | electronics | 20000

The following query finds all employee IDs and names from the two tables. The UNION statement uses DISTINCT to combine unique IDs and last names of employees. Katherine works for both companies, so she appears only once in the result set. DISTINCT is the default and can be omitted:

=> SELECT id, emp_name FROM company_a
   UNION DISTINCT SELECT id, emp_name FROM company_b ORDER BY id;
  id  | emp_name
------+-----------
 1234 | Stephen
 4321 | Marvin
 5678 | Alice
 8765 | Bob
 9012 | Katherine
(5 rows)

If the UNION statement instead uses ALL, the query returns two records for Katherine:

=> SELECT id, emp_name FROM company_a
   UNION ALL SELECT id, emp_name FROM company_b ORDER BY id;
  id  | emp_name
------+-----------
 1234 | Stephen
 5678 | Alice
 9012 | Katherine
 4321 | Marvin
 9012 | Katherine
 8765 | Bob
(6 rows)

The following query returns the top two salespeople in each company. Each SELECT statement specifies its own ORDER BY and LIMIT clauses, and the UNION statement concatenates the result sets as returned by each query:

=> (SELECT id, emp_name, sales FROM company_a ORDER BY sales DESC LIMIT 2)
   UNION ALL
   (SELECT id, emp_name, sales FROM company_b ORDER BY sales DESC LIMIT 2);
  id  |  emp_name | sales
------+-----------+-------
 8765 | Bob       | 20000
 5678 | Alice     |  2500
 1234 | Stephen   |  1000
 9012 | Katherine |   500
(4 rows)

The following query returns all employees in both companies with an overall ordering. The ORDER BY clause is part of the UNION statement:

=> SELECT id, emp_name, sales FROM company_a
   UNION
   SELECT id, emp_name, sales FROM company_b
   ORDER BY sales;
  id  |  emp_name | sales
------+-----------+-------
 4321 | Marvin    |   250
 9012 | Katherine |   500
 1234 | Stephen   |  1000
 5678 | Alice     |  2500
 8765 | Bob       | 20000
(5 rows)

The following query groups total sales by department within each company. Each SELECT statement has its own GROUP BY clause. UNION combines the aggregate results from each query:

=> (SELECT 'Company A' as company, dept, SUM(sales) FROM company_a
    GROUP BY dept)
    UNION
   (SELECT 'Company B' as company, dept, SUM(sales) FROM company_b
    GROUP BY dept)
    ORDER BY 1;
 company   |    dept     |  sum
-----------+-------------+-------
 Company A | auto parts  |  3500
 Company A | floral      |   500
 Company B | electronics | 20000
 Company B | home goods  |   750
(4 rows)

See also