UNION clause
Combines the results of multiple SELECT statements. You can include UNION in FROM, WHERE, and HAVING clauses.
Syntax
select-stmt { UNION [ ALL | DISTINCT ] select-stmt }[...]
[ order-by-clause [ offset-clause ]]
[ limit-clause ]
Parameters
select-stmt
- A
SELECT
statement that returns one or more rows, depending on whether you specify keywords DISTINCT or ALL.The following options also apply:
-
The first SELECT statement can include the LABEL hint. Vertica ignores LABEL hints in subsequent SELECT statements.
-
Each SELECT statement can specify its own ORDER BY, LIMIT, and OFFSET clauses. A SELECT statement with one or more of these clauses must be enclosed by parentheses. See also: ORDER BY, LIMIT, and OFFSET Clauses in UNION.
-
DISTINCT | ALL
- Specifies whether to return unique rows:
-
DISTINCT (default) returns only unique rows.
-
ALL concatenates all rows, including duplicates. For best performance, use UNION ALL.
-
Requirements
-
All rows of the UNION result set must be in the result set of at least one of its SELECT statements.
-
Each SELECT statement must specify the same number of columns.
-
Data types of corresponding SELECT statement columns must be compatible, otherwise Vertica returns an error.
ORDER BY, LIMIT, and OFFSET clauses in UNION
A UNION statement can specify its own ORDER BY, LIMIT, and OFFSET clauses. For example, given the tables described below in Examples, the following query orders the UNION result set by emp_name
and limits output to the first two rows:
=> SELECT id, emp_name FROM company_a UNION ALL SELECT id, emp_name FROM company_b ORDER BY emp_name LIMIT 2;
id | emp_name
------+----------
5678 | Alice
8765 | Bob
(2 rows)
Each SELECT statement in a UNION clause can specify its own ORDER BY, LIMIT, and OFFSET clauses. In this case, the SELECT statement must be enclosed by parentheses. Vertica processes the SELECT statement ORDER BY, LIMIT, and OFFSET clauses before it processes the UNION clauses.
For example, each SELECT statement in the following UNION specifies its own ORDER BY and LIMIT clauses. Vertica processes the individual queries and then concatenates the two result sets:
=> (SELECT id, emp_name FROM company_a ORDER BY emp_name LIMIT 2)
UNION ALL
(SELECT id, emp_name FROM company_b ORDER BY emp_name LIMIT 2);
id | emp_name
------+-----------
5678 | Alice
9012 | Katherine
8765 | Bob
9012 | Katherine
(4 rows)
The following requirements and restrictions determine how Vertica processes a UNION clause that contains ORDER BY, LIMIT, and OFFSET clauses:
-
A UNION's ORDER BY clause must specify columns from the first (leftmost) SELECT statement.
-
Always use an ORDER BY clause with LIMIT and OFFSET. Otherwise, the query returns an undefined subset of the result set.
-
ORDER BY must precede LIMIT and OFFSET.
-
When a SELECT or UNION statement specifies both LIMIT and OFFSET, Vertica first processes the OFFSET statement, and then applies the LIMIT statement to the remaining rows.
UNION in non-correlated subqueries
Vertica supports UNION in noncorrelated subquery predicates. For example:
=> SELECT DISTINCT customer_key, customer_name FROM public.customer_dimension WHERE customer_key IN
(SELECT customer_key FROM store.store_sales_fact WHERE sales_dollar_amount > 500
UNION ALL
SELECT customer_key FROM online_sales.online_sales_fact WHERE sales_dollar_amount > 500)
AND customer_state = 'CT';
customer_key | customer_name
--------------+------------------------
7021 | Luigi T. Dobisz
1971 | Betty V. Dobisz
46284 | Ben C. Gauthier
33885 | Tanya Y. Taylor
5449 | Sarah O. Robinson
29059 | Sally Z. Fortin
11200 | Foodhope
15582 | John J. McNulty
24638 | Alexandra F. Jones
...
Examples
The examples that follow use these two tables:
company_a
ID emp_name dept sales
------+------------+-------------+-------
1234 | Stephen | auto parts | 1000
5678 | Alice | auto parts | 2500
9012 | Katherine | floral | 500
company_b
ID emp_name dept sales
------+------------+-------------+-------
4321 | Marvin | home goods | 250
9012 | Katherine | home goods | 500
8765 | Bob | electronics | 20000
Find all employee IDs and names from company_a and company_b
The UNION statement specifies DISTINCT to combine unique IDs and last names of employees; Katherine works for both companies, so she appears only once in the result set. DISTINCT is the default and can be omitted:
=> SELECT id, emp_name FROM company_a UNION DISTINCT SELECT id, emp_name FROM company_b ORDER BY id;
id | emp_name
------+-----------
1234 | Stephen
4321 | Marvin
5678 | Alice
8765 | Bob
9012 | Katherine
(5 rows)
The next UNION statement specifies the option ALL. Katherine works for both companies, so the query returns two records for her:
=> SELECT id, emp_name FROM company_a UNION ALL SELECT id, emp_name FROM company_b ORDER BY id;
id | emp_name
------+-----------
1234 | Stephen
5678 | Alice
9012 | Katherine
4321 | Marvin
9012 | Katherine
8765 | Bob
(6 rows)
Find the top two top performing salespeople in each company
Each SELECT statement specifies its own ORDER BY and LIMIT clauses, so the UNION statement concatenates the result sets as returned by each query:
=> (SELECT id, emp_name, sales FROM company_a ORDER BY sales DESC LIMIT 2)
UNION ALL
(SELECT id, emp_name, sales FROM company_b ORDER BY sales DESC LIMIT 2);
id | emp_name | sales
------+-----------+-------
8765 | Bob | 20000
5678 | Alice | 2500
1234 | Stephen | 1000
9012 | Katherine | 500
(4 rows)
Find all employee orders by sales
The UNION statement specifies its own ORDER BY clause, which Vertica applies to the entire result:
=> SELECT id, emp_name, sales FROM company_a
UNION
SELECT id, emp_name, sales FROM company_b
ORDER BY sales;
id | emp_name | sales
------+-----------+-------
4321 | Marvin | 250
9012 | Katherine | 500
1234 | Stephen | 1000
5678 | Alice | 2500
8765 | Bob | 20000
(5 rows)
Calculate the sum of sales for each company grouped by department
Each SELECT statement has its own GROUP BY clause. UNION combines the aggregate results from each query:
=> (SELECT 'Company A' as company, dept, SUM(sales) FROM company_a
GROUP BY dept)
UNION
(SELECT 'Company B' as company, dept, SUM(sales) FROM company_b
GROUP BY dept)
ORDER BY 1;
company | dept | sum
-----------+-------------+-------
Company A | auto parts | 3500
Company A | floral | 500
Company B | electronics | 20000
Company B | home goods | 750
(4 rows)