EXISTS and NOT EXISTS
The EXISTS
predicate is one of the most common predicates used to build conditions that use noncorrelated and correlated subqueries. Use EXISTS
to identify the existence of a relationship without regard for the quantity. For example, EXISTS
returns true if the subquery returns any rows, and [NOT] EXISTS
returns true if the subquery returns no rows.
[NOT] EXISTS
subqueries take the following form:
expression [ NOT ] EXISTS ( subquery )
The EXISTS
condition is considered to be met if the subquery returns at least one row. Since the result depends only on whether any records are returned, and not on the contents of those records, the output list of the subquery is normally uninteresting. A common coding convention is to write all EXISTS
tests as follows:
EXISTS (SELECT 1 WHERE ...)
In the above fragment, SELECT 1
returns the value 1 for every record in the query. If the query returns, for example, five records, it returns 5 ones. The system doesn't care about the real values in those records; it just wants to know if a row is returned.
Alternatively, a subquery’s select list that uses EXISTS
might consist of the asterisk (*). You do not need to specify column names, because the query tests for the existence or nonexistence of records that meet the conditions specified in the subquery.
EXISTS (SELECT * WHERE ...)
Notes
-
If
EXISTS (
subquery
)
returns at least 1 row, the result is TRUE. -
If
EXISTS (
subquery
)
returns no rows, the result is FALSE. -
If
NOT EXISTS (subquery)
returns at least 1 row, the result is FALSE. -
If
NOT EXISTS (subquery)
returns no rows, the result is TRUE.
Examples
The following query retrieves the list of all the customers who purchased anything from any of the stores amounting to more than 550 dollars:
=> SELECT customer_key, customer_name, customer_state
FROM public.customer_dimension WHERE EXISTS
(SELECT 1 FROM store.store_sales_fact
WHERE customer_key = public.customer_dimension.customer_key
AND sales_dollar_amount > 550)
AND customer_state = 'MA' ORDER BY customer_key;
customer_key | customer_name | customer_state
--------------+--------------------+----------------
14818 | William X. Nielson | MA
18705 | James J. Goldberg | MA
30231 | Sarah N. McCabe | MA
48353 | Mark L. Brown | MA
(4 rows)
Whether you use EXISTS or IN subqueries depends on which predicates you select in outer and inner query blocks. For example, to get a list of all the orders placed by all stores on January 2, 2003 for vendors with records in the vendor table:
=> SELECT store_key, order_number, date_ordered
FROM store.store_orders_fact WHERE EXISTS
(SELECT 1 FROM public.vendor_dimension
WHERE public.vendor_dimension.vendor_key = store.store_orders_fact.vendor_key)
AND date_ordered = '2012-01-02';
store_key | order_number | date_ordered
-----------+--------------+--------------
37 | 2559 | 2012-01-02
16 | 552 | 2012-01-02
35 | 1156 | 2012-01-02
13 | 3885 | 2012-01-02
25 | 554 | 2012-01-02
21 | 2687 | 2012-01-02
49 | 3251 | 2012-01-02
19 | 2922 | 2012-01-02
26 | 1329 | 2012-01-02
40 | 1183 | 2012-01-02
(10 rows)
The above query looks for existence of the vendor and date ordered. To return a particular value, rather than simple existence, the query looks for orders placed by the vendor who got the best deal on January 4, 2004:
=> SELECT store_key, order_number, date_ordered
FROM store.store_orders_fact ord, public.vendor_dimension vd
WHERE ord.vendor_key = vd.vendor_key AND vd.deal_size IN
(SELECT MAX(deal_size) FROM public.vendor_dimension)
AND date_ordered = '2013-01-04';
store_key | order_number | date_ordered
-----------+--------------+--------------
166 | 36008 | 2013-01-04
113 | 66017 | 2013-01-04
198 | 75716 | 2013-01-04
27 | 150241 | 2013-01-04
148 | 182207 | 2013-01-04
9 | 188567 | 2013-01-04
45 | 202416 | 2013-01-04
24 | 250295 | 2013-01-04
121 | 251417 | 2013-01-04
(9 rows)