IN
Checks whether a single value is found (or not found) within a set of values.
Syntax
(column-list) [ NOT ] IN ( values-list )
Parameters
column-list |
One or more comma-delimited columns in the queried tables. |
values-list |
Comma-delimited list of constant values to find in the You can specify multiple sets of values as follows:
|
Null handling
Vertica supports multicolumn NOT IN subqueries where the columns are not marked
NOT NULL
. If one of the columns is found to contain a NULL value during query execution, Vertica returns a run-time error.
Similarly, IN subqueries nested within another expression are not supported if any column values are NULL. For example, if in the following statement column x
from either table contains a NULL value, Vertica returns a run-time error:
=> SELECT * FROM t1 WHERE (x IN (SELECT x FROM t2)) IS FALSE;
ERROR: NULL value found in a column used by a subquery
EXISTS versus IN
Whether you use EXISTS or IN subqueries depends on which predicates you select in outer and inner query blocks. For example, the following query gets a list of all the orders placed by all stores on January 2, 2007 for vendors with records in the vendor table:
=> SELECT store_key, order_number, date_ordered
FROM store.store_orders_fact WHERE EXISTS
(SELECT 1 FROM public.vendor_dimension vd JOIN store.store_orders_fact ord ON vd.vendor_key = ord.vendor_key)
AND date_ordered = '2007-01-02';
store_key | order_number | date_ordered
-----------+--------------+--------------
114 | 271071 | 2007-01-02
19 | 290888 | 2007-01-02
132 | 58942 | 2007-01-02
232 | 9286 | 2007-01-02
126 | 224474 | 2007-01-02
196 | 63482 | 2007-01-02
...
196 | 83327 | 2007-01-02
138 | 278373 | 2007-01-02
179 | 293586 | 2007-01-02
155 | 213413 | 2007-01-02
(506 rows)
The above query looks for existence of the vendor and date ordered. To return a particular value, rather than simple existence, the query looks for orders placed by the vendor who got the best deal on January 2, 2007:
=> SELECT store_key, order_number, date_ordered, vendor_name
FROM store.store_orders_fact ord JOIN public.vendor_dimension vd ON ord.vendor_key = vd.vendor_key
WHERE vd.deal_size IN (SELECT MAX(deal_size) FROM public.vendor_dimension) AND date_ordered = '2007-01-02';
store_key | order_number | date_ordered | vendor_name
-----------+--------------+--------------+----------------------
50 | 99234 | 2007-01-02 | Everything Wholesale
81 | 200802 | 2007-01-02 | Everything Wholesale
115 | 13793 | 2007-01-02 | Everything Wholesale
204 | 41842 | 2007-01-02 | Everything Wholesale
133 | 169025 | 2007-01-02 | Everything Wholesale
163 | 208580 | 2007-01-02 | Everything Wholesale
29 | 154972 | 2007-01-02 | Everything Wholesale
145 | 236790 | 2007-01-02 | Everything Wholesale
249 | 54838 | 2007-01-02 | Everything Wholesale
7 | 161536 | 2007-01-02 | Everything Wholesale
(10 rows)
Examples
The following SELECT statement queries all data in table t11
.
=> SELECT * FROM t11 ORDER BY pk;
pk | col1 | col2 | SKIP_ME_FLAG
----+------+------+--------------
1 | 2 | 3 | t
2 | 3 | 4 | t
3 | 4 | 5 | f
4 | 5 | 6 | f
5 | 6 | 7 | t
6 | | 8 | f
7 | 8 | | t
(7 rows)
The following query specifies an IN
predicate, to find all rows in t11
where columns col1
and col2
contain values of (2,3)
or (6,7)
:
=> SELECT * FROM t11 WHERE (col1, col2) IN ((2,3), (6,7)) ORDER BY pk;
pk | col1 | col2 | SKIP_ME_FLAG
----+------+------+--------------
1 | 2 | 3 | t
5 | 6 | 7 | t
(2 rows)
The following query uses the VMart schema to illustrate the use of outer expressions referring to different inner expressions:
=> SELECT product_description, product_price FROM product_dimension
WHERE (product_dimension.product_key, product_dimension.product_key) IN
(SELECT store.store_orders_fact.order_number,
store.store_orders_fact.quantity_ordered
FROM store.store_orders_fact);
product_description | product_price
-----------------------------+---------------
Brand #73 wheechair | 454
Brand #72 box of candy | 326
Brand #71 vanilla ice cream | 270
(3 rows)