This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Subqueries used in search conditions

Subqueries are used as search conditions in order to filter results.

Subqueries are used as search conditions in order to filter results. They specify the conditions for the rows returned from the containing query's select-list, a query expression, or the subquery itself. The operation evaluates to TRUE, FALSE, or UNKNOWN (NULL).

Syntax

search-condition {
    [ { AND | OR | NOT } {  predicate | ( search-condition ) } ]
   }[,... ]
 predicate
     { expression comparison-operator expression
         | string-expression [ NOT ] { LIKE | ILIKE | LIKEB | ILIKEB } string-expression
         | expression IS [ NOT ] NULL
         | expression [ NOT ] IN ( subquery | expression[,... ] )
         | expression comparison-operator [ ANY | SOME ] ( subquery )
         | expression comparison-operator ALL ( subquery )
         | expression OR ( subquery )
         | [ NOT ] EXISTS ( subquery )
         | [ NOT ] IN ( subquery )
     }

Parameters

search-condition

Specifies the search conditions for the rows returned from one of the following:

  • Containing query's select-list

  • Query expression

  • Subquery

If the subquery is used with an UPDATE or DELETE statement, UPDATE specifies the rows to update and DELETE specifies the rows to delete.

{ AND | OR | NOT }

Keywords that specify the logical operators that combine conditions, or in the case of NOT, negate conditions.

  • AND: Combines two conditions and evaluates to TRUE when both of the conditions are TRUE.

  • OR: Combines two conditions and evaluates to TRUE when either condition is TRUE.

  • NOT: Negates the Boolean expression specified by the predicate.

predicate An expression that returns TRUE, FALSE, or UNKNOWN (NULL).
expression A column name, constant, function, or scalar subquery, or combination of column names, constants, and functions connected by operators or subqueries.
comparison-operator

An operator that tests conditions between two expressions, one of the following:

This operator... Tests whether...
< The first expression is less than the second.
> The first expression is greater than the second.
<= The first expression is less than or equal to the second expression.
>= The first expression is greater than or equal to the second expression.
= Two expressions are equal.
<=> Like the = operator, but returns TRUE (instead of UNKNOWN) if both expressions evaluate to UNKNOWN, and FALSE (instead of UNKNOWN) if one expression evaluates to UNKNOWN.

<>

Two expressions are not equal.

!=

string-expression A character string with optional wildcard (*) characters.
[ NOT ] { LIKE | ILIKE | LIKEB | ILIKEB } Indicates that the character string following the predicate is to be used (or not used) for pattern matching.
IS [ NOT ] NULL Searches for values that are null or are not null.
ALL Used with a comparison operator and a subquery. Returns TRUE for the left-hand predicate if all values returned by the subquery satisfy the comparison operation, or FALSE if not all values satisfy the comparison or if the subquery returns no rows to the outer query block.
ANY | SOME ANY and SOME are synonyms and are used with a comparison operator and a subquery. Either returns TRUE for the left-hand predicate if any value returned by the subquery satisfies the comparison operation, or FALSE if no values in the subquery satisfy the comparison or if the subquery returns no rows to the outer query block. Otherwise, the expression is UNKNOWN.
[ NOT ] EXISTS Used with a subquery to test for the existence of records that the subquery returns.
[ NOT ] IN Searches for an expression on the basis of an expression's exclusion or inclusion from a list. The list of values is enclosed in parentheses and can be a subquery or a set of constants.

1 - Logical operators AND and OR

The AND and OR logical operators combine two conditions.

The AND and OR logical operators combine two conditions. AND evaluates to TRUE when both of the conditions joined by the AND keyword are matched, and OR evaluates to TRUE when either condition joined by OR is matched.

OR subqueries (complex expressions)

Vertica supports subqueries in more complex expressions using OR; for example:

  • More than one subquery in the conjunct expression:

    (SELECT MAX(b) FROM t1) + SELECT (MAX FROM t2) a IN (SELECT a FROM t1) OR b IN (SELECT x FROM t2)
    
  • An OR clause in the conjunct expression involves at least one subquery:

    a IN (SELECT a FROM t1) OR b IN (SELECT x FROM t2) a IN (SELECT a from t1) OR b = 5
    a = (SELECT MAX FROM t2) OR b = 5
    
  • One subquery is present but it is part of a another expression:

    x IN (SELECT a FROM t1) = (x = (SELECT MAX FROM t2) (x IN (SELECT a FROM t1) IS NULL
    

How AND queries are evaluated

Vertica treats expressions separated by AND (conjunctive) operators individually. For example if the WHERE clause were:

  WHERE (a IN (SELECT a FROM t1) OR b IN (SELECT x FROM t2)) AND (c IN (SELECT a FROM t1))

the query would be interpreted as two conjunct expressions:

  1. (a IN (SELECT a FROM t1) OR b IN (SELECT x FROM t2))

  2. (c IN (SELECT a FROM t1))

The first expression is considered a complex subquery, whereas the second expression is not.

Examples

The following list shows some of the ways you can filter complex conditions in the WHERE clause:

  • OR expression between a subquery and a non-subquery condition:

    => SELECT x FROM t WHERE x > (SELECT SUM(DISTINCT x) FROM t GROUP BY y) OR x < 9;
    
  • OR expression between two subqueries:

    => SELECT * FROM t WHERE x=(SELECT x FROM t) OR EXISTS(SELECT x FROM tt);
    
  • Subquery expression:

    => SELECT * FROM t WHERE x=(SELECT x FROM t)+1 OR x<>(SELECT x FROM t)+1;
    
  • OR expression with [NOT] IN subqueries:

    => SELECT * FROM t WHERE NOT (EXISTS (SELECT x FROM t)) OR x >9;
    
  • OR expression with IS [NOT] NULL subqueries:

    => SELECT * FROM t WHERE (SELECT * FROM t)IS NULL OR (SELECT * FROM tt)IS NULL;
    
  • OR expression with boolean column and subquery that returns Boolean data type:

    => SELECT * FROM t2 WHERE x = (SELECT x FROM t2) OR x;
    
  • OR expression in the CASE statement:

    => SELECT * FROM t WHERE CASE WHEN x=1 THEN x > (SELECT * FROM t)
           OR x < (SELECT * FROM t2) END ;
    
  • Analytic function, NULL-handling function, string function, math function, and so on:

    => SELECT x FROM t WHERE x > (SELECT COALESCE (x,y) FROM t GROUP BY x,y) OR
           x < 9;
    
  • In user-defined functions (assuming f() is one):

    => SELECT * FROM t WHERE x > 5 OR x = (SELECT f(x) FROM t);
    
  • Use of parentheses at different places to restructure the queries:

    => SELECT x FROM t WHERE (x = (SELECT x FROM t) AND y = (SELECT y FROM t))
           OR (SELECT x FROM t) =1;
    
  • Multicolumn subqueries:

    => SELECT * FROM t WHERE (x,y) = (SELECT x,y FROM t) OR x > 5;
    
  • Constant/NULL on lefthand side of subquery:

    => SELECT * FROM t WHERE x > 5 OR 5 = (SELECT x FROM t);
    

See also

2 - In place of an expression

Subqueries that return a single value (unlike a list of values returned by IN subqueries) can be used just about anywhere an expression is allowed in SQL.

Subqueries that return a single value (unlike a list of values returned by IN subqueries) can be used just about anywhere an expression is allowed in SQL. It can be a column name, a constant, a function, a scalar subquery, or a combination of column names, constants, and functions connected by operators or subqueries.

For example:

=> SELECT c1 FROM t1 WHERE c1 = ANY (SELECT c1 FROM t2) ORDER BY c1;
=> SELECT c1 FROM t1 WHERE COALESCE((t1.c1 > ANY (SELECT c1 FROM t2)), TRUE);
=> SELECT c1 FROM t1 GROUP BY c1 HAVING
     COALESCE((t1.c1 <> ALL (SELECT c1 FROM t2)), TRUE);

Multi-column expressions are also supported:

=> SELECT c1 FROM t1 WHERE (t1.c1, t1.c2) = ALL (SELECT c1, c2 FROM t2);
=> SELECT c1 FROM t1 WHERE (t1.c1, t1.c2) <> ANY (SELECT c1, c2 FROM t2);

Vertica returns an error on queries where more than one row would be returned by any subquery used as an expression:

=> SELECT c1 FROM t1 WHERE c1 = (SELECT c1 FROM t2) ORDER BY c1;
   ERROR:  more than one row returned by a subquery used as an expression

See also

3 - Comparison operators

Vertica supports Boolean subquery expressions in the WHERE clause with any of the following operators:.

Vertica supports Boolean subquery expressions in the WHERE clause with any of the following operators:

> < >= <= = <> <=>

WHERE clause subqueries filter results and take the following form:

SELECT <column, ...> FROM <table>
WHERE <condition> (SELECT <column, ...> FROM <table> WHERE <condition>);

These conditions are available for all data types where comparison makes sense. All Comparison operators are binary operators that return values of TRUE, FALSE, or UNKNOWN (NULL).

Expressions that correlate to just one outer table in the outer query block are supported, and these correlated expressions can be comparison operators.

The following subquery scenarios are supported:

SELECT * FROM T1 WHERE T1.x =  (SELECT MAX(c1) FROM T2);
SELECT * FROM T1 WHERE T1.x >= (SELECT MAX(c1) FROM T2 WHERE T1.y = T2.c2);
SELECT * FROM T1 WHERE T1.x <= (SELECT MAX(c1) FROM T2 WHERE T1.y = T2.c2);

See also

Subquery restrictions

4 - LIKE pattern matching

Vertica supports LIKE pattern-matching conditions in subqueries and take the following form:.

Vertica supports LIKE pattern-matching conditions in subqueries and take the following form:

string-expression [ NOT ] { LIKE | ILIKE | LIKEB | ILIKEB } string-expression

The following command searches for customers whose company name starts with "Ev" and returns the total count:

=> SELECT COUNT(*) FROM customer_dimension WHERE customer_name LIKE
      (SELECT 'Ev%' FROM customer_dimension LIMIT 1);
 count
-------
   153
(1 row)

Vertica also supports single-row subqueries as the pattern argument for LIKEB and ILIKEB predicates; for example:

=> SELECT * FROM t1 WHERE t1.x LIKEB (SELECT t2.x FROM t2);

The following symbols are substitutes for the LIKE keywords:

~~    LIKE
~#    LIKEB
~~*   ILIKE
~#*   ILIKEB
!~~   NOT LIKE
!~#   NOT LIKEB
!~~*  NOT ILIKE
!~#*  NOT IILIKEB

See LIKE predicate for additional examples.

5 - ANY and ALL

You typically use comparison operators (=, >, < , etc.) only on subqueries that return one row.

You typically use comparison operators (=, >, < , etc.) only on subqueries that return one row. With ANY and ALL operators, you can make comparisons on subqueries that return multiple rows.

These subqueries take the following form:

expression comparison-operator { ANY | ALL } (subquery)

ANY and ALL evaluate whether any or all of the values returned by a subquery match the left-hand expression.

Equivalent operators

You can use following operators instead of ANY or ALL:

This operator... Is equivalent to:
SOME ANY
IN = ANY
NOT IN <> ALL

Example data

Examples below use the following tables and data:

CREATE TABLE t1 (c1 int, c2 VARCHAR(8)); CREATE TABLE t2 (c1 int, c2 VARCHAR(8));
=> SELECT * FROM t1 ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
  2 | fed
  2 | def
  3 | ihg
  3 | ghi
  4 | jkl
  5 | mno
(8 rows)
=> SELECT * FROM t2 ORDER BY c1;
 c1 | c2
----+-----
  1 | abc
  2 | fed
  3 | jkl
  3 | stu
  3 | zzz
(5 rows)

ANY subqueries

Subqueries that use the ANY keyword return true when any value retrieved in the subquery matches the value of the left-hand expression.

Examples

An ANY subquery within an expression:

=> SELECT c1, c2 FROM t1 WHERE COALESCE((t1.c1 > ANY (SELECT c1 FROM t2)));
 c1 | c2
----+-----
  2 | fed
  2 | def
  3 | ihg
  3 | ghi
  4 | jkl
  5 | mno
(6 rows)

ANY noncorrelated subqueries without aggregates:

=> SELECT c1 FROM t1 WHERE c1 = ANY (SELECT c1 FROM t2) ORDER BY c1;
 c1
----
  1
  1
  2
  2
  3
  3
(6 rows)

ANY noncorrelated subqueries with aggregates:


=> SELECT c1, c2 FROM t1 WHERE c1 <> ANY (SELECT MAX(c1) FROM t2) ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
  2 | fed
  2 | def
  4 | jkl
  5 | mno
(6 rows)

=> SELECT c1 FROM t1 GROUP BY c1 HAVING c1 <> ANY (SELECT MAX(c1) FROM t2) ORDER BY c1;
 c1
----
  1
  2
  4
  5
(4 rows)

ANY noncorrelated subqueries with aggregates and a GROUP BY clause:


=> SELECT c1, c2 FROM t1 WHERE c1 <> ANY (SELECT MAX(c1) FROM t2 GROUP BY c2) ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
  2 | fed
  2 | def
  3 | ihg
  3 | ghi
  4 | jkl
  5 | mno
(8 rows)

ANY noncorrelated subqueries with a GROUP BY clause:

=> SELECT c1, c2 FROM t1 WHERE c1 <=> ANY (SELECT c1 FROM t2 GROUP BY c1) ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
  2 | fed
  2 | def
  3 | ihg
  3 | ghi
(6 rows)

ANY correlated subqueries with no aggregates or GROUP BY clause:

=> SELECT c1, c2 FROM t1 WHERE c1 >= ANY (SELECT c1 FROM t2 WHERE t2.c2 = t1.c2) ORDER BY c1;
 c1 | c2
----+-----
  1 | abc
  2 | fed
  4 | jkl
(3 rows)

ALL subqueries

A subquery that uses the ALL keyword returns true when all values retrieved by the subquery match the left-hand expression, otherwise it returns false.

Examples

ALL noncorrelated subqueries without aggregates:

=> SELECT c1, c2 FROM t1 WHERE c1 >= ALL (SELECT c1 FROM t2) ORDER BY c1;
 c1 | c2
----+-----
  3 | ihg
  3 | ghi
  4 | jkl
  5 | mno
(4 rows)

ALL noncorrelated subqueries with aggregates:

=> SELECT c1, c2 FROM t1 WHERE c1 = ALL (SELECT MAX(c1) FROM t2) ORDER BY c1;
 c1 | c2
----+-----
  3 | ihg
  3 | ghi
(2 rows)

=> SELECT c1 FROM t1 GROUP BY c1 HAVING c1 <> ALL (SELECT MAX(c1) FROM t2) ORDER BY c1;
 c1
----
  1
  2
  4
  5
(4 rows)

ALL noncorrelated subqueries with aggregates and a GROUP BY clause:


=> SELECT c1, c2 FROM t1 WHERE c1 <= ALL (SELECT MAX(c1) FROM t2 GROUP BY c2) ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
(2 rows)

ALL noncorrelated subqueries with a GROUP BY clause:

=> SELECT c1, c2 FROM t1 WHERE c1 <> ALL (SELECT c1 FROM t2 GROUP BY c1) ORDER BY c1;
 c1 | c2
----+-----
  4 | jkl
  5 | mno
(2 rows)

NULL handling

Vertica supports multicolumn <> ALL subqueries where the columns are not marked NOT NULL. If any column contains a NULL value, Vertica returns a run-time error.

Vertica does not support = ANY subqueries that are nested within another expression if any column values are NULL.

See also

Subquery restrictions

6 - EXISTS and NOT EXISTS

The EXISTS predicate is one of the most common predicates used to build conditions that use noncorrelated and correlated subqueries.

The EXISTS predicate is one of the most common predicates used to build conditions that use noncorrelated and correlated subqueries. Use EXISTS to identify the existence of a relationship without regard for the quantity. For example, EXISTS returns true if the subquery returns any rows, and [NOT] EXISTS returns true if the subquery returns no rows.

[NOT] EXISTS subqueries take the following form:

expression [ NOT ] EXISTS ( subquery )

The EXISTS condition is considered to be met if the subquery returns at least one row. Since the result depends only on whether any records are returned, and not on the contents of those records, the output list of the subquery is normally uninteresting. A common coding convention is to write all EXISTS tests as follows:

EXISTS (SELECT 1 WHERE ...)

In the above fragment, SELECT 1 returns the value 1 for every record in the query. If the query returns, for example, five records, it returns 5 ones. The system doesn't care about the real values in those records; it just wants to know if a row is returned.

Alternatively, a subquery’s select list that uses EXISTS might consist of the asterisk (*). You do not need to specify column names, because the query tests for the existence or nonexistence of records that meet the conditions specified in the subquery.

EXISTS (SELECT * WHERE ...)

Notes

  • If EXISTS (subquery) returns at least 1 row, the result is TRUE.

  • If EXISTS (subquery) returns no rows, the result is FALSE.

  • If NOT EXISTS (subquery) returns at least 1 row, the result is FALSE.

  • If NOT EXISTS (subquery) returns no rows, the result is TRUE.

Examples

The following query retrieves the list of all the customers who purchased anything from any of the stores amounting to more than 550 dollars:

=> SELECT customer_key, customer_name, customer_state
   FROM public.customer_dimension WHERE EXISTS
     (SELECT 1 FROM store.store_sales_fact
      WHERE customer_key = public.customer_dimension.customer_key
      AND sales_dollar_amount > 550)
   AND customer_state = 'MA' ORDER BY customer_key;
 customer_key |   customer_name    | customer_state
--------------+--------------------+----------------
        14818 | William X. Nielson | MA
        18705 | James J. Goldberg  | MA
        30231 | Sarah N. McCabe    | MA
        48353 | Mark L. Brown      | MA
(4 rows)

Whether you use EXISTS or IN subqueries depends on which predicates you select in outer and inner query blocks. For example, to get a list of all the orders placed by all stores on January 2, 2003 for vendors with records in the vendor table:

=> SELECT store_key, order_number, date_ordered
   FROM store.store_orders_fact WHERE EXISTS
     (SELECT 1 FROM public.vendor_dimension
      WHERE public.vendor_dimension.vendor_key = store.store_orders_fact.vendor_key)
   AND date_ordered = '2012-01-02';
 store_key | order_number | date_ordered
-----------+--------------+--------------
        37 |         2559 | 2012-01-02
        16 |          552 | 2012-01-02
        35 |         1156 | 2012-01-02
        13 |         3885 | 2012-01-02
        25 |          554 | 2012-01-02
        21 |         2687 | 2012-01-02
        49 |         3251 | 2012-01-02
        19 |         2922 | 2012-01-02
        26 |         1329 | 2012-01-02
        40 |         1183 | 2012-01-02
(10 rows)

The above query looks for existence of the vendor and date ordered. To return a particular value, rather than simple existence, the query looks for orders placed by the vendor who got the best deal on January 4, 2004:

=> SELECT store_key, order_number, date_ordered
   FROM store.store_orders_fact ord, public.vendor_dimension vd
   WHERE ord.vendor_key = vd.vendor_key AND vd.deal_size IN
      (SELECT MAX(deal_size) FROM public.vendor_dimension)
    AND date_ordered = '2013-01-04';
 store_key | order_number | date_ordered
-----------+--------------+--------------
       166 |        36008 | 2013-01-04
       113 |        66017 | 2013-01-04
       198 |        75716 | 2013-01-04
        27 |       150241 | 2013-01-04
       148 |       182207 | 2013-01-04
         9 |       188567 | 2013-01-04
        45 |       202416 | 2013-01-04
        24 |       250295 | 2013-01-04
       121 |       251417 | 2013-01-04
(9 rows)

See also

7 - IN and NOT IN

While you cannot equate a single value to a set of values, you can check to see if a single value is found within that set of values.

While you cannot equate a single value to a set of values, you can check to see if a single value is found within that set of values. Use the IN clause for multiple-record, single-column subqueries. After the subquery returns results introduced by IN or NOT IN, the outer query uses them to return the final result.

[NOT] IN subqueries take the following form:

{ expression [ NOT ] IN ( subquery )| expression [ NOT ] IN ( expression ) }

There is no limit to the number of parameters passed to the IN clause of the SELECT statement; for example:

=> SELECT * FROM tablename WHERE column IN (a, b, c, d, e, ...);

Vertica also supports queries where two or more outer expressions refer to different inner expressions:

=> SELECT * FROM A WHERE (A.x,A.x) IN (SELECT B.x, B.y FROM B);

Examples

The following query uses the VMart schema to illustrate the use of outer expressions referring to different inner expressions:

=> SELECT product_description, product_price FROM product_dimension
   WHERE (product_dimension.product_key, product_dimension.product_key) IN
      (SELECT store.store_orders_fact.order_number,
         store.store_orders_fact.quantity_ordered
       FROM store.store_orders_fact);
 product_description         | product_price
-----------------------------+---------------
 Brand #72 box of candy      |           326
 Brand #71 vanilla ice cream |           270
(2 rows)

To find all products supplied by stores in MA, first create the inner query and run it to ensure that it works as desired. The following query returns all stores located in MA:

=> SELECT store_key FROM store.store_dimension WHERE store_state = 'MA';
 store_key
-----------
        13
        31
(2 rows)

Then create the outer or main query that specifies all distinct products that were sold in stores located in MA. This statement combines the inner and outer queries using the IN predicate:

=> SELECT DISTINCT s.product_key, p.product_description
   FROM store.store_sales_fact s, public.product_dimension p
   WHERE s.product_key = p.product_key
       AND s.product_version = p.product_version
       AND s.store_key IN
         (SELECT store_key
          FROM store.store_dimension
          WHERE store_state = 'MA')
   ORDER BY s.product_key;
 product_key |          product_description
-------------+---------------------------------------
           1 | Brand #1 white bread
           1 | Brand #4 vegetable soup
           3 | Brand #9 wheelchair
           5 | Brand #15 cheddar cheese
           5 | Brand #19 bleach
           7 | Brand #22 canned green beans
           7 | Brand #23 canned tomatoes
           8 | Brand #24 champagne
           8 | Brand #25 chicken nuggets
          11 | Brand #32 sausage
         ...   ...
(281 rows)

When using NOT IN, the subquery returns a list of zero or more values in the outer query where the comparison column does not match any of the values returned from the subquery. Using the previous example, NOT IN returns all the products that are not supplied from MA.

Notes

Vertica supports multicolumn NOT IN subqueries in which the columns are not marked NOT NULL. If one of the columns is found to contain a NULL value during query execution, Vertica returns a run-time error.

Similarly, IN subqueries nested within another expression are not supported if any of the column values are NULL. For example, if in the following statement column x from either table contained a NULL value, Vertica returns a run-time error:

=> SELECT * FROM t1 WHERE (x IN (SELECT x FROM t2)) IS FALSE;
   ERROR: NULL value found in a column used by a subquery

See also