Half-join subquery semantics
The Vertica optimizer uses several keywords in directed queries to recreate half-join subqueries with certain search operators, such as ANY or NOT IN.
SEMI JOIN
Recreates a query that contains a subquery preceded by an
IN
,
EXISTS
, or
ANY
operator and executes a semi-join.
Input query
SELECT product_description FROM product_dimension
WHERE product_dimension.product_key IN (SELECT qty_in_stock from inventory_fact);
Query plan
QUERY PLAN DESCRIPTION:
------------------------------
explain SELECT product_description FROM product_dimension WHERE product_dimension.product_key IN (SELECT qty_in_stock from inventory_fact);
Access Path:
+-JOIN HASH [Semi] [Cost: 1K, Rows: 30K] (PATH ID: 1) Outer (FILTER) Inner (RESEGMENT)
| Join Cond: (product_dimension.product_key = VAL(2))
| Materialize at Output: product_dimension.product_description
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension
| | Materialize: product_dimension.product_key
| | Execute on: All Nodes
| | Runtime Filter: (SIP1(HashJoin): product_dimension.product_key)
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_b0
| | | Materialize: inventory_fact.qty_in_stock
| | | Execute on: All Nodes
Optimizer-generated annotated query
SELECT /*+ syntactic_join */ product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
SEMI JOIN /*+Distrib(F,R),JType(H)*/ (SELECT inventory_fact.qty_in_stock AS qty_in_stock
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (product_dimension.product_key = subQ_1.qty_in_stock))
NULLAWARE ANTI JOIN
Recreates a query that contains a subquery preceded by a NOT IN or !=ALL operator, and executes a null-aware anti-join.
Input query
SELECT product_description FROM product_dimension
WHERE product_dimension.product_key NOT IN (SELECT qty_in_stock from inventory_fact);
Query plan
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT product_description FROM product_dimension WHERE product_dimension.product_key not IN (SELECT qty_in_sto
ck from inventory_fact);
Access Path:
+-JOIN HASH [Anti][NotInAnti] [Cost: 7K, Rows: 30K] (PATH ID: 1) Inner (BROADCAST)
| Join Cond: (product_dimension.product_key = VAL(2))
| Materialize at Output: product_dimension.product_description
| Execute on: Query Initiator
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension_DBD_2_rep_VMartDesign
| | Materialize: product_dimension.product_key
| | Execute on: Query Initiator
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_DBD_9_seg_VMartDesign_b0
| | | Materialize: inventory_fact.qty_in_stock
| | | Execute on: All Nodes
Optimizer-generated annotated query
SELECT /*+ syntactic_join */ product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
NULLAWARE ANTI JOIN /*+Distrib(L,B),JType(H)*/ (SELECT inventory_fact.qty_in_stock AS qty_in_stock
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (product_dimension.product_key = subQ_1.qty_in_stock))
SEMIALL JOIN
Recreates a query that contains a subquery preceded by an
ALL
operator, and executes a semi-all join.
Input query
SELECT product_key, product_description FROM product_dimension
WHERE product_dimension.product_key > ALL (SELECT product_key from inventory_fact);
Query plan
QUERY PLAN DESCRIPTION:
------------------------------
explain SELECT product_key, product_description FROM product_dimension WHERE product_dimension.product_key > ALL (SELECT product_key from inventory_fact);
Access Path:
+-JOIN HASH [Semi][All] [Cost: 7M, Rows: 30K] (PATH ID: 1) Outer (FILTER) Inner (BROADCAST)
| Join Filter: (product_dimension.product_key > VAL(2))
| Materialize at Output: product_dimension.product_description
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension
| | Materialize: product_dimension.product_key
| | Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_b0
| | | Materialize: inventory_fact.product_key
| | | Execute on: All Nodes
Optimizer-generated annotated query
SELECT /*+ syntactic_join */ product_dimension.product_key AS product_key, product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
SEMIALL JOIN /*+Distrib(F,B),JType(H)*/ (SELECT inventory_fact.product_key AS product_key FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (product_dimension.product_key > subQ_1.product_key))
ANTI JOIN
Recreates a query that contains a subquery preceded by a
NOT EXISTS
operator, and executes an anti-join.
Input query
SELECT product_key, product_description FROM product_dimension
WHERE NOT EXISTS (SELECT inventory_fact.product_key FROM inventory_fact
WHERE inventory_fact.product_key=product_dimension.product_key);
Query plan
QUERY PLAN DESCRIPTION:
------------------------------
explain SELECT product_key, product_description FROM product_dimension WHERE NOT EXISTS (SELECT inventory_fact.product_
key FROM inventory_fact WHERE inventory_fact.product_key=product_dimension.product_key);
Access Path:
+-JOIN HASH [Anti] [Cost: 703, Rows: 30K] (PATH ID: 1) Outer (FILTER)
| Join Cond: (VAL(1) = product_dimension.product_key)
| Materialize at Output: product_dimension.product_description
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension_DBD_2_rep_VMartDesign
| | Materialize: product_dimension.product_key
| | Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_DBD_9_seg_VMartDesign_b0
| | | Materialize: inventory_fact.product_key
| | | Execute on: All Nodes
Optimizer-generated annotated query
SELECT /*+ syntactic_join */ product_dimension.product_key AS product_key, product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
ANTI JOIN /*+Distrib(F,L),JType(H)*/ (SELECT inventory_fact.product_key AS "inventory_fact.product_key"
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (subQ_1."inventory_fact.product_key" = product_dimension.product_key))