复杂联接语义
优化器使用一组关键字来表达复杂交叉联接和半联接。所有复杂联接都由关键字 COMPLEX
指示,该关键字插入在关键字 JOIN
之前,例如 CROSS COMPLEX JOIN
。复杂半联接的语义有一个额外的要求,在下面有详细说明。
复杂交叉联接
Vertica 使用关键字短语 CROSS COMPLEX JOIN
来描述所有的复杂交叉联接。例如:
输入查询
SELECT
(SELECT max(sales_quantity) FROM store.store_sales_fact) *
(SELECT max(sales_quantity) FROM online_sales.online_sales_fact);
查询计划
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT
(SELECT max(sales_quantity) FROM store.store_sales_fact) *
(SELECT max(sales_quantity) FROM online_sales.online_sales_fact);
Access Path:
+-JOIN (CROSS JOIN) [Cost: 4K, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Execute on: Query Initiator
| +-- Outer -> JOIN (CROSS JOIN) [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 2)
| | Execute on: Query Initiator
| | +-- Outer -> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 3)
| | | Projection: v_catalog.dual_p
| | | Materialize: dual.dummy
| | | Execute on: Query Initiator
| | +-- Inner -> SELECT [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 4)
| | | Execute on: Query Initiator
| | | +---> GROUPBY NOTHING [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 5)
| | | | Aggregates: max(store_sales_fact.sales_quantity)
| | | | Execute on: All Nodes
| | | | +---> STORAGE ACCESS for store_sales_fact [Cost: 1K, Rows: 5M (NO STATISTICS)] (PATH ID: 6)
| | | | | Projection: store.store_sales_fact_super
| | | | | Materialize: store_sales_fact.sales_quantity
| | | | | Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 7)
| | Execute on: Query Initiator
| | +---> GROUPBY NOTHING [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 8)
| | | Aggregates: max(online_sales_fact.sales_quantity)
| | | Execute on: All Nodes
| | | +---> STORAGE ACCESS for online_sales_fact [Cost: 1K, Rows: 5M (NO STATISTICS)] (PATH ID: 9)
| | | | Projection: online_sales.online_sales_fact_super
| | | | Materialize: online_sales_fact.sales_quantity
| | | | Execute on: All Nodes
优化器生成的带注释查询
以下带注释的查询返回与前面显示的输入查询相同的结果。与优化器生成的所有带注释查询一样,您可以直接在 vsql 中执行此查询(无论是按照编写还是修改后):
SELECT /*+syntactic_join,verbatim*/ (subQ_1.max * subQ_2.max) AS "?column?"
FROM ((v_catalog.dual AS dual CROSS COMPLEX JOIN /*+Distrib(L,L),JType(H)*/
(SELECT max(store_sales_fact.sales_quantity) AS max
FROM store.store_sales_fact AS store_sales_fact/*+projs('store.store_sales_fact')*/) AS subQ_1 )
CROSS COMPLEX JOIN /*+Distrib(L,L),JType(H)*/ (SELECT max(online_sales_fact.sales_quantity) AS max
FROM online_sales.online_sales_fact AS online_sales_fact/*+projs('online_sales.online_sales_fact')*/) AS subQ_2 )
复杂半联接
复杂半联接由以下关键字之一表达:
-
SEMI COMPLEX JOIN
-
NULLAWARE ANTI COMPLEX JOIN
-
SEMIALL COMPLEX JOIN
-
ANTI COMPLEX JOIN
额外要求适用于所有复杂半联接:每个子查询的 SELECT
列表都以调用 Vertica 元函数 complex_join_marker()
的虚拟列(标记为 false
)结尾。当子查询处理每一行时,complex_join_marker()
返回 true
或 false
以指示结果集中包含或排除该行。结果集与这个标志一起返回到外部查询,外部查询可以使用来自这个和其他子查询中的标志来筛选它自己的结果集。
例如,查询优化器将以下输入查询重写为 NULLAWARE ANTI COMPLEX JOIN
。该联接返回子查询中其 complex_join_marker()
标志设置为适当布尔值的所有行。
输入查询
SELECT product_dimension.product_description FROM public.product_dimension
WHERE (NOT (product_dimension.product_key NOT IN (SELECT inventory_fact.qty_in_stock FROM public.inventory_fact)));
查询计划
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT product_dimension.product_description FROM public.product_dimension
WHERE (NOT (product_dimension.product_key NOT IN (SELECT inventory_fact.qty_in_stock FROM public.inventory_fact)));
Access Path:
+-JOIN HASH [Anti][NotInAnti] [Cost: 3K, Rows: 30K] (PATH ID: 1) Inner (BROADCAST)
| Join Cond: (product_dimension.product_key = VAL(2))
| Materialize at Output: product_dimension.product_description
| Filter: (NOT VAL(2))
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 56, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension_super
| | Materialize: product_dimension.product_key
| | Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_super
| | | Materialize: inventory_fact.qty_in_stock
| | | Execute on: All Nodes
优化器生成的带注释查询
以下带注释的查询返回与前面显示的输入查询相同的结果。与优化器生成的所有带注释查询一样,您可以直接在 vsql 中执行此查询(无论是按照编写还是修改后)。例如,您可以通过修改外部查询的谓词对 subQ_1."false"
标志的求值方式来控制外部查询的输出。
SELECT /*+syntactic_join,verbatim*/ product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
NULLAWARE ANTI COMPLEX JOIN /*+Distrib(L,B),JType(H)*/
(SELECT inventory_fact.qty_in_stock AS qty_in_stock, complex_join_marker() AS "false"
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (product_dimension.product_key = subQ_1.qty_in_stock)) WHERE (NOT subQ_1."false")