这是本节的多页打印视图。
点击此处打印.
返回本页常规视图.
定向查询
定向查询封装优化器可用于创建查询计划的信息。定向查询可以实现以下目标:
-
在计划的升级前保留当前查询计划。大多数情况下,升级 Vertica 后,查询的执行效率会变得更高。在少数并非如此的情况下,您可以使用升级之前所创建的定向查询,以便重新创建早期版本的查询计划。
-
使您能够创建改善优化工具性能的查询计划。有时,您可能希望影响优化器,使其在执行给定查询时做出更好的选择。例如,您可以选择不同的投影,或强制执行不同的联接顺序。在这种情况下,可使用定向查询来创建一个查询计划,替代优化器可能另外创建的计划。
-
将输入查询重定向到使用不同语义的查询 — 例如,将联接查询映射到用来查询扁平表的 SELECT 语句。
定向查询组件
定向查询为两个组件配对:
提示
也可以直接在 vsql 中使用大多数优化器提示。有关详细信息,请参阅
提示。
Vertica 提供两种创建定向查询的方法:
有关这两种方法的描述,请参阅创建定向查询。
1 - 创建定向查询
CREATE DIRECTED QUERY 将输入查询与使用优化器提示进行注释的查询相关联。它将此关联存储在一个唯一标识符下。CREATE DIRECTED QUERY 有两种变体:
在这两种情形中,Vertica 都会将带注释的查询与输入查询相关联,并将其关联注册在 query_name
下的系统表 DIRECTED_QUERIES 中。
这两种方法可以一起使用:您可以使用优化器创建的带注释 SQL 作为创建自己的(自定义)定向查询的基础。
1.1 - 优化器生成的定向查询
CREATE DIRECTED QUERY OPTIMIZER 将输入查询传递给优化器,优化器从自己的查询计划生成带注释的查询。然后,它将输入查询和带注释的查询配对并将其保存为定向查询。此定向查询可用于处理其他除了作为查询结果筛选依据的谓词字符串以外完全相同的查询。
可在升级之前使用优化器生成的定向查询来捕获查询计划。如果在升级后检测到给定查询的性能下降,这样做将会非常有用。在这种情况下,可使用相应的定向查询重新创建早期查询计划,并将其性能与当前优化器生成的计划相对比。
示例
以下 SQL 语句创建和激活定向查询 findEmployeesCityJobTitle_OPT
:
=> CREATE DIRECTED QUERY OPTIMIZER 'findEmployeesCityJobTitle_OPT'
SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city='Boston' and job_title='Cashier' ORDER BY employee_last_name, employee_first_name;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY findEmployeesCityJobTitle_OPT;
ACTIVATE DIRECTED QUERY
在激活此定向查询计划后,优化器会使用它为此输入查询的所有后续调用以及其他类似调用生成查询计划。可通过调用 GET DIRECTED QUERY 或查询系统表 DIRECTED_QUERIES 来查看优化器生成的带注释查询:
=> SELECT input_query, annotated_query FROM V_CATALOG.DIRECTED_QUERIES WHERE query_name = 'findEmployeesCityJobTitle_OPT';
-[ RECORD 1 ]---+----------------------------------------------------------------------------
input_query | SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension
WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/))
ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
annotated_query | SELECT /*+verbatim*/ employee_dimension.employee_first_name AS employee_first_name, employee_dimension.employee_last_name AS employee_last_name FROM public.employee_dimension AS employee_dimension/*+projs('public.employee_dimension')*/
WHERE (employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)
ORDER BY 2 ASC, 1 ASC
带注释的查询包括以下提示:
以下查询对 employee_city
和 job_title
列使用不同的值,但在其他方面与定向查询 EmployeesCityJobTitle_OPT
的原始输入查询相同:
=> SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city = 'San Francisco' and job_title = 'Branch Manager' ORDER BY employee_last_name, employee_first_name;
如果定向查询 EmployeesCityJobTitle_OPT
处于活动状态,则优化器可以将其用于此查询:
=> EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension WHERE employee_city='San Francisco' AND job_title='Branch Manager' ORDER BY employee_last_name, employee_first_name;
...
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension WHERE employee_city='San Francisco' AND job_title='Branch Manager' ORDER BY employee_last_name, employee_first_name;
The following active directed query(query name: findEmployeesCityJobTitle_OPT) is being executed:
SELECT /*+verbatim*/ employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension employee_dimension/*+projs('public.employee_dimension')*/
WHERE ((employee_dimension.employee_city = 'San Francisco'::varchar(13)) AND (employee_dimension.job_title = 'Branch Manager'::varchar(14)))
ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
Access Path:
+-SORT [Cost: 222, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Order: employee_dimension.employee_last_name ASC, employee_dimension.employee_first_name ASC
| Execute on: All Nodes
| +---> STORAGE ACCESS for employee_dimension [Cost: 60, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| | Projection: public.employee_dimension_super
| | Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name
| | Filter: (employee_dimension.employee_city = 'San Francisco')
| | Filter: (employee_dimension.job_title = 'Branch Manager')
| | Execute on: All Nodes
...
1.2 - 自定义定向查询
CREATE DIRECTED QUERY CUSTOM 指定带注释的查询并将其与 SAVE QUERY 先前保存的输入查询配对。必须在同一用户会话中发出这两个语句。
例如,您可能希望某查询使用特定投影:
-
使用 SAVE QUERY 指定查询:
=> SAVE QUERY SELECT employee_first_name, employee_last_name FROM employee_dimension
WHERE employee_city='Boston' AND job_title='Cashier';
SAVE QUERY
注意
您提供给 SAVE QUERY 的输入查询仅支持
:v
提示。
-
使用 CREATE DIRECTED QUERY CUSTOM 创建自定义定向查询,它指定带注释的查询并将其与保存的查询相关联。带注释的查询包含一个 /*+projs*/
提示,它指示优化器在用户调用保存的查询时使用投影 public.emp_dimension_unseg
:
=> CREATE DIRECTED QUERY CUSTOM 'findBostonCashiers_CUSTOM'
SELECT employee_first_name, employee_last_name
FROM employee_dimension /*+Projs('public.emp_dimension_unseg')*/
WHERE employee_city='Boston' AND job_title='Cashier';
CREATE DIRECTED QUERY
当心
Vertica 将保存的查询与带注释的查询相关联,而不检查输入查询和带注释的查询是否兼容。请小心安排 SAVE QUERY 和 CREATE DIRECTED QUERY CUSTOM 语句的顺序,以使保存的查询和定向查询正确匹配。
-
激活此定向查询:
=> ACTIVATE DIRECTED QUERY findBostonCashiers_CUSTOM;
ACTIVATE DIRECTED QUERY
-
激活后,优化器将使用此定向查询为其输入查询的所有后续调用生成查询计划。下面的 EXPLAIN 输出验证优化器使用此定向查询的情况及其指定的投影:
=> EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension
WHERE employee_city='Boston' AND job_title='Cashier';
QUERY PLAN
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension where employee_city='Boston' AND job_title='Cashier';
The following active directed query(query name: findBostonCashiers_CUSTOM) is being executed:
SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name
FROM public.employee_dimension/*+Projs('public.emp_dimension_unseg')*/
WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6)) AND (employee_dimension.job_title = 'Cashier'::varchar(7)))
Access Path:
+-STORAGE ACCESS for employee_dimension [Cost: 158, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Projection: public.emp_dimension_unseg
| Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name
| Filter: (employee_dimension.employee_city = 'Boston')
| Filter: (employee_dimension.job_title = 'Cashier')
| Execute on: Query Initiator
另请参阅
重写联接查询
1.3 - 结合使用优化器生成的定向查询和自定义定向查询
可以根据优化器所创建的带注释 SQL 来创建自己的自定义定向查询。当评估优化器创建用于处理给定查询的计划以及测试计划修改内容时,此方法特别有用。
例如,您可能希望修改优化器对以下查询的实施方式:
=> SELECT COUNT(customer_name) Total, customer_region Region
FROM (store_sales s JOIN customer_dimension c ON c.customer_key = s.customer_key)
JOIN product_dimension p ON s.product_key = p.product_key
WHERE p.category_description ilike '%Medical%'
AND p.product_description ilike '%antibiotics%'
AND c.customer_age <= 30 AND YEAR(s.sales_date)=2017
GROUP BY customer_region;
对此查询运行 EXPLAIN 时,您会发现优化器将投影 customers_proj_age
用于 customer_dimension
表。此投影按列 customer_age
进行排序。因此,优化器将根据 customer_key
对表 store_sales
和 customer_dimension
进行哈希联接。
分析 customer_dimension
表数据后,您观察到大部分客户的年龄不到 30 岁,因此对按 customer_key
排序的 customer_dimension
表使用投影 customer_proj_id
更有意义:
您可以创建一个用于封装此更改的定向查询,如下所示:
-
使用 EXPLAIN ANNOTATED 获取由优化器生成的查询注释:
=> \o annotatedQuery
=> EXPLAIN ANNOTATED SELECT COUNT(customer_name) Total, customer_region Region
FROM (store_sales s JOIN customer_dimension c ON c.customer_key = s.customer_key)
JOIN product_dimension p ON s.product_key = p.product_key
WHERE p.category_description ilike '%Medical%'
AND p.product_description ilike '%antibiotics%'
AND c.customer_age <= 30 AND YEAR(s.sales_date)=2017
GROUP BY customer_region;
=> \o
=> \! cat annotatedQuery
...
SELECT /*+syntactic_join,verbatim*/ count(c.customer_name) AS Total, c.customer_region AS Region
FROM ((public.store_sales AS s/*+projs('public.store_sales_super')*/
JOIN /*+Distrib(L,B),JType(H)*/ public.customer_dimension AS c/*+projs('public.customers_proj_age')*/
ON (c.customer_key = s.customer_key))
JOIN /*+Distrib(L,B),JType(M)*/ public.product_dimension AS p/*+projs('public.product_dimension')*/
ON (s.product_key = p.product_key))
WHERE ((date_part('year'::varchar(4), (s.sales_date)::timestamp(0)))::int = 2017)
AND (c.customer_age <= 30)
AND ((p.category_description)::varchar(32) ~~* '%Medical%'::varchar(9))
AND (p.product_description ~~* '%antibiotics%'::varchar(13))
GROUP BY /*+GByType(Hash)*/ 2
(4 rows)
-
修改带注释的查询:
SELECT /*+syntactic_join,verbatim*/ count(c.customer_name) AS Total, c.customer_region AS Region
FROM ((public.store_sales AS s/*+projs('public.store_sales_super')*/
JOIN /*+Distrib(L,B),JType(H)*/ public.customer_dimension AS c/*+projs('public.customer_proj_id')*/
ON (c.customer_key = s.customer_key))
JOIN /*+Distrib(L,B),JType(H)*/ public.product_dimension AS p/*+projs('public.product_dimension')*/
ON (s.product_key = p.product_key))
WHERE ((date_part('year'::varchar(4), (s.sales_date)::timestamp(0)))::int = 2017)
AND (c.customer_age <= 30)
AND ((p.category_description)::varchar(32) ~~* '%Medical%'::varchar(9))
AND (p.product_description ~~* '%antibiotics%'::varchar(13))
GROUP BY /*+GByType(Hash)*/ 2
-
使用修改后的带注释查询来创建所需的定向查询:
-
使用 SAVE QUERY 保存所需的输入查询:
=> SAVE QUERY SELECT COUNT(customer_name) Total, customer_region Region
FROM (store_sales s JOIN customer_dimension c ON c.customer_key = s.customer_key)
JOIN product_dimension p ON s.product_key = p.product_key
WHERE p.category_description ilike '%Medical%'
AND p.product_description ilike '%antibiotics%'
AND c.customer_age <= 30 AND YEAR(s.sales_date)=2017
GROUP BY customer_region;
-
创建一个自定义定向查询,将保存的输入查询与修改后的带注释查询相关联:
=> CREATE DIRECTED QUERY CUSTOM 'getCustomersUnder31'
SELECT /*+syntactic_join,verbatim*/ count(c.customer_name) AS Total, c.customer_region AS Region
FROM ((public.store_sales AS s/*+projs('public.store_sales_super')*/
JOIN /*+Distrib(L,B),JType(H)*/ public.customer_dimension AS c/*+projs('public.customer_proj_id')*/
ON (c.customer_key = s.customer_key))
JOIN /*+Distrib(L,B),JType(H)*/ public.product_dimension AS p/*+projs('public.product_dimension')*/
ON (s.product_key = p.product_key))
WHERE ((date_part('year'::varchar(4), (s.sales_date)::timestamp(0)))::int = 2017)
AND (c.customer_age <= 30)
AND ((p.category_description)::varchar(32) ~~* '%Medical%'::varchar(9))
AND (p.product_description ~~* '%antibiotics%'::varchar(13))
GROUP BY /*+GByType(Hash)*/ 2;
CREATE DIRECTED QUERY
-
激活此定向查询:
=> ACTIVATE DIRECTED QUERY getCustomersUnder31;
ACTIVATE DIRECTED QUERY
当优化器处理与此定向查询的输入查询相匹配的查询时,它会使用此定向查询的带注释查询来生成查询计划:
=> EXPLAIN SELECT COUNT(customer_name) Total, customer_region Region
FROM (store_sales s JOIN customer_dimension c ON c.customer_key = s.customer_key)
JOIN product_dimension p ON s.product_key = p.product_key
WHERE p.category_description ilike '%Medical%'
AND p.product_description ilike '%antibiotics%'
AND c.customer_age <= 30 AND YEAR(s.sales_date)=2017
GROUP BY customer_region;
The following active directed query(query name: getCustomersUnder31) is being executed:
...
2 - 设置带注释查询中的提示
定向查询的带注释查询中的提示为优化器提供关于如何执行输入查询的说明。带注释查询支持以下提示:
带注释查询中的其他提示(如 DIRECT 或 LABEL)不起作用。
您可以在 vsql 查询中使用与带注释查询中相同的提示,只有两个例外情况: :v
(IGNORECONSTANT
) 和 VERBATIM
。
3 - 忽略定向查询中的常量
优化器生成的定向查询通常包括一个或多个
:v
(IGNORECONSTANT
的别名)提示,这些提示标记谓词字符串常量,当优化器决定是否对给定输入查询使用定向查询时,您希望优化器忽略这些常量。 :v
提示使多个查询能够使用同一个定向查询,但前提是这些查询在除了其谓词字符串以外的所有其他方面都相同。
例如,在以下两个查询中,除了分别为 employee_city
和 job_title
列指定的字符串常量 Boston|San Francisco
和 Cashier|Branch Manager
外,所有其他方面都相同:
=> SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city='Boston' and job_title ='Cashier' ORDER BY employee_last_name, employee_first_name;
=> SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city = 'San Francisco' and job_title = 'Branch Manager' ORDER BY employee_last_name, employee_first_name;
在这种情况下,您从一个查询创建的、由优化器生成的定向查询可用于这两个查询:
=> CREATE DIRECTED QUERY OPTIMIZER 'findEmployeesCityJobTitle_OPT'
SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city='Boston' and job_title='Cashier' ORDER BY employee_last_name, employee_first_name;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY findEmployeesCityJobTitle_OPT;
ACTIVATE DIRECTED QUERY
定向查询的输入查询和带注释查询都包含 :v
提示:
=> SELECT input_query, annotated_query FROM V_CATALOG.DIRECTED_QUERIES WHERE query_name = 'findEmployeesCityJobTitle_OPT';
-[ RECORD 1 ]---+----------------------------------------------------------------------------
input_query | SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension
WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/))
ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
annotated_query | SELECT /*+verbatim*/ employee_dimension.employee_first_name AS employee_first_name, employee_dimension.employee_last_name AS employee_last_name FROM public.employee_dimension AS employee_dimension/*+projs('public.employee_dimension')*/
WHERE (employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)
ORDER BY 2 ASC, 1 ASC
输入查询和带注释查询中的提示实参将两个谓词常量配对:
:v
提示告诉优化器在决定是否可以将此定向查询用于给定输入查询时忽略这两列的值。
例如,以下查询虽然对 employee_city
和 job_title
使用不同的值,但在其他方面与用于创建定向查询 EmployeesCityJobTitle_OPT
的查询相同:
=> SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city = 'San Francisco' and job_title = 'Branch Manager' ORDER BY employee_last_name, employee_first_name;
如果定向查询 EmployeesCityJobTitle_OPT
处于活动状态,则优化器可以将该定向查询用在以下查询的查询计划中:
=> EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension WHERE employee_city='San Francisco' AND job_title='Branch Manager' ORDER BY employee_last_name, employee_first_name;
...
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension WHERE employee_city='San Francisco' AND job_title='Branch Manager' ORDER BY employee_last_name, employee_first_name;
The following active directed query(query name: findEmployeesCityJobTitle_OPT) is being executed:
SELECT /*+verbatim*/ employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension employee_dimension/*+projs('public.employee_dimension')*/
WHERE ((employee_dimension.employee_city = 'San Francisco'::varchar(13)) AND (employee_dimension.job_title = 'Branch Manager'::varchar(14)))
ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
Access Path:
+-SORT [Cost: 222, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Order: employee_dimension.employee_last_name ASC, employee_dimension.employee_first_name ASC
| Execute on: All Nodes
| +---> STORAGE ACCESS for employee_dimension [Cost: 60, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| | Projection: public.employee_dimension_super
| | Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name
| | Filter: (employee_dimension.employee_city = 'San Francisco')
| | Filter: (employee_dimension.job_title = 'Branch Manager')
| | Execute on: All Nodes
...
映射一对多 :v 提示
目前所显示的示例演示了 :v
提示的一对一配对。您也可以使用 :v
提示将一个输入常数映射到带注释查询中的多个常数。当您想为优化器提供关于如何执行查询(用来联接表)的显式说明时,这种方法特别有用。
例如,以下查询联接了两个表:
SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8;
在这种情况下,优化器可以推断 S.a
和 T.b
具有相同的值,并相应地实施联接。
<a name="simpleJoinExample"></a>=> CREATE DIRECTED QUERY OPTIMIZER simpleJoin SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8;
CREATE DIRECTED QUERY
=> SELECT input_query, annotated_query FROM directed_queries WHERE query_name = 'simpleJoin';
-[ RECORD 1 ]---+---------------------------------------------------------------------------------------------------------------------
input_query | SELECT S.a, T.b FROM (public.S JOIN public.T ON ((S.a = T.b))) WHERE (S.a = 8 /*+:v(1)*/)
annotated_query | SELECT /*+syntactic_join,verbatim*/ S.a AS a, T.b AS b
FROM (public.S AS S/*+projs('public.S')*/ JOIN /*+Distrib(L,L),JType(M)*/ public.T AS T/*+projs('public.T')*/ ON (S.a = T.b))
WHERE (S.a = 8 /*+:v(1)*/) AND (T.b = 8 /*+:v(1)*/)
(1 row)
=> ACTIVATE DIRECTED QUERY simpleJoin;
ACTIVATED DIRECTED QUERY
现在,以下列输入查询为例:
SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 3;
优化器忽略联接谓词常量并使用其查询计划中的定向查询 simpleJoin
:
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 3;
The following active directed query(query name: simpleJoin) is being executed:
SELECT /*+syntactic_join,verbatim*/ S.a, T.b FROM (public.S S/*+projs('public.S')*/ JOIN /*+Distrib('L', 'L'), JType('
M')*/public.T T/*+projs('public.T')*/ ON ((S.a = T.b))) WHERE ((S.a = 3) AND (T.b = 3))
Access Path:
+-JOIN MERGEJOIN(inputs presorted) [Cost: 21, Rows: 4 (NO STATISTICS)] (PATH ID: 1)
| Join Cond: (S.a = T.b)
| Execute on: Query Initiator
| +-- Outer -> STORAGE ACCESS for S [Cost: 12, Rows: 4 (NO STATISTICS)] (PATH ID: 2)
| | Projection: public.S_b0
| | Materialize: S.a
| | Filter: (S.a = 3)
| | Execute on: Query Initiator
| | Runtime Filter: (SIP1(MergeJoin): S.a)
| +-- Inner -> STORAGE ACCESS for T [Cost: 8, Rows: 3 (NO STATISTICS)] (PATH ID: 3)
| | Projection: public.T_b0
| | Materialize: T.b
| | Filter: (T.b = 3)
| | Execute on: Query Initiator
...
在定向查询中保留谓词常量
默认情况下,优化器生成的定向查询在谓词常量上设置 :v
提示。您可以覆盖此行为,方法是使用
:c
提示来标记不得忽略的谓词常量。例如,以下语句创建一个定向查询,该定向查询只能用于联接谓词常量 8
与原始输入查询中相同的输入查询:
=> CREATE DIRECTED QUERY OPTIMIZER simpleJoin_KeepPredicateConstant SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8 /*+:c*/;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY simpleJoin_KeepPredicateConstant;
以下对系统表 DIRECTED_QUERIES 的查询对定向查询 simpleJoin
(在前面的示例中创建)和 simpleJoin_KeepPredicateConstant
进行了比较。与 simpleJoin
不同,simpleJoin_KeepPredicateConstant
的输入查询和带注释查询的联接谓词省略了 :v
提示:
=> SELECT query_name, input_query, annotated_query FROM directed_queries WHERE query_name ILIKE'%simpleJoin%';
-[ RECORD 1 ]---+
query_name | simpleJoin
input_query | SELECT S.a, T.b FROM (public.S JOIN public.T ON ((S.a = T.b))) WHERE (S.a = 8 /*+:v(1)*/)
annotated_query | SELECT /*+syntactic_join,verbatim*/ S.a AS a, T.b AS b
FROM (public.S AS S/*+projs('public.S')*/ JOIN /*+Distrib(L,L),JType(M)*/ public.T AS T/*+projs('public.T')*/ ON (S.a = T.b))
WHERE (S.a = 8 /*+:v(1)*/) AND (T.b = 8 /*+:v(1)*/)
-[ RECORD 2 ]---+
query_name | simpleJoin_KeepPredicateConstant
input_query | SELECT S.a, T.b FROM (public.S JOIN public.T ON ((S.a = T.b))) WHERE (S.a = 8)
annotated_query | SELECT /*+syntactic_join,verbatim*/ S.a AS a, T.b AS b
FROM (public.S AS S/*+projs('public.S')*/ JOIN /*+Distrib(L,L),JType(M)*/ public.T AS T/*+projs('public.T')*/ ON (S.a = T.b))
WHERE (S.a = 8) AND (T.b = 8)
如果停用定向查询 simpleJoin
(否则该定向查询会优先于 simpleJoin_KeepPredicateConstant
,因为前者创建的早),Vertica 仅将 simpleJoin_KeepPredicateConstant
应用于输入查询,这里联接谓词常量与原始输入查询中相同。例如,比较以下两个查询计划:
=> EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8;
...
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8;
The following active directed query(query name: simpleJoin_KeepPredicateConstant) is being executed:
SELECT /*+syntactic_join,verbatim*/ S.a, T.b FROM (public.S S/*+projs('public.S')*/ JOIN /*+Distrib('L', 'L'), JType('
M')*/public.T T/*+projs('public.T')*/ ON ((S.a = T.b))) WHERE ((S.a = 8) AND (T.b = 8))
Access Path:
+-JOIN MERGEJOIN(inputs presorted) [Cost: 21, Rows: 4 (NO STATISTICS)] (PATH ID: 1)
| Join Cond: (S.a = T.b)
...
=> EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 3
...
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 3;
Access Path:
+-JOIN MERGEJOIN(inputs presorted) [Cost: 21, Rows: 4 (NO STATISTICS)] (PATH ID: 1)
| Join Cond: (S.a = T.b)
...
4 - 重写查询
您可以使用定向查询来更改给定查询的语义,即用一个查询替换另一个查询。当您对 Vertica 数据库处理的输入查询的内容和格式没有或几乎没有控制权时,这一点尤其重要。您可以将这些查询映射到定向查询,从而重写原始输入以实现最佳执行效果。
以下部分介绍了两个用例:
重写联接查询
许多输入查询联接多个表。您已经确定,在许多情况下,更高效的做法是对多个扁平表中的大量数据进行反向标准化并直接查询这些表。您不能修改输入查询本身。但是,您可以使用定向查询将联接查询重定向到扁平表数据。
例如,以下查询通过联接 VMart 数据库中的三个表来聚合葡萄酒产品的区域销售额:
=> SELECT SD.store_region AS Region, SD.store_city AS City, SUM(SF.gross_profit_dollar_amount) Total
FROM store.store_sales_fact SF
JOIN store.store_dimension SD ON SF.store_key=SD.store_key
JOIN product_dimension P ON SF.product_key||SF.product_version=P.product_key||P.product_version
WHERE P.product_description ILIKE '%wine%'
GROUP BY ROLLUP (SD.store_region, SD.store_city)
ORDER BY Region,Total DESC;
您可以将联接的表数据合并到一个扁平表中,然后改为查询该表。通过这样做,您可以更快地访问同一数据。您可以使用以下 DDL 语句创建扁平表:
=> CREATE TABLE store.store_sales_wide AS SELECT * FROM store.store_sales_fact;
=> ALTER TABLE store.store_sales_wide ADD COLUMN store_name VARCHAR(64)
SET USING (SELECT store_name FROM store.store_dimension WHERE store.store_sales_wide.store_key=store.store_dimension.store_key);
=> ALTER TABLE store.store_sales_wide ADD COLUMN store_city varchar(64)
SET USING (SELECT store_city FROM store.store_dimension WHERE store.store_sales_wide.store_key=store.store_dimension.store_key);
=> ALTER TABLE store.store_sales_wide ADD COLUMN store_state char(2)
SET USING (SELECT store_state char FROM store.store_dimension WHERE store.store_sales_wide.store_key=store.store_dimension.store_key)
=> ALTER TABLE store.store_sales_wide ADD COLUMN store_region varchar(64)
SET USING (SELECT store_region FROM store.store_dimension WHERE store.store_sales_wide.store_key=store.store_dimension.store_key);
=> ALTER TABLE store.store_sales_wide ADD column product_description VARCHAR(128)
SET USING (SELECT product_description FROM public.product_dimension
WHERE store_sales_wide.product_key||store_sales_wide.product_version = product_dimension.product_key||product_dimension.product_version);
=> ALTER TABLE store.store_sales_wide ADD COLUMN sku_number char(32)
SET USING (SELECT sku_number char FROM product_dimension
WHERE store_sales_wide.product_key||store_sales_wide.product_version = product_dimension.product_key||product_dimension.product_version);
=> SELECT REFRESH_COLUMNS ('store.store_sales_wide','', 'rebuild');
创建此表并刷新其 SET USING
列后,可以重写前面的查询,如下所示:
=> SELECT SD.store_region AS Region, SD.store_city AS City, SUM(SF.gross_profit_dollar_amount) Total
FROM store.store_sales_fact SF
JOIN store.store_dimension SD ON SF.store_key=SD.store_key
JOIN product_dimension P ON SF.product_key||SF.product_version=P.product_key||P.product_version
WHERE P.product_description ILIKE '%wine%'
GROUP BY ROLLUP (SD.store_region, SD.store_city)
ORDER BY Region,Total DESC;
Region | City | Total
-----------+------------------+---------
East | | 1679788
East | Boston | 138494
East | Elizabeth | 138071
East | Sterling Heights | 137719
East | Allentown | 137122
East | New Haven | 117751
East | Lowell | 102670
East | Washington | 84595
East | Charlotte | 83255
East | Waterbury | 81516
East | Erie | 80784
East | Stamford | 59935
East | Hartford | 59843
East | Baltimore | 55873
East | Clarksville | 54117
East | Nashville | 53757
East | Manchester | 53290
East | Columbia | 52799
East | Memphis | 52648
East | Philadelphia | 29711
East | Portsmouth | 29316
East | New York | 27033
East | Cambridge | 26111
East | Alexandria | 23378
MidWest | | 1073224
MidWest | Lansing | 145616
MidWest | Livonia | 129349
--More--
查询扁平表更高效;但是,您仍然必须考虑继续使用早期联接语法的输入查询。这可以通过创建自定义定向查询来完成,这会将这些输入查询重定向到以扁平表为目标的语法:
-
保存输入查询:
=> SAVE QUERY SELECT SD.store_region AS Region, SD.store_city AS City, SUM(SF.gross_profit_dollar_amount) Total
FROM store.store_sales_fact SF
JOIN store.store_dimension SD ON SF.store_key=SD.store_key
JOIN product_dimension P ON SF.product_key||SF.product_version=P.product_key||P.product_version
WHERE P.product_description ILIKE '%wine%'
GROUP BY ROLLUP (SD.store_region, SD.store_city)
ORDER BY Region,Total DESC;
SAVE QUERY
-
将保存的查询映射到具有所需语法的定向查询,然后激活定向查询:
=> CREATE DIRECTED QUERY CUSTOM 'RegionalSalesWine'
SELECT store_region AS Region,
store_city AS City,
SUM(gross_profit_dollar_amount) AS Total
FROM store.store_sales_wide
WHERE product_description ILIKE '%wine%'
GROUP BY ROLLUP (region, city)
ORDER BY Region,Total DESC;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY RegionalSalesWine;
ACTIVATE DIRECTED QUERY
当定向查询 RegionalSalesWine
处于活动状态时,查询优化器会将所有与原始输入格式相匹配的查询映射到定向查询,如下面的查询计划所示:
=> EXPLAIN SELECT SD.store_region AS Region, SD.store_city AS City, SUM(SF.gross_profit_dollar_amount) Total
FROM store.store_sales_fact SF
JOIN store.store_dimension SD ON SF.store_key=SD.store_key
JOIN product_dimension P ON SF.product_key||SF.product_version=P.product_key||P.product_version
WHERE P.product_description ILIKE '%wine%'
GROUP BY ROLLUP (SD.store_region, SD.store_city)
ORDER BY Region,Total DESC;
...
The following active directed query(query name: RegionalSalesWine) is being executed:
SELECT store_sales_wide.store_region AS Region, store_sales_wide.store_city AS City, sum(store_sales_wide.gross_profit_dollar_amount) AS Total
FROM store.store_sales_wide WHERE (store_sales_wide.product_description ~~* '%wine%'::varchar(6))
GROUP BY GROUPING SETS((store_sales_wide.store_region, store_sales_wide.store_city), (store_sales_wide.store_region),())
ORDER BY store_sales_wide.store_region, sum(store_sales_wide.gross_profit_dollar_amount) DESC
Access Path:
+-SORT [Cost: 2K, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Order: store_sales_wide.store_region ASC, sum(store_sales_wide.gross_profit_dollar_amount) DESC
| Execute on: All Nodes
| +---> GROUPBY HASH (GLOBAL RESEGMENT GROUPS) (LOCAL RESEGMENT GROUPS) [Cost: 2K, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| | Aggregates: sum(store_sales_wide.gross_profit_dollar_amount)
| | Group By: store_sales_wide.store_region, store_sales_wide.store_city
| | Grouping Sets: (store_sales_wide.store_region, store_sales_wide.store_city, <SVAR>), (store_sales_wide.store_region, <SVAR>), (<SVAR>)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for store_sales_wide [Cost: 864, Rows: 10K (NO STATISTICS)] (PATH ID: 3)
| | | Projection: store.store_sales_wide_b0
| | | Materialize: store_sales_wide.gross_profit_dollar_amount, store_sales_wide.store_city, store_sales_wide.store_region
| | | Filter: (store_sales_wide.product_description ~~* '%wine%')
| | | Execute on: All Nodes
要对定向查询和原始输入查询的执行成本进行比较,请停用定向查询并对原始输入查询使用 EXPLAIN。优化器恢复到为输入查询创建计划,这会产生高得多的成本 - 188K 相对于 2K:
=> DEACTIVATE DIRECTED QUERY RegionalSalesWine;
DEACTIVATE DIRECTED QUERY
=> EXPLAIN SELECT SD.store_region AS Region, SD.store_city AS City, SUM(SF.gross_profit_dollar_amount) Total
FROM store.store_sales_fact SF
JOIN store.store_dimension SD ON SF.store_key=SD.store_key
JOIN product_dimension P ON SF.product_key||SF.product_version=P.product_key||P.product_version
WHERE P.product_description ILIKE '%wine%'
GROUP BY ROLLUP (SD.store_region, SD.store_city)
ORDER BY Region,Total DESC;
...
Access Path:
+-SORT [Cost: 188K, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Order: SD.store_region ASC, sum(SF.gross_profit_dollar_amount) DESC
| Execute on: All Nodes
| +---> GROUPBY HASH (GLOBAL RESEGMENT GROUPS) (LOCAL RESEGMENT GROUPS) [Cost: 188K, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| | Aggregates: sum(SF.gross_profit_dollar_amount)
| | Group By: SD.store_region, SD.store_city
| | Grouping Sets: (SD.store_region, SD.store_city, <SVAR>), (SD.store_region, <SVAR>), (<SVAR>)
| | Execute on: All Nodes
| | +---> JOIN HASH [Cost: 12K, Rows: 5M (NO STATISTICS)] (PATH ID: 3) Inner (BROADCAST)
| | | Join Cond: (concat((SF.product_key)::varchar, (SF.product_version)::varchar) = concat((P.product_key)::varchar, (P.product_version)::varchar))
| | | Materialize at Input: SF.product_key, SF.product_version
| | | Materialize at Output: SF.gross_profit_dollar_amount
| | | Execute on: All Nodes
| | | +-- Outer -> JOIN HASH [Cost: 2K, Rows: 5M (NO STATISTICS)] (PATH ID: 4) Inner (BROADCAST)
| | | | Join Cond: (SF.store_key = SD.store_key)
| | | | Execute on: All Nodes
| | | | +-- Outer -> STORAGE ACCESS for SF [Cost: 1K, Rows: 5M (NO STATISTICS)] (PATH ID: 5)
| | | | | Projection: store.store_sales_fact_super
| | | | | Materialize: SF.store_key
| | | | | Execute on: All Nodes
| | | | | Runtime Filters: (SIP2(HashJoin): SF.store_key), (SIP1(HashJoin): concat((SF.product_key)::varchar, (SF.product_version)::varchar))
| | | | +-- Inner -> STORAGE ACCESS for SD [Cost: 13, Rows: 250 (NO STATISTICS)] (PATH ID: 6)
| | | | | Projection: store.store_dimension_super
| | | | | Materialize: SD.store_key, SD.store_city, SD.store_region
| | | | | Execute on: All Nodes
| | | +-- Inner -> STORAGE ACCESS for P [Cost: 201, Rows: 60K (NO STATISTICS)] (PATH ID: 7)
| | | | Projection: public.product_dimension_super
| | | | Materialize: P.product_key, P.product_version
| | | | Filter: (P.product_description ~~* '%wine%')
| | | | Execute on: All Nodes
创建查询模板
您可以使用定向查询来实施除了用于筛选查询结果的谓词字符串以外都完全相同的多个查询。例如,定向查询 RegionalSalesWine
仅处理对包含字符串 wine
的 product_description
值进行筛选的输入查询。您可以创建此定向查询的修改版本,使其匹配多个输入查询的语法,这些输入查询仅在输入值(例如 tuna
)上有所不同。
按以下步骤创建此查询模板:
-
创建两个优化器生成的定向查询:
-
从对联接表的原始查询:
=> CREATE DIRECTED QUERY OPTIMIZER RegionalSalesProducts_JoinTables
SELECT SD.store_region AS Region, SD.store_city AS City, SUM(SF.gross_profit_dollar_amount) Total
FROM store.store_sales_fact SF
JOIN store.store_dimension SD ON SF.store_key=SD.store_key
JOIN product_dimension P ON SF.product_key||SF.product_version=P.product_key||P.product_version
WHERE P.product_description ILIKE '%wine%'
GROUP BY ROLLUP (SD.store_region, SD.store_city)
ORDER BY Region,Total DESC;
CREATE DIRECTED QUERY
-
从对扁平表的查询:
=> CREATE DIRECTED QUERY OPTIMIZER RegionalSalesProduct
SELECT store_region AS Region, store_city AS City, SUM(gross_profit_dollar_amount) AS Total
FROM store.store_sales_wide
WHERE product_description ILIKE '%wine%'
GROUP BY ROLLUP (region, city)
ORDER BY Region,Total DESC;
CREATE DIRECTED QUERY
-
查询系统表 DIRECTED_QUERIES 并复制定向查询 RegionalSalesProducts_JoinTables
的输入查询:
SELECT input_query FROM directed_queries WHERE query_name = 'RegionalSalesProducts_JoinTables';
-
通过 SAVE QUERY 使用复制的输入查询:
SAVE QUERY SELECT SD.store_region AS Region, SD.store_city AS City, sum(SF.gross_profit_dollar_amount) AS Total
FROM ((store.store_sales_fact SF
JOIN store.store_dimension SD ON ((SF.store_key = SD.store_key)))
JOIN public.product_dimension P ON ((concat((SF.product_key)::varchar, (SF.product_version)::varchar) = concat((P.product_key)::varchar, (P.product_version)::varchar))))
WHERE (P.product_description ~~* '%wine%'::varchar(6) /*+:v(1)*/)
GROUP BY GROUPING SETS((SD.store_region, SD.store_city), (SD.store_region), ())
ORDER BY SD.store_region, sum(SF.gross_profit_dollar_amount) DESC
(1 row)
-
查询系统表 DIRECTED_QUERIES 并复制定向查询 RegionalSalesProducts_FlatTables
的带注释查询:
SELECT input_query FROM directed_queries WHERE query_name = 'RegionalSalesProducts_JoinTables';
-
使用复制的带注释查询创建自定义定向查询:
=> CREATE DIRECTED QUERY CUSTOM RegionalSalesProduct SELECT /*+verbatim*/ store_sales_wide.store_region AS Region, store_sales_wide.store_city AS City, sum(store_sales_wide.gross_profit_dollar_amount) AS Total
FROM store.store_sales_wide AS store_sales_wide/*+projs('store.store_sales_wide')*/
WHERE (store_sales_wide.product_description ~~* '%wine%'::varchar(6) /*+:v(1)*/)
GROUP BY /*+GByType(Hash)*/ GROUPING SETS((1, 2), (1), ())
ORDER BY 1 ASC, 3 DESC;
CREATE DIRECTED QUERY
-
激活此定向查询:
ACTIVATE DIRECTED QUERY RegionalSalesProduct;
激活此定向查询后,Vertica 可以将其用于和模板相匹配的输入查询中,不同之处仅在于 product_description
的谓词值:
=> EXPLAIN SELECT SD.store_region AS Region, SD.store_city AS City, SUM(SF.gross_profit_dollar_amount) Total
FROM store.store_sales_fact SF
JOIN store.store_dimension SD ON SF.store_key=SD.store_key
JOIN product_dimension P ON SF.product_key||SF.product_version=P.product_key||P.product_version
WHERE P.product_description ILIKE '%tuna%'
GROUP BY ROLLUP (SD.store_region, SD.store_city)
ORDER BY Region,Total DESC;
...
The following active directed query(query name: RegionalSalesProduct) is being executed:
SELECT /*+verbatim*/ store_sales_wide.store_region AS Region, store_sales_wide.store_city AS City, sum(store_sales_wide.gross_profit_dollar_amount) AS Total
FROM store.store_sales_wide store_sales_wide/*+projs('store.store_sales_wide')*/
WHERE (store_sales_wide.product_description ~~* '%tuna%'::varchar(6))
GROUP BY /*+GByType(Hash)*/ GROUPING SETS((store_sales_wide.store_region, store_sales_wide.store_city), (store_sales_wide.store_region), ())
ORDER BY store_sales_wide.store_region, sum(store_sales_wide.gross_profit_dollar_amount) DESC
Access Path:
+-SORT [Cost: 2K, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Order: store_sales_wide.store_region ASC, sum(store_sales_wide.gross_profit_dollar_amount) DESC
| Execute on: All Nodes
| +---> GROUPBY HASH (GLOBAL RESEGMENT GROUPS) (LOCAL RESEGMENT GROUPS) [Cost: 2K, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| | Aggregates: sum(store_sales_wide.gross_profit_dollar_amount)
| | Group By: store_sales_wide.store_region, store_sales_wide.store_city
| | Grouping Sets: (store_sales_wide.store_region, store_sales_wide.store_city, <SVAR>), (store_sales_wide.store_region, <SVAR>), (<SVAR>)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for store_sales_wide [Cost: 864, Rows: 10K (NO STATISTICS)] (PATH ID: 3)
| | | Projection: store.store_sales_wide_b0
| | | Materialize: store_sales_wide.gross_profit_dollar_amount, store_sales_wide.store_city, store_sales_wide.store_region
| | | Filter: (store_sales_wide.product_description ~~* '%tuna%')
| | | Execute on: All Nodes
当您执行此查询时,它会返回以下结果:
Region | City | Total
-----------+------------------+---------
East | | 1564582
East | Elizabeth | 131328
East | Allentown | 129704
East | Boston | 128188
East | Sterling Heights | 125877
East | Lowell | 112133
East | New Haven | 101161
East | Waterbury | 85401
East | Washington | 76127
East | Erie | 73002
East | Charlotte | 67850
East | Memphis | 53650
East | Clarksville | 53416
East | Hartford | 52583
East | Columbia | 51950
East | Nashville | 50031
East | Manchester | 48607
East | Baltimore | 48108
East | Stamford | 47302
East | New York | 30840
East | Portsmouth | 26485
East | Alexandria | 26391
East | Philadelphia | 23092
East | Cambridge | 21356
MidWest | | 980209
MidWest | Lansing | 130044
MidWest | Livonia | 118740
--More--
5 - 管理定向查询
Vertica 提供了许多管理定向查询的方法:
5.1 - 获取定向查询
可按以下两种方式获取有关定向查询的编录信息:
GET DIRECTED QUERY
GET DIRECTED QUERY 在指定输入查询上查询系统表 DIRECTED_QUERIES。它返回映射到输入查询的所有定向查询的详细信息。
以下 GET DIRECTED QUERY 语句返回映射到以下输入查询 findEmployeesCityJobTitle_OPT
的定向查询:
=> GET DIRECTED QUERY SELECT employee_first_name, employee_last_name from employee_dimension where employee_city='Boston' AND job_title='Cashier' order by employee_last_name, employee_first_name;
-[ RECORD 1 ]---+
query_name | findEmployeesCityJobTitle_OPT
is_active | t
vertica_version | Vertica Analytic Database v11.0.1-20210815
comment | Optimizer-generated directed query
creation_date | 2021-08-20 14:53:42.323963
annotated_query | SELECT /*+verbatim*/ employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension employee_dimension/*+projs('public.employee_dimension')*/ WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)) ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
DIRECTED_QUERIES
可以直接查询系统表 DIRECTED_QUERIES。例如:
=> SELECT query_name, is_active FROM V_CATALOG.DIRECTED_QUERIES WHERE query_name ILIKE '%findEmployeesCityJobTitle%';
query_name | is_active
-------------------------------+-----------
findEmployeesCityJobTitle_OPT | t
(1 row)
当心
字段 INPUT_QUERY 和 ANNOTATED_QUERY 的查询结果在 ~32K 个字符之后会被截断。可以使用两种方法获取这两个字段的完整内容:
5.2 - 识别处于活动状态的定向查询
多个定向查询可以映射到同一输入查询。系统表 DIRECTED_QUERIES 中的 is_active
列标识处于活动状态的定向查询。如果多个定向查询对于同一输入查询均处于活动状态,优化器则使用要创建的第一个定向查询。在这种情况下,可使用 EXPLAIN 识别哪个定向查询处于活动状态。
提示
对于给定输入查询,一次仅激活一个定向查询是非常好的做法。
DIRECTED_QUERIES 上的以下查询查找所有处于活动状态的定向查询,且其中的输入查询包含被查询表 employee_dimension
的名称:
=> SELECT * FROM directed_queries WHERE input_query ILIKE ('%employee_dimension%') AND is_active='t';
-[ RECORD 1 ]------+
query_name | findEmployeesCityJobTitle_OPT
is_active | t
vertica_version | Vertica Analytic Database v11.0.1-20210815
comment | Optimizer-generated directed query
save_plans_version | 0
username | dbadmin
creation_date | 2021-08-20 14:53:42.323963
since_date |
input_query | SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)) ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
annotated_query | SELECT /*+verbatim*/ employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension employee_dimension/*+projs('public.employee_dimension')*/ WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)) ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
如果对输入查询运行 EXPLAIN,它将返回一个确认使用 findEmployeesCityJobTitle_OPT
作为活动定向查询的查询计划:
=> EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension WHERE employee_city='Boston' AND job_title ='Cashier' ORDER BY employee_last_name, employee_first_name;
The following active directed query(query name: findEmployeesCityJobTitle_OPT) is being executed:
SELECT /*+verbatim*/ employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension employee_dimension/*+projs('public.employee_dimension')*/ WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6)) AND (employee_dimension.job_title = 'Cashier'::varchar(7))) ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
5.3 - 激活和停用定向查询
优化器仅使用处于活动状态的定向查询。如果多个定向查询共享同一输入查询,优化器将使用要创建的第一个查询。
定向查询分别使用 ACTIVATE DIRECTED QUERY 和 DEACTIVATE DIRECTED QUERY 来激活和停用。例如,以下 ACTIVATE DIRECTED QUERY 语句停用 RegionalSalesProducts_JoinTables
并激活 RegionalSalesProduct
:
=> DEACTIVATE DIRECTED QUERY RegionalSalesProducts_JoinTables;
DEACTIVATE DIRECTED QUERY;
=> ACTIVATE DIRECTED QUERY RegionalSalesProduct;
ACTIVATE DIRECTED QUERY;
Vertica 为所有会话中的给定查询使用处于活动状态的定向查询,除非它被 DEACTIVATE DIRECTED QUERY 明确停用或被 DROP DIRECTED QUERY 从存储空间中移除。如果在数据库关闭时定向查询处于活动状态,当您重新启动数据库时,Vertica 会自动将其重新激活。
定向查询停用后,查询优化器通过使用另一个定向查询(如果存在这样的定向查询)处理输入查询的后续调用。否则,它会生成自己的查询计划。
5.4 - 从编录中导出定向查询
提示
也可以将查询计划作为定向查询导出到外部 SQL 文件。请参阅
批量导出查询计划。
升级到新版本 Vertica 之前,您可以导出定向查询,从而将其用于实现最佳性能对系统而言至关重要的查询。
-
配合使用 EXPORT_CATALOG 和实参 DIRECTED_QUERIES
从数据库编录导出所有当前定向查询及其当前激活状态:
=> SELECT EXPORT_CATALOG('../../export_directedqueries', 'DIRECTED_QUERIES');
EXPORT_CATALOG
-------------------------------------
Catalog data exported successfully
-
EXPORT_CATALOG 创建一个脚本以重新创建定向查询,如以下示例所示:
SAVE QUERY SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)) ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name;
CREATE DIRECTED QUERY CUSTOM findEmployeesCityJobTitle_OPT COMMENT 'Optimizer-generated directed query' OPTVER 'Vertica Analytic Database v11.0.1-20210815' PSDATE '2021-08-20 14:53:42.323963' SELECT /*+verbatim*/ employee_dimension.employee_first_name, employee_dimension.employee_last_name
FROM public.employee_dimension employee_dimension/*+projs('public.employee_dimension')*/
WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/))
ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name;
ACTIVATE DIRECTED QUERY findEmployeesCityJobTitle_OPT;
注意
EXPORT_CATALOG
创建的脚本指定使用 CREATE DIRECTED QUERY CUSTOM 重新创建所有定向查询,而不管其最初是如何创建的。
-
升级完成后,使用 DROP DIRECTED QUERY 从数据库编录中移除每个定向查询。或者,编辑导出脚本并在每个 CREATE DIRECTED QUERY 语句之前插入 DROP DIRECTED QUERY 语句。例如,您可以使用粗体显示的更改来修改先前生成的脚本:
SAVE QUERY SELECT employee_dimension.employee_first_name, ... ;
DROP DIRECTED QUERY findEmployeesCityJobTitle_OPT;
CREATE DIRECTED QUERY CUSTOM findEmployeesCityJobTitle_OPT COMMENT 'Optimizer-generated ... ;
ACTIVATE DIRECTED QUERY findEmployeesCityJobTitle_OPT;
-
当您运行此脚本时,Vertica 将重新创建定向查询并还原其激活状态:
=> \i /home/dbadmin/export_directedqueries
SAVE QUERY
DROP DIRECTED QUERY
CREATE DIRECTED QUERY
ACTIVATE DIRECTED QUERY
5.5 - 删除定向查询
DROP DIRECTED QUERY 从数据库编录中移除指定的定向查询。如果该定向查询处于活动状态,Vertica 会在移除前将其停用。
例如:
=> DROP DIRECTED QUERY findBostonCashiers_CUSTOM;
DROP DIRECTED QUERY
6 - 批量导出查询计划
在升级到新 Vertica 版本之前,您可能希望使用定向查询来保存查询计划,以便可能在新数据库中重新使用。您无法预测哪些查询计划可能会重新使用,因此可能要保存很多(或全部)数据库查询的查询计划。但是,每天运行的查询有数百个。通过重复调用 CREATE DIRECTED QUERY 将每个查询计划保存到数据库编录是不切实际的。此外,这样做会显著增加编录大小,可能还会影响性能。
在此情况下,可绕过数据库编录并将查询计划作为定向查询批量导出到外部 SQL 文件。通过为查询计划存储卸载,可以保存当前数据库中任意数量的查询计划,而不影响编录大小和性能。升级之后,您可以决定在新数据库中保留哪些查询计划,并有选择地导入相应的定向查询。
Vertica 提供一组支持此方法的元函数:
6.1 - 导出定向查询
可以将任意数量的查询计划作为定向查询批量导出到外部 SQL 文件中,如下所示:
-
创建一个 SQL 文件,其中包含要保存其查询计划的输入查询。请参阅下文的输入格式。
-
对该 SQL 文件调用元函数 EXPORT_DIRECTED_QUERIES。此元函数具有两个实参:
例如,以下 EXPORT_DIRECTED_QUERIES 语句指定了输入文件 inputQueries
和输出文件 outputQueries
:
=> SELECT EXPORT_DIRECTED_QUERIES('/home/dbadmin/inputQueries','/home/dbadmin/outputQueries');
EXPORT_DIRECTED_QUERIES
---------------------------------------------------------------------------------------------
1 queries successfully exported.
Queries exported to /home/dbadmin/outputQueries.
(1 row)
输入文件
您提供给 EXPORT_DIRECTED_QUERIES 的输入文件包含一个或多个输入查询。对于每个输入查询,您可以选择指定两个字段,这两个字段将用于生成的定向查询中:
您可以按如下方式设置每个输入查询的格式:
--DirQueryName=query-name
--DirQueryComment='comment'
input-query
例如,一个文件可以指定一个输入查询,如下所示:
/* Query: findEmployeesCityJobTitle_OPT */
/* Comment: This query finds all employees of a given city and job title, ordered by employee name */
SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name, employee_dimension.job_title FROM public.employee_dimension WHERE (employee_dimension.employee_city = 'Boston'::varchar(6)) ORDER BY employee_dimension.job_title;
输出文件
EXPORT_DIRECTED_QUERIES
会将生成用于创建定向查询的 SQL,并将此 SQL 写入到指定文件或标准输出中。对于这两种情况,输出都遵循以下格式:
/* Query: directed-query-name */
/* Comment: directed-query-comment */
SAVE QUERY input-query;
CREATE DIRECTED QUERY CUSTOM 'directed-query-name'
COMMENT 'directed-query-comment'
OPTVER 'vertica-release-num'
PSDATE 'timestamp'
annotated-query
例如,给定前面的输入时,Vertica 会将以下输出写入 /home/dbadmin/outputQueries
:
/* Query: findEmployeesCityJobTitle_OPT */
/* Comment: This query finds all employees of a given city and job title, ordered by employee name */
SAVE QUERY SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)) ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name;
CREATE DIRECTED QUERY CUSTOM 'findEmployeesCityJobTitle_OPT'
COMMENT 'This query finds all employees of a given city and job title, ordered by employee name'
OPTVER 'Vertica Analytic Database v11.1.0-20220102'
PSDATE '2022-01-06 13:45:17.430254'
SELECT /*+verbatim*/ employee_dimension.employee_first_name AS employee_first_name, employee_dimension.employee_last_name AS employee_last_name
FROM public.employee_dimension AS employee_dimension/*+projs('public.employee_dimension')*/
WHERE (employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)
ORDER BY 2 ASC, 1 ASC;
如果给定的输入查询省略了 DirQueryName
和 DirQueryComment
字段,则 EXPORT_DIRECTED_QUERIES 会自动生成以下输出:
例如,以下输入文件包含一个 SELECT
语句,并且省略 DirQueryName
和 DirQueryComment
字段:
SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name
FROM public.employee_dimension WHERE (employee_dimension.employee_city = 'Boston'::varchar(6))
ORDER BY employee_dimension.job_title;
给定这个文件,EXPORT_DIRECTED_QUERIES 会返回一个关于缺少输入字段的警告,它还会将该警告写入错误文件:
> SELECT EXPORT_DIRECTED_QUERIES('/home/dbadmin/inputQueries2','/home/dbadmin/outputQueries3');
EXPORT_DIRECTED_QUERIES
--------------------------------------------------------------------------------------------------------------
1 queries successfully exported.
1 warning message was generated.
Queries exported to /home/dbadmin/outputQueries3.
See error report, /home/dbadmin/outputQueries3.err for details.
(1 row)
输出文件包含以下内容:
/* Query: Autoname:2022-01-06 14:11:23.071559.0 */
/* Comment: Optimizer-generated directed query */
SAVE QUERY SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension WHERE (employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) ORDER BY employee_dimension.job_title;
CREATE DIRECTED QUERY CUSTOM 'Autoname:2022-01-06 14:11:23.071559.0'
COMMENT 'Optimizer-generated directed query'
OPTVER 'Vertica Analytic Database v11.1.0-20220102'
PSDATE '2022-01-06 14:11:23.071559'
SELECT /*+verbatim*/ employee_dimension.employee_first_name AS employee_first_name, employee_dimension.employee_last_name AS employee_last_name
FROM public.employee_dimension AS employee_dimension/*+projs('public.employee_dimension')*/
WHERE (employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/)
ORDER BY employee_dimension.job_title ASC;
错误文件
如果在执行 EXPORT_DIRECTED_QUERIES 期间发生任何错误或警告,它会返回一条类似于以下内容的消息:
1 queries successfully exported.
1 warning message was generated.
Queries exported to /home/dbadmin/outputQueries.
See error report, /home/dbadmin/outputQueries.err for details.
EXPORT_DIRECTED_QUERIES 会将所有错误或警告都写入到一个在与输出文件相同的路径下所创建的文件中,并使用输出文件的基名。
在上一示例中,输出文件名为 /home/dbadmin/outputQueries
,因此 EXPORT_DIRECTED_QUERIES 会将错误写入 /home/dbadmin/outputQueries.err
。
错误文件可以捕获许多错误和警告,例如,可以捕获 EXPORT_DIRECTED_QUERIES 无法创建定向查询的所有实例。在以下示例中,错误文件包含一个警告,指示没有为指定的输入查询提供名称字段,并且还记录了自动为该查询生成的名称:
----------------------------------------------------------------------------------------------------
WARNING: Name field not supplied. Using auto-generated name: 'Autoname:2016-10-13 09:44:33.527548.0'
Input Query: SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name, employee_dimension.job_title FROM public.employee_dimension WHERE (employee_dimension.employee_city = 'Boston'::varchar(6)) ORDER BY employee_dimension.job_title;
END WARNING
6.2 - 导入定向查询
确定您希望在当前数据库中使用的已导出查询计划后,使用 IMPORT_DIRECTED_QUERIES 将其导入。您为此函数提供您使用 EXPORT_DIRECTED_QUERIES 创建的导出文件的名称以及您希望导入的定向查询的名称。例如:
=> SELECT IMPORT_DIRECTED_QUERIES('/home/dbadmin/outputQueries','FindEmployeesBoston');
IMPORT_DIRECTED_QUERIES
------------------------------------------------------------------------------------------
1 directed queries successfully imported.
To activate a query named 'my_query1':
=> ACTIVATE DIRECTED QUERY 'my_query1';
(1 row)
导入所需的定向查询后,必须使用 ACTIVATE DIRECTED QUERY 将其激活,然后才能使用它们创建查询计划。
7 - 半联接和交叉联接语义
Vertica 优化器在定向查询中使用多个关键字来重新创建交叉联接和半联接子查询。它还支持使用一组额外的关键字来表示复杂交叉联接和半联接。您还可以在直接从 vsql 中执行的查询中使用这些关键字。
当心
这些关键字不符合标准 SQL;它们只能由 Vertica 优化器使用。
有关详细信息,请参阅以下主题:
7.1 - 半联接子查询语义
Vertica 优化器在定向查询中使用多个关键字来重新创建与特定搜索运算符的半联接子查询,例如
ANY
或
NOT IN
:
-
[SEMI JOIN](#SEMI)
-
[NULLAWARE ANTI JOIN](#NULLAWARE)
-
[SEMIALL JOIN](#SEMIALL)
-
[ANTI JOIN](#ANTI)
SEMI JOIN
重新创建用来执行 semi-join 的查询,其中包含以
IN
、
EXIST
或
ANY
运算符开头的子查询。
输入查询
SELECT product_description FROM product_dimension
WHERE product_dimension.product_key IN (SELECT qty_in_stock from inventory_fact);
查询计划
QUERY PLAN DESCRIPTION:
------------------------------
explain SELECT product_description FROM product_dimension WHERE product_dimension.product_key IN (SELECT qty_in_stock from inventory_fact);
Access Path:
+-JOIN HASH [Semi] [Cost: 1K, Rows: 30K] (PATH ID: 1) Outer (FILTER) Inner (RESEGMENT)
| Join Cond: (product_dimension.product_key = VAL(2))
| Materialize at Output: product_dimension.product_description
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension
| | Materialize: product_dimension.product_key
| | Execute on: All Nodes
| | Runtime Filter: (SIP1(HashJoin): product_dimension.product_key)
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_b0
| | | Materialize: inventory_fact.qty_in_stock
| | | Execute on: All Nodes
优化器生成的带注释查询
SELECT /*+ syntactic_join */ product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
SEMI JOIN /*+Distrib(F,R),JType(H)*/ (SELECT inventory_fact.qty_in_stock AS qty_in_stock
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (product_dimension.product_key = subQ_1.qty_in_stock))
NULLAWARE ANTI JOIN
重新创建用来执行 null-aware anti-join 的查询,其中包含以
NOT IN
或 !=ALL
运算符开头的子查询。
输入查询
SELECT product_description FROM product_dimension
WHERE product_dimension.product_key NOT IN (SELECT qty_in_stock from inventory_fact);
查询计划
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT product_description FROM product_dimension WHERE product_dimension.product_key not IN (SELECT qty_in_sto
ck from inventory_fact);
Access Path:
+-JOIN HASH [Anti][NotInAnti] [Cost: 7K, Rows: 30K] (PATH ID: 1) Inner (BROADCAST)
| Join Cond: (product_dimension.product_key = VAL(2))
| Materialize at Output: product_dimension.product_description
| Execute on: Query Initiator
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension_DBD_2_rep_VMartDesign
| | Materialize: product_dimension.product_key
| | Execute on: Query Initiator
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_DBD_9_seg_VMartDesign_b0
| | | Materialize: inventory_fact.qty_in_stock
| | | Execute on: All Nodes
优化器生成的带注释查询
SELECT /*+ syntactic_join */ product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
NULLAWARE ANTI JOIN /*+Distrib(L,B),JType(H)*/ (SELECT inventory_fact.qty_in_stock AS qty_in_stock
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (product_dimension.product_key = subQ_1.qty_in_stock))
SEMIALL JOIN
重新创建用来执行 semi-all join 的查询,其中包含以
ALL
运算符开头的子查询。
输入查询
SELECT product_key, product_description FROM product_dimension
WHERE product_dimension.product_key > ALL (SELECT product_key from inventory_fact);
查询计划
QUERY PLAN DESCRIPTION:
------------------------------
explain SELECT product_key, product_description FROM product_dimension WHERE product_dimension.product_key > ALL (SELECT product_key from inventory_fact);
Access Path:
+-JOIN HASH [Semi][All] [Cost: 7M, Rows: 30K] (PATH ID: 1) Outer (FILTER) Inner (BROADCAST)
| Join Filter: (product_dimension.product_key > VAL(2))
| Materialize at Output: product_dimension.product_description
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension
| | Materialize: product_dimension.product_key
| | Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_b0
| | | Materialize: inventory_fact.product_key
| | | Execute on: All Nodes
优化器生成的带注释查询
SELECT /*+ syntactic_join */ product_dimension.product_key AS product_key, product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
SEMIALL JOIN /*+Distrib(F,B),JType(H)*/ (SELECT inventory_fact.product_key AS product_key FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (product_dimension.product_key > subQ_1.product_key))
ANTI JOIN
重新创建用来执行 anti-join 的查询,其中包含以
NOT EXISTS
运算符开头的子查询。
输入查询
SELECT product_key, product_description FROM product_dimension
WHERE NOT EXISTS (SELECT inventory_fact.product_key FROM inventory_fact
WHERE inventory_fact.product_key=product_dimension.product_key);
查询计划
QUERY PLAN DESCRIPTION:
------------------------------
explain SELECT product_key, product_description FROM product_dimension WHERE NOT EXISTS (SELECT inventory_fact.product_
key FROM inventory_fact WHERE inventory_fact.product_key=product_dimension.product_key);
Access Path:
+-JOIN HASH [Anti] [Cost: 703, Rows: 30K] (PATH ID: 1) Outer (FILTER)
| Join Cond: (VAL(1) = product_dimension.product_key)
| Materialize at Output: product_dimension.product_description
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 152, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension_DBD_2_rep_VMartDesign
| | Materialize: product_dimension.product_key
| | Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_DBD_9_seg_VMartDesign_b0
| | | Materialize: inventory_fact.product_key
| | | Execute on: All Nodes
优化器生成的带注释查询
SELECT /*+ syntactic_join */ product_dimension.product_key AS product_key, product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
ANTI JOIN /*+Distrib(F,L),JType(H)*/ (SELECT inventory_fact.product_key AS "inventory_fact.product_key"
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (subQ_1."inventory_fact.product_key" = product_dimension.product_key))
7.2 - 复杂联接语义
优化器使用一组关键字来表达复杂交叉联接和半联接。所有复杂联接都由关键字 COMPLEX
指示,该关键字插入在关键字 JOIN
之前,例如 CROSS COMPLEX JOIN
。复杂半联接的语义有一个额外的要求,在下面有详细说明。
复杂交叉联接
Vertica 使用关键字短语 CROSS COMPLEX JOIN
来描述所有的复杂交叉联接。例如:
输入查询
SELECT
(SELECT max(sales_quantity) FROM store.store_sales_fact) *
(SELECT max(sales_quantity) FROM online_sales.online_sales_fact);
查询计划
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT
(SELECT max(sales_quantity) FROM store.store_sales_fact) *
(SELECT max(sales_quantity) FROM online_sales.online_sales_fact);
Access Path:
+-JOIN (CROSS JOIN) [Cost: 4K, Rows: 1 (NO STATISTICS)] (PATH ID: 1)
| Execute on: Query Initiator
| +-- Outer -> JOIN (CROSS JOIN) [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 2)
| | Execute on: Query Initiator
| | +-- Outer -> STORAGE ACCESS for dual [Cost: 10, Rows: 1] (PATH ID: 3)
| | | Projection: v_catalog.dual_p
| | | Materialize: dual.dummy
| | | Execute on: Query Initiator
| | +-- Inner -> SELECT [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 4)
| | | Execute on: Query Initiator
| | | +---> GROUPBY NOTHING [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 5)
| | | | Aggregates: max(store_sales_fact.sales_quantity)
| | | | Execute on: All Nodes
| | | | +---> STORAGE ACCESS for store_sales_fact [Cost: 1K, Rows: 5M (NO STATISTICS)] (PATH ID: 6)
| | | | | Projection: store.store_sales_fact_super
| | | | | Materialize: store_sales_fact.sales_quantity
| | | | | Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 7)
| | Execute on: Query Initiator
| | +---> GROUPBY NOTHING [Cost: 2K, Rows: 1 (NO STATISTICS)] (PATH ID: 8)
| | | Aggregates: max(online_sales_fact.sales_quantity)
| | | Execute on: All Nodes
| | | +---> STORAGE ACCESS for online_sales_fact [Cost: 1K, Rows: 5M (NO STATISTICS)] (PATH ID: 9)
| | | | Projection: online_sales.online_sales_fact_super
| | | | Materialize: online_sales_fact.sales_quantity
| | | | Execute on: All Nodes
优化器生成的带注释查询
以下带注释的查询返回与前面显示的输入查询相同的结果。与优化器生成的所有带注释查询一样,您可以直接在 vsql 中执行此查询(无论是按照编写还是修改后):
SELECT /*+syntactic_join,verbatim*/ (subQ_1.max * subQ_2.max) AS "?column?"
FROM ((v_catalog.dual AS dual CROSS COMPLEX JOIN /*+Distrib(L,L),JType(H)*/
(SELECT max(store_sales_fact.sales_quantity) AS max
FROM store.store_sales_fact AS store_sales_fact/*+projs('store.store_sales_fact')*/) AS subQ_1 )
CROSS COMPLEX JOIN /*+Distrib(L,L),JType(H)*/ (SELECT max(online_sales_fact.sales_quantity) AS max
FROM online_sales.online_sales_fact AS online_sales_fact/*+projs('online_sales.online_sales_fact')*/) AS subQ_2 )
复杂半联接
复杂半联接由以下关键字之一表达:
额外要求适用于所有复杂半联接:每个子查询的 SELECT
列表都以调用 Vertica 元函数 complex_join_marker()
的虚拟列(标记为 false
)结尾。当子查询处理每一行时,complex_join_marker()
返回 true
或 false
以指示结果集中包含或排除该行。结果集与这个标志一起返回到外部查询,外部查询可以使用来自这个和其他子查询中的标志来筛选它自己的结果集。
例如,查询优化器将以下输入查询重写为 NULLAWARE ANTI COMPLEX JOIN
。该联接返回子查询中其 complex_join_marker()
标志设置为适当布尔值的所有行。
输入查询
SELECT product_dimension.product_description FROM public.product_dimension
WHERE (NOT (product_dimension.product_key NOT IN (SELECT inventory_fact.qty_in_stock FROM public.inventory_fact)));
查询计划
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT product_dimension.product_description FROM public.product_dimension
WHERE (NOT (product_dimension.product_key NOT IN (SELECT inventory_fact.qty_in_stock FROM public.inventory_fact)));
Access Path:
+-JOIN HASH [Anti][NotInAnti] [Cost: 3K, Rows: 30K] (PATH ID: 1) Inner (BROADCAST)
| Join Cond: (product_dimension.product_key = VAL(2))
| Materialize at Output: product_dimension.product_description
| Filter: (NOT VAL(2))
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for product_dimension [Cost: 56, Rows: 60K] (PATH ID: 2)
| | Projection: public.product_dimension_super
| | Materialize: product_dimension.product_key
| | Execute on: All Nodes
| +-- Inner -> SELECT [Cost: 248, Rows: 300K] (PATH ID: 3)
| | Execute on: All Nodes
| | +---> STORAGE ACCESS for inventory_fact [Cost: 248, Rows: 300K] (PATH ID: 4)
| | | Projection: public.inventory_fact_super
| | | Materialize: inventory_fact.qty_in_stock
| | | Execute on: All Nodes
优化器生成的带注释查询
以下带注释的查询返回与前面显示的输入查询相同的结果。与优化器生成的所有带注释查询一样,您可以直接在 vsql 中执行此查询(无论是按照编写还是修改后)。例如,您可以通过修改外部查询的谓词对 subQ_1."false"
标志的求值方式来控制外部查询的输出。
SELECT /*+syntactic_join,verbatim*/ product_dimension.product_description AS product_description
FROM (public.product_dimension AS product_dimension/*+projs('public.product_dimension')*/
NULLAWARE ANTI COMPLEX JOIN /*+Distrib(L,B),JType(H)*/
(SELECT inventory_fact.qty_in_stock AS qty_in_stock, complex_join_marker() AS "false"
FROM public.inventory_fact AS inventory_fact/*+projs('public.inventory_fact')*/) AS subQ_1
ON (product_dimension.product_key = subQ_1.qty_in_stock)) WHERE (NOT subQ_1."false")
8 - 限制
定向查询支持各种查询,但也有一些例外。Vertica 通过优化器生成的警告处理所有例外。以下部分将这些限制分为几类。
表和投影
存在以下限制:
函数
不支持包括以下函数的查询:
运算符和子句
不支持包括以下内容的查询:
数据类型
不支持包括以下数据类型的查询: