这是本节的多页打印视图。
点击此处打印.
返回本页常规视图.
创建定向查询
CREATE DIRECTED QUERY 将输入查询与使用优化器提示进行注释的查询相关联。它将此关联存储在一个唯一标识符下。CREATE DIRECTED QUERY 有两种变体:
在这两种情形中,Vertica 都会将带注释的查询与输入查询相关联,并将其关联注册在 query_name
下的系统表 DIRECTED_QUERIES 中。
这两种方法可以一起使用:您可以使用优化器创建的带注释 SQL 作为创建自己的(自定义)定向查询的基础。
1 - 优化器生成的定向查询
CREATE DIRECTED QUERY OPTIMIZER 将输入查询传递给优化器,优化器从自己的查询计划生成带注释的查询。然后,它将输入查询和带注释的查询配对并将其保存为定向查询。此定向查询可用于处理其他除了作为查询结果筛选依据的谓词字符串以外完全相同的查询。
可在升级之前使用优化器生成的定向查询来捕获查询计划。如果在升级后检测到给定查询的性能下降,这样做将会非常有用。在这种情况下,可使用相应的定向查询重新创建早期查询计划,并将其性能与当前优化器生成的计划相对比。
示例
以下 SQL 语句创建和激活定向查询 findEmployeesCityJobTitle_OPT
:
=> CREATE DIRECTED QUERY OPTIMIZER 'findEmployeesCityJobTitle_OPT'
SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city='Boston' and job_title='Cashier' ORDER BY employee_last_name, employee_first_name;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY findEmployeesCityJobTitle_OPT;
ACTIVATE DIRECTED QUERY
在激活此定向查询计划后,优化器会使用它为此输入查询的所有后续调用以及其他类似调用生成查询计划。可通过调用 GET DIRECTED QUERY 或查询系统表 DIRECTED_QUERIES 来查看优化器生成的带注释查询:
=> SELECT input_query, annotated_query FROM V_CATALOG.DIRECTED_QUERIES WHERE query_name = 'findEmployeesCityJobTitle_OPT';
-[ RECORD 1 ]---+----------------------------------------------------------------------------
input_query | SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension
WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/))
ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
annotated_query | SELECT /*+verbatim*/ employee_dimension.employee_first_name AS employee_first_name, employee_dimension.employee_last_name AS employee_last_name FROM public.employee_dimension AS employee_dimension/*+projs('public.employee_dimension')*/
WHERE (employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)
ORDER BY 2 ASC, 1 ASC
带注释的查询包括以下提示:
以下查询对 employee_city
和 job_title
列使用不同的值,但在其他方面与定向查询 EmployeesCityJobTitle_OPT
的原始输入查询相同:
=> SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city = 'San Francisco' and job_title = 'Branch Manager' ORDER BY employee_last_name, employee_first_name;
如果定向查询 EmployeesCityJobTitle_OPT
处于活动状态,则优化器可以将其用于此查询:
=> EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension WHERE employee_city='San Francisco' AND job_title='Branch Manager' ORDER BY employee_last_name, employee_first_name;
...
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension WHERE employee_city='San Francisco' AND job_title='Branch Manager' ORDER BY employee_last_name, employee_first_name;
The following active directed query(query name: findEmployeesCityJobTitle_OPT) is being executed:
SELECT /*+verbatim*/ employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension employee_dimension/*+projs('public.employee_dimension')*/
WHERE ((employee_dimension.employee_city = 'San Francisco'::varchar(13)) AND (employee_dimension.job_title = 'Branch Manager'::varchar(14)))
ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
Access Path:
+-SORT [Cost: 222, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Order: employee_dimension.employee_last_name ASC, employee_dimension.employee_first_name ASC
| Execute on: All Nodes
| +---> STORAGE ACCESS for employee_dimension [Cost: 60, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| | Projection: public.employee_dimension_super
| | Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name
| | Filter: (employee_dimension.employee_city = 'San Francisco')
| | Filter: (employee_dimension.job_title = 'Branch Manager')
| | Execute on: All Nodes
...
2 - 自定义定向查询
CREATE DIRECTED QUERY CUSTOM 指定带注释的查询并将其与 SAVE QUERY 先前保存的输入查询配对。必须在同一用户会话中发出这两个语句。
例如,您可能希望某查询使用特定投影:
-
使用 SAVE QUERY 指定查询:
=> SAVE QUERY SELECT employee_first_name, employee_last_name FROM employee_dimension
WHERE employee_city='Boston' AND job_title='Cashier';
SAVE QUERY
注意
您提供给 SAVE QUERY 的输入查询仅支持
:v
提示。
-
使用 CREATE DIRECTED QUERY CUSTOM 创建自定义定向查询,它指定带注释的查询并将其与保存的查询相关联。带注释的查询包含一个 /*+projs*/
提示,它指示优化器在用户调用保存的查询时使用投影 public.emp_dimension_unseg
:
=> CREATE DIRECTED QUERY CUSTOM 'findBostonCashiers_CUSTOM'
SELECT employee_first_name, employee_last_name
FROM employee_dimension /*+Projs('public.emp_dimension_unseg')*/
WHERE employee_city='Boston' AND job_title='Cashier';
CREATE DIRECTED QUERY
当心
Vertica 将保存的查询与带注释的查询相关联,而不检查输入查询和带注释的查询是否兼容。请小心安排 SAVE QUERY 和 CREATE DIRECTED QUERY CUSTOM 语句的顺序,以使保存的查询和定向查询正确匹配。
-
激活此定向查询:
=> ACTIVATE DIRECTED QUERY findBostonCashiers_CUSTOM;
ACTIVATE DIRECTED QUERY
-
激活后,优化器将使用此定向查询为其输入查询的所有后续调用生成查询计划。下面的 EXPLAIN 输出验证优化器使用此定向查询的情况及其指定的投影:
=> EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension
WHERE employee_city='Boston' AND job_title='Cashier';
QUERY PLAN
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension where employee_city='Boston' AND job_title='Cashier';
The following active directed query(query name: findBostonCashiers_CUSTOM) is being executed:
SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name
FROM public.employee_dimension/*+Projs('public.emp_dimension_unseg')*/
WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6)) AND (employee_dimension.job_title = 'Cashier'::varchar(7)))
Access Path:
+-STORAGE ACCESS for employee_dimension [Cost: 158, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Projection: public.emp_dimension_unseg
| Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name
| Filter: (employee_dimension.employee_city = 'Boston')
| Filter: (employee_dimension.job_title = 'Cashier')
| Execute on: Query Initiator
另请参阅
重写联接查询
3 - 结合使用优化器生成的定向查询和自定义定向查询
可以根据优化器所创建的带注释 SQL 来创建自己的自定义定向查询。当评估优化器创建用于处理给定查询的计划以及测试计划修改内容时,此方法特别有用。
例如,您可能希望修改优化器对以下查询的实施方式:
=> SELECT COUNT(customer_name) Total, customer_region Region
FROM (store_sales s JOIN customer_dimension c ON c.customer_key = s.customer_key)
JOIN product_dimension p ON s.product_key = p.product_key
WHERE p.category_description ilike '%Medical%'
AND p.product_description ilike '%antibiotics%'
AND c.customer_age <= 30 AND YEAR(s.sales_date)=2017
GROUP BY customer_region;
对此查询运行 EXPLAIN 时,您会发现优化器将投影 customers_proj_age
用于 customer_dimension
表。此投影按列 customer_age
进行排序。因此,优化器将根据 customer_key
对表 store_sales
和 customer_dimension
进行哈希联接。
分析 customer_dimension
表数据后,您观察到大部分客户的年龄不到 30 岁,因此对按 customer_key
排序的 customer_dimension
表使用投影 customer_proj_id
更有意义:
您可以创建一个用于封装此更改的定向查询,如下所示:
-
使用 EXPLAIN ANNOTATED 获取由优化器生成的查询注释:
=> \o annotatedQuery
=> EXPLAIN ANNOTATED SELECT COUNT(customer_name) Total, customer_region Region
FROM (store_sales s JOIN customer_dimension c ON c.customer_key = s.customer_key)
JOIN product_dimension p ON s.product_key = p.product_key
WHERE p.category_description ilike '%Medical%'
AND p.product_description ilike '%antibiotics%'
AND c.customer_age <= 30 AND YEAR(s.sales_date)=2017
GROUP BY customer_region;
=> \o
=> \! cat annotatedQuery
...
SELECT /*+syntactic_join,verbatim*/ count(c.customer_name) AS Total, c.customer_region AS Region
FROM ((public.store_sales AS s/*+projs('public.store_sales_super')*/
JOIN /*+Distrib(L,B),JType(H)*/ public.customer_dimension AS c/*+projs('public.customers_proj_age')*/
ON (c.customer_key = s.customer_key))
JOIN /*+Distrib(L,B),JType(M)*/ public.product_dimension AS p/*+projs('public.product_dimension')*/
ON (s.product_key = p.product_key))
WHERE ((date_part('year'::varchar(4), (s.sales_date)::timestamp(0)))::int = 2017)
AND (c.customer_age <= 30)
AND ((p.category_description)::varchar(32) ~~* '%Medical%'::varchar(9))
AND (p.product_description ~~* '%antibiotics%'::varchar(13))
GROUP BY /*+GByType(Hash)*/ 2
(4 rows)
-
修改带注释的查询:
SELECT /*+syntactic_join,verbatim*/ count(c.customer_name) AS Total, c.customer_region AS Region
FROM ((public.store_sales AS s/*+projs('public.store_sales_super')*/
JOIN /*+Distrib(L,B),JType(H)*/ public.customer_dimension AS c/*+projs('public.customer_proj_id')*/
ON (c.customer_key = s.customer_key))
JOIN /*+Distrib(L,B),JType(H)*/ public.product_dimension AS p/*+projs('public.product_dimension')*/
ON (s.product_key = p.product_key))
WHERE ((date_part('year'::varchar(4), (s.sales_date)::timestamp(0)))::int = 2017)
AND (c.customer_age <= 30)
AND ((p.category_description)::varchar(32) ~~* '%Medical%'::varchar(9))
AND (p.product_description ~~* '%antibiotics%'::varchar(13))
GROUP BY /*+GByType(Hash)*/ 2
-
使用修改后的带注释查询来创建所需的定向查询:
-
使用 SAVE QUERY 保存所需的输入查询:
=> SAVE QUERY SELECT COUNT(customer_name) Total, customer_region Region
FROM (store_sales s JOIN customer_dimension c ON c.customer_key = s.customer_key)
JOIN product_dimension p ON s.product_key = p.product_key
WHERE p.category_description ilike '%Medical%'
AND p.product_description ilike '%antibiotics%'
AND c.customer_age <= 30 AND YEAR(s.sales_date)=2017
GROUP BY customer_region;
-
创建一个自定义定向查询,将保存的输入查询与修改后的带注释查询相关联:
=> CREATE DIRECTED QUERY CUSTOM 'getCustomersUnder31'
SELECT /*+syntactic_join,verbatim*/ count(c.customer_name) AS Total, c.customer_region AS Region
FROM ((public.store_sales AS s/*+projs('public.store_sales_super')*/
JOIN /*+Distrib(L,B),JType(H)*/ public.customer_dimension AS c/*+projs('public.customer_proj_id')*/
ON (c.customer_key = s.customer_key))
JOIN /*+Distrib(L,B),JType(H)*/ public.product_dimension AS p/*+projs('public.product_dimension')*/
ON (s.product_key = p.product_key))
WHERE ((date_part('year'::varchar(4), (s.sales_date)::timestamp(0)))::int = 2017)
AND (c.customer_age <= 30)
AND ((p.category_description)::varchar(32) ~~* '%Medical%'::varchar(9))
AND (p.product_description ~~* '%antibiotics%'::varchar(13))
GROUP BY /*+GByType(Hash)*/ 2;
CREATE DIRECTED QUERY
-
激活此定向查询:
=> ACTIVATE DIRECTED QUERY getCustomersUnder31;
ACTIVATE DIRECTED QUERY
当优化器处理与此定向查询的输入查询相匹配的查询时,它会使用此定向查询的带注释查询来生成查询计划:
=> EXPLAIN SELECT COUNT(customer_name) Total, customer_region Region
FROM (store_sales s JOIN customer_dimension c ON c.customer_key = s.customer_key)
JOIN product_dimension p ON s.product_key = p.product_key
WHERE p.category_description ilike '%Medical%'
AND p.product_description ilike '%antibiotics%'
AND c.customer_age <= 30 AND YEAR(s.sales_date)=2017
GROUP BY customer_region;
The following active directed query(query name: getCustomersUnder31) is being executed:
...