Ignoring constants in directed queries
Optimizer-generated directed queries generally include one or more
:v
(alias of IGNORECONSTANT
) hints, which mark predicate string constants that you want the optimizer to ignore when it decides whether to use a directed query for a given input query. :v
hints enable multiple queries to use the same directed query, provided the queries are identical in all respects except their predicate strings.
For example, the following two queries are identical , except for the string constants Boston|San Francisco
and Cashier|Branch Manager
, which are specified for columns employee_city
and job_title
, respectively:
=> SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city='Boston' and job_title ='Cashier' ORDER BY employee_last_name, employee_first_name;
=> SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city = 'San Francisco' and job_title = 'Branch Manager' ORDER BY employee_last_name, employee_first_name;
In this case, an optimizer-generated directed query that you create from one query can be used for both:
=> CREATE DIRECTED QUERY OPTIMIZER 'findEmployeesCityJobTitle_OPT'
SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city='Boston' and job_title='Cashier' ORDER BY employee_last_name, employee_first_name;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY findEmployeesCityJobTitle_OPT;
ACTIVATE DIRECTED QUERY
The directed query's input and annotated queries both include :v
hints:
=> SELECT input_query, annotated_query FROM V_CATALOG.DIRECTED_QUERIES WHERE query_name = 'findEmployeesCityJobTitle_OPT';
-[ RECORD 1 ]---+----------------------------------------------------------------------------
input_query | SELECT employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension
WHERE ((employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/))
ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
annotated_query | SELECT /*+verbatim*/ employee_dimension.employee_first_name AS employee_first_name, employee_dimension.employee_last_name AS employee_last_name FROM public.employee_dimension AS employee_dimension/*+projs('public.employee_dimension')*/
WHERE (employee_dimension.employee_city = 'Boston'::varchar(6) /*+:v(1)*/) AND (employee_dimension.job_title = 'Cashier'::varchar(7) /*+:v(2)*/)
ORDER BY 2 ASC, 1 ASC
The hint arguments in the input and annotated queries pair two predicate constants:
-
/*+:v(1)*/
pairs input and annotated query settings foremployee_city
. -
/*+:v(2)*/
pairs input and annotated query settings forjob_title
.
The :v
hints tell the optimizer to ignore values for these two columns when it decides whether it can use this directed query for a given input query.
For example, the following query uses different values for employee_city
and job_title
, but is otherwise identical to the query used to create the directed query EmployeesCityJobTitle_OPT
:
=> SELECT employee_first_name, employee_last_name FROM public.employee_dimension
WHERE employee_city = 'San Francisco' and job_title = 'Branch Manager' ORDER BY employee_last_name, employee_first_name;
If the directed query EmployeesCityJobTitle_OPT
is active, the optimizer can use it in its query plan for this query:
=> EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension WHERE employee_city='San Francisco' AND job_title='Branch Manager' ORDER BY employee_last_name, employee_first_name;
...
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT employee_first_name, employee_last_name FROM employee_dimension WHERE employee_city='San Francisco' AND job_title='Branch Manager' ORDER BY employee_last_name, employee_first_name;
The following active directed query(query name: findEmployeesCityJobTitle_OPT) is being executed:
SELECT /*+verbatim*/ employee_dimension.employee_first_name, employee_dimension.employee_last_name FROM public.employee_dimension employee_dimension/*+projs('public.employee_dimension')*/
WHERE ((employee_dimension.employee_city = 'San Francisco'::varchar(13)) AND (employee_dimension.job_title = 'Branch Manager'::varchar(14)))
ORDER BY employee_dimension.employee_last_name, employee_dimension.employee_first_name
Access Path:
+-SORT [Cost: 222, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Order: employee_dimension.employee_last_name ASC, employee_dimension.employee_first_name ASC
| Execute on: All Nodes
| +---> STORAGE ACCESS for employee_dimension [Cost: 60, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| | Projection: public.employee_dimension_super
| | Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name
| | Filter: (employee_dimension.employee_city = 'San Francisco')
| | Filter: (employee_dimension.job_title = 'Branch Manager')
| | Execute on: All Nodes
...
Mapping one-to-many :v hints
The examples shown so far demonstrate one-to-one pairings of :v
hints. You can also use :v
hints to map one input constant to multiple constants in an annotated query. This approach can be especially useful when you want to provide the optimizer with explicit instructions how to execute a query that joins tables.
For example, the following query joins two tables:
SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8;
In this case, the optimizer can infer that S.a
and T.b
have the same value and implements the join accordingly:
<a name="simpleJoinExample"></a>=> CREATE DIRECTED QUERY OPTIMIZER simpleJoin SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8;
CREATE DIRECTED QUERY
=> SELECT input_query, annotated_query FROM directed_queries WHERE query_name = 'simpleJoin';
-[ RECORD 1 ]---+---------------------------------------------------------------------------------------------------------------------
input_query | SELECT S.a, T.b FROM (public.S JOIN public.T ON ((S.a = T.b))) WHERE (S.a = 8 /*+:v(1)*/)
annotated_query | SELECT /*+syntactic_join,verbatim*/ S.a AS a, T.b AS b
FROM (public.S AS S/*+projs('public.S')*/ JOIN /*+Distrib(L,L),JType(M)*/ public.T AS T/*+projs('public.T')*/ ON (S.a = T.b))
WHERE (S.a = 8 /*+:v(1)*/) AND (T.b = 8 /*+:v(1)*/)
(1 row)
=> ACTIVATE DIRECTED QUERY simpleJoin;
ACTIVATED DIRECTED QUERY
Now, given the following input query:
SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 3;
the optimizer disregards the join predicate constants and uses the directed query simpleJoin
in its query plan:
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 3;
The following active directed query(query name: simpleJoin) is being executed:
SELECT /*+syntactic_join,verbatim*/ S.a, T.b FROM (public.S S/*+projs('public.S')*/ JOIN /*+Distrib('L', 'L'), JType('
M')*/public.T T/*+projs('public.T')*/ ON ((S.a = T.b))) WHERE ((S.a = 3) AND (T.b = 3))
Access Path:
+-JOIN MERGEJOIN(inputs presorted) [Cost: 21, Rows: 4 (NO STATISTICS)] (PATH ID: 1)
| Join Cond: (S.a = T.b)
| Execute on: Query Initiator
| +-- Outer -> STORAGE ACCESS for S [Cost: 12, Rows: 4 (NO STATISTICS)] (PATH ID: 2)
| | Projection: public.S_b0
| | Materialize: S.a
| | Filter: (S.a = 3)
| | Execute on: Query Initiator
| | Runtime Filter: (SIP1(MergeJoin): S.a)
| +-- Inner -> STORAGE ACCESS for T [Cost: 8, Rows: 3 (NO STATISTICS)] (PATH ID: 3)
| | Projection: public.T_b0
| | Materialize: T.b
| | Filter: (T.b = 3)
| | Execute on: Query Initiator
...
Conserving predicate constants in directed queries
By default, optimizer-generated directed queries set :v
hints on predicate constants. You can override this behavior by marking predicate constants that must not be ignored with
:c
hints. For example, the following statement creates a directed query that can be used only for input queries where the join predicate constant is the same as in the original input query—8
:
=> CREATE DIRECTED QUERY OPTIMIZER simpleJoin_KeepPredicateConstant SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8 /*+:c*/;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY simpleJoin_KeepPredicateConstant;
The following query on system table DIRECTED_QUERIES compares directed queries simpleJoin
(created in an earlier example) and simpleJoin_KeepPredicateConstant
. Unlike simpleJoin
, the join predicate of the input and annotated queries for simpleJoin_KeepPredicateConstant
omit :v
hints:
=> SELECT query_name, input_query, annotated_query FROM directed_queries WHERE query_name ILIKE'%simpleJoin%';
-[ RECORD 1 ]---+
query_name | simpleJoin
input_query | SELECT S.a, T.b FROM (public.S JOIN public.T ON ((S.a = T.b))) WHERE (S.a = 8 /*+:v(1)*/)
annotated_query | SELECT /*+syntactic_join,verbatim*/ S.a AS a, T.b AS b
FROM (public.S AS S/*+projs('public.S')*/ JOIN /*+Distrib(L,L),JType(M)*/ public.T AS T/*+projs('public.T')*/ ON (S.a = T.b))
WHERE (S.a = 8 /*+:v(1)*/) AND (T.b = 8 /*+:v(1)*/)
-[ RECORD 2 ]---+
query_name | simpleJoin_KeepPredicateConstant
input_query | SELECT S.a, T.b FROM (public.S JOIN public.T ON ((S.a = T.b))) WHERE (S.a = 8)
annotated_query | SELECT /*+syntactic_join,verbatim*/ S.a AS a, T.b AS b
FROM (public.S AS S/*+projs('public.S')*/ JOIN /*+Distrib(L,L),JType(M)*/ public.T AS T/*+projs('public.T')*/ ON (S.a = T.b))
WHERE (S.a = 8) AND (T.b = 8)
If you deactivate directed query simpleJoin
(which would otherwise have precedence over simpleJoin_KeepPredicateConstant
because it was created earlier), Vertica only applies simpleJoin_KeepPredicateConstant
on input queries where the join predicate constant is the same as in the original input query. For example, compare the following two query plans:
=> EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8;
...
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8;
The following active directed query(query name: simpleJoin_KeepPredicateConstant) is being executed:
SELECT /*+syntactic_join,verbatim*/ S.a, T.b FROM (public.S S/*+projs('public.S')*/ JOIN /*+Distrib('L', 'L'), JType('
M')*/public.T T/*+projs('public.T')*/ ON ((S.a = T.b))) WHERE ((S.a = 8) AND (T.b = 8))
Access Path:
+-JOIN MERGEJOIN(inputs presorted) [Cost: 21, Rows: 4 (NO STATISTICS)] (PATH ID: 1)
| Join Cond: (S.a = T.b)
...
=> EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 3
...
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 3;
Access Path:
+-JOIN MERGEJOIN(inputs presorted) [Cost: 21, Rows: 4 (NO STATISTICS)] (PATH ID: 1)
| Join Cond: (S.a = T.b)
...