JOIN queries
In general, you can optimize execution of queries that join multiple tables in several ways:
-
Create projections for the joined tables that are sorted on join predicate columns. This facilitates use of the merge join algorithm, which generally joins tables more efficiently than the hash join algorithm.
-
Create projections that are identically segmented on the join keys.
Other best practices
Vertica also executes joins more efficiently if the following conditions are true:
-
Query construction enables the query optimizer to create a plan where the larger table is defined as the outer input.
-
The columns on each side of the equality predicate are from the same table. For example in the following query, the left and right sides of the equality predicate include only columns from tables T and X, respectively:
=> SELECT * FROM T JOIN X ON T.a + T.b = X.x1 - X.x2;
Conversely, the following query incurs more work to process, because the right side of the predicate includes columns from both tables T and X:
=> SELECT * FROM T JOIN X WHERE T.a = X.x1 + T.b