There are several ways to optimize an INSERT-SELECT query that has the following format:
INSERT /*+direct*/ INTO destination SELECT * FROM source;
This is the multi-page printable view of this section. Click here to print.
There are several ways to optimize an INSERT-SELECT query that has the following format:
INSERT /*+direct*/ INTO destination SELECT * FROM source;
When performing INSERT
-SELECT
operations, to avoid the sort phase of the INSERT
, make sure that the sort order for the SELECT
query matches the projection sort order of the target table.
For example, on a single-node database:
=> CREATE TABLE source (col1 INT, col2 INT, col3 INT);
=> CREATE PROJECTION source_p (col1, col2, col3)
AS SELECT col1, col2, col3 FROM source
ORDER BY col1, col2, col3
SEGMENTED BY HASH(col3)
ALL NODES;
=> CREATE TABLE destination (col1 INT, col2 INT, col3 INT);
=> CREATE PROJECTION destination_p (col1, col2, col3)
AS SELECT col1, col2, col3 FROM destination
ORDER BY col1, col2, col3
SEGMENTED BY HASH(col3)
ALL NODES;
The following INSERT
does not require a sort because the query result has the column order of the projection:
=> INSERT /*+direct*/ INTO destination SELECT * FROM source;
The following INSERT
requires a sort because the order of the columns in the SELECT
statement does not match the projection order:
=> INSERT /*+direct*/ INTO destination SELECT col1, col3, col2 FROM source;
The following INSERT
does not require a sort. The order of the columns doesn't match, but the explicit ORDER BY causes the output to be sorted by c1
, c3
, c2
in Vertica:
=> INSERT /*+direct*/ INTO destination SELECT col1, col3, col2 FROM source
GROUP BY col1, col3, col2
ORDER BY col1, col2, col3 ;
When performing an INSERT-SELECT
operation from a segmented source table to a segmented destination table, segment both projections on the same column to avoid resegmenting the data, as in the following example:
CREATE TABLE source (col1 INT, col2 INT, col3 INT);
CREATE PROJECTION source_p (col1, col2, col3) AS
SELECT col1, col2, col3 FROM source
SEGMENTED BY HASH(col3) ALL NODES;
CREATE TABLE destination (col1 INT, col2 INT, col3 INT);
CREATE PROJECTION destination_p (col1, col2, col3) AS
SELECT col1, col2, col3 FROM destination
SEGMENTED BY HASH(col3) ALL NODES;
INSERT /*+direct*/ INTO destination SELECT * FROM source;