Runtime sorting of NULL values in analytic functions
By carefully writing queries or creating your design (or both), you can help the Vertica query optimizer skip sorting all columns in a table when performing an analytic function, which can improve query performance.
By carefully writing queries or creating your design (or both), you can help the Vertica query optimizer skip sorting all columns in a table when performing an analytic function, which can improve query performance.
To minimize Vertica's need to sort projections during query execution, redefine the employee
table and specify that NULL values are not allowed in the sort fields:
=> DROP TABLE employee CASCADE;
=> CREATE TABLE employee
(empno INT,
deptno INT NOT NULL,
sal INT NOT NULL);
CREATE TABLE
=> CREATE PROJECTION employee_p AS
SELECT * FROM employee
ORDER BY deptno, sal;
CREATE PROJECTION
=> INSERT INTO employee VALUES(101,10,50000);
=> INSERT INTO employee VALUES(103,10,43000);
=> INSERT INTO employee VALUES(104,10,45000);
=> INSERT INTO employee VALUES(105,20,97000);
=> INSERT INTO employee VALUES(108,20,33000);
=> INSERT INTO employee VALUES(109,20,51000);
=> COMMIT;
COMMIT
=> SELECT * FROM employee;
empno | deptno | sal
-------+--------+-------
101 | 10 | 50000
103 | 10 | 43000
104 | 10 | 45000
105 | 20 | 97000
108 | 20 | 33000
109 | 20 | 51000
(6 rows)
=> SELECT deptno, sal, empno, RANK() OVER
(PARTITION BY deptno ORDER BY sal)
FROM employee;
deptno | sal | empno | ?column?
--------+-------+-------+----------
10 | 43000 | 103 | 1
10 | 45000 | 104 | 2
10 | 50000 | 101 | 3
20 | 33000 | 108 | 1
20 | 51000 | 109 | 2
20 | 97000 | 105 | 3
(6 rows)
Tip
If you do not care about NULL placement in queries that involve analytic computations, or if you know that columns contain no NULL values, specifyNULLS AUTO
in your queries.
Vertica attempts to choose the placement that gives the fastest performance. Otherwise, specify NULLS FIRST
or NULLS LAST
.