NULL sort order
By default, projection column values are stored in ascending order, but placement of NULL values depends on a column's data type.
NULL placement differences with ORDER BY clauses
The analytic OVER(
window-order-clause
)
and the SQL ORDER BY
clause have slightly different semantics:
OVER(ORDER BY ...)
The analytic window order clause uses the ASC
or DESC
sort order to determine NULLS FIRST
or NULLS LAST
placement for analytic function results. NULL values are placed as follows:
-
ASC
,NULLS LAST
— NULL values appear at the end of the sorted result. -
DESC
,NULLS FIRST
— NULL values appear at the beginning of the sorted result.
(SQL) ORDER BY
The SQL and Vertica ORDER BY
clauses produce different results. The SQL ORDER BY
clause specifies only ascending or descending sort order. The Vertica ORDER BY
clause determines NULL placement based on the column data type:
-
NUMERIC, INTEGER, DATE, TIME, TIMESTAMP, and INTERVAL columns:
NULLS FIRST
(NULL values appear at the beginning of a sorted projection.) -
FLOAT, STRING, and BOOLEAN columns:
NULLS LAST
(NULL values appear at the end of a sorted projection.)
NULL sort options
If you do not care about NULL placement in queries that involve analytic computations, or if you know that columns do not contain any NULL values, specify NULLS AUTO
—irrespective of data type. Vertica chooses the placement that gives the fastest performance, as in the following query. Otherwise, specify NULLS FIRST
or NULLS LAST
.
=> SELECT x, RANK() OVER (ORDER BY x NULLS AUTO) FROM t;
You can carefully formulate queries so Vertica can avoid sorting the data and increase query performance, as illustrated by the following example. Vertica sorts inputs from table t
on column x
, as specified in the OVER(ORDER BY)
clause, and then evaluates RANK()
:
=> CREATE TABLE t (
x FLOAT,
y FLOAT );
=> CREATE PROJECTION t_p (x, y) AS SELECT * FROM t
ORDER BY x, y UNSEGMENTED ALL NODES;
=> SELECT x, RANK() OVER (ORDER BY x) FROM t;
In the preceding SELECT
statement, Vertica eliminates the ORDER BY
clause and executes the query quickly because column x
is a FLOAT data type. As a result, the projection sort order matches the analytic default ordering (ASC + NULLS LAST
). Vertica can also avoid having to sort the data when the underlying projection is already sorted.
However, if column x
is an INTEGER data type, Vertica must sort the data because the projection sort order for INTEGER data types (ASC + NULLS FIRST
) does not match the default analytic ordering (ASC + NULLS LAST
). To help Vertica eliminate the sort, specify the placement of NULLs to match the default ordering:
=> SELECT x, RANK() OVER (ORDER BY x NULLS FIRST) FROM t;
If column x
is a STRING, the following query eliminates the sort:
=> SELECT x, RANK() OVER (ORDER BY x NULLS LAST) FROM t;
If you omit NULLS LAST
in the preceding query, Ver eliminates the sort because ASC + NULLS LAST
is the default sort specification for both the analytic ORDER BY
clause and for string-related columns in Vertica.