COUNT [analytic]
Counts occurrences within a group within a window. If you specify * or some non-null constant, COUNT()
counts all rows.
Behavior type
ImmutableSyntax
COUNT ( expression ) OVER (
[ window-partition-clause ]
[ window-order-clause ]
[ window-frame-clause ] )
Parameters
expression
- Returns the number of rows in each group for which the
expression
is not null. Can be any expression resulting in BIGINT. OVER()
- See Analytic Functions.
Examples
Using the schema defined in Window framing, the following COUNT
function omits window order and window frame clauses; otherwise Vertica would treat it as a window aggregate. Think of the window of reporting aggregates as UNBOUNDED PRECEDING
and UNBOUNDED FOLLOWING
.
=> SELECT deptno, sal, empno, COUNT(sal)
OVER (PARTITION BY deptno) AS count FROM emp;
deptno | sal | empno | count
--------+-----+-------+-------
10 | 101 | 1 | 2
10 | 104 | 4 | 2
20 | 110 | 10 | 6
20 | 110 | 9 | 6
20 | 109 | 7 | 6
20 | 109 | 6 | 6
20 | 109 | 8 | 6
20 | 109 | 11 | 6
30 | 105 | 5 | 3
30 | 103 | 3 | 3
30 | 102 | 2 | 3
Using ORDER BY sal
creates a moving window query with default window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
.
=> SELECT deptno, sal, empno, COUNT(sal)
OVER (PARTITION BY deptno ORDER BY sal) AS count
FROM emp;
deptno | sal | empno | count
--------+-----+-------+-------
10 | 101 | 1 | 1
10 | 104 | 4 | 2
20 | 100 | 11 | 1
20 | 109 | 7 | 4
20 | 109 | 6 | 4
20 | 109 | 8 | 4
20 | 110 | 10 | 6
20 | 110 | 9 | 6
30 | 102 | 2 | 1
30 | 103 | 3 | 2
30 | 105 | 5 | 3
Using the VMart schema, the following query finds the number of employees who make less than or equivalent to the hourly rate of the current employee. The query returns a running/cumulative average (sometimes called a moving average) using the default window of RANGE UNBOUNDED PRECEDING AND CURRENT ROW
:
=> SELECT employee_last_name AS "last_name", hourly_rate, COUNT(*)
OVER (ORDER BY hourly_rate) AS moving_count from employee_dimension;
last_name | hourly_rate | moving_count
------------+-------------+--------------
Gauthier | 6 | 4
Taylor | 6 | 4
Jefferson | 6 | 4
Nielson | 6 | 4
McNulty | 6.01 | 11
Robinson | 6.01 | 11
Dobisz | 6.01 | 11
Williams | 6.01 | 11
Kramer | 6.01 | 11
Miller | 6.01 | 11
Wilson | 6.01 | 11
Vogel | 6.02 | 14
Moore | 6.02 | 14
Vogel | 6.02 | 14
Carcetti | 6.03 | 19
...
To return a moving average that is not also a running (cumulative) average, the window should specify ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
:
=> SELECT employee_last_name AS "last_name", hourly_rate, COUNT(*)
OVER (ORDER BY hourly_rate ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
AS moving_count from employee_dimension;