Window partitioning

Optionally specified in a function's OVER clause, a partition (PARTITION BY) clause groups input rows before the function processes them.

Optionally specified in a function's OVER clause, a partition (PARTITION BY) clause groups input rows before the function processes them. Window partitioning using PARTITION NODES or PARTITION BEST is similar to an aggregate function's GROUP BY clause, except it returns exactly one result row per input row. Window partitioning using PARTITION ROW allows you to provide single-row partitions of input, allowing you to use window partitioning on 1:N transform functions. If you omit the window partition clause, the function treats all input rows as a single partition.

Specifying window partitioning

You specify window partitioning in the function's OVER clause, as follows:

{ PARTITION BY expression[,...] | PARTITION BEST | PARTITION NODES | PARTITION ROW }

For syntax details, see Window partition clause.

Examples

The examples in this topic use the allsales schema defined in Invoking analytic functions.

CREATE TABLE allsales(state VARCHAR(20), name VARCHAR(20), sales INT);
INSERT INTO allsales VALUES('MA', 'A', 60);
INSERT INTO allsales VALUES('NY', 'B', 20);
INSERT INTO allsales VALUES('NY', 'C', 15);
INSERT INTO allsales VALUES('MA', 'D', 20);
INSERT INTO allsales VALUES('MA', 'E', 50);
INSERT INTO allsales VALUES('NY', 'F', 40);
INSERT INTO allsales VALUES('MA', 'G', 10);
COMMIT;

The following query calculates the median of sales within each state. The analytic function is computed per partition and starts over again at the beginning of the next partition.

=> SELECT state, name, sales, MEDIAN(sales)
      OVER (PARTITION BY state) AS median from allsales;

The results are grouped into partitions for MA (35) and NY (20) under the median column.

 state | name | sales | median
-------+------+-------+--------
 NY    | C    |    15 |     20
 NY    | B    |    20 |     20
 NY    | F    |    40 |     20
-------------------------------
 MA    | G    |    10 |     35
 MA    | D    |    20 |     35
 MA    | E    |    50 |     35
 MA    | A    |    60 |     35
(7 rows)

The following query calculates the median of total sales among states. When you use OVER() with no parameters, there is one partition—the entire input:

=> SELECT state, sum(sales), median(SUM(sales))
      OVER () AS median FROM allsales GROUP BY state;
 state | sum | median
-------+-----+--------
 NY    |  75 |  107.5
 MA    | 140 |  107.5
(2 rows)

Sales larger than median (evaluation order)

Analytic functions are evaluated after all other clauses except the query's final SQL ORDER BY clause. So a query that asks for all rows with sales larger than the median returns an error because the WHERE clause is applied before the analytic function and column m does not yet exist:

=> SELECT name, sales,  MEDIAN(sales) OVER () AS m
   FROM allsales WHERE sales > m;
   ERROR 2624:  Column "m" does not exist

You can work around this by placing in a subquery the predicate WHERE sales > m:

=> SELECT * FROM
   (SELECT name, sales, MEDIAN(sales) OVER () AS m FROM allsales) sq
   WHERE sales > m;
 name | sales | m
------+-------+----
 F    |    40 | 20
 E    |    50 | 20
 A    |    60 | 20
(3 rows)

For more examples, see Analytic query examples.