APPROXIMATE_QUANTILES

Computes an array of weighted, approximate percentiles of a column within some user-specified error.

Computes an array of weighted, approximate percentiles of a column within some user-specified error. This algorithm is similar to APPROXIMATE_PERCENTILE [aggregate], which instead returns a single percentile.

The performance of this function depends entirely on the specified epsilon and the size of the provided array.

The OVER clause for this function must be empty.

Behavior type

Immutable

Syntax

APPROXIMATE_QUANTILES ( column USING PARAMETERS [nquantiles=n], [epsilon=error] ) OVER() FROM table

Parameters

column
The INTEGER or FLOAT column for which to calculate the percentiles. NULL values are ignored.
n
An integer that specifies the number of desired quantiles in the returned array.

Default: 11

error
The allowed error for any returned percentile. Specifically, for an array of size N, the specified error ε (epsilon) for the φ-quantile guarantees that the rank r of the return value with respect to the rank ⌊φN⌋ of the exact value is such that:

⌊(φ-ε)N⌋ ≤ r ≤ ⌊(φ+ε)N⌋

For n quantiles, if the error ε is specified such that ε > 1/n, this function will return non-deterministic results.

Default: 0.001

table
The table containing column.

Examples

The following example uses this table:

=> CREATE TABLE allsales(state VARCHAR(20), name VARCHAR(20), sales INT) ORDER BY state;
INSERT INTO allsales VALUES('MA', 'A', 60);
INSERT INTO allsales VALUES('NY', 'B', 20);
INSERT INTO allsales VALUES('NY', 'C', 15);
INSERT INTO allsales VALUES('MA', 'D', 20);
INSERT INTO allsales VALUES('MA', 'E', 50);
INSERT INTO allsales VALUES('NY', 'F', 40);
INSERT INTO allsales VALUES('MA', 'G', 10);
COMMIT;

=> SELECT * FROM allsales;
 state | name | sales
-------+------+-------
 MA    | A    |    60
 NY    | B    |    20
 NY    | C    |    15
 NY    | F    |    40
 MA    | D    |    20
 MA    | E    |    50
 MA    | G    |    10
(7 rows)

This call to APPROXIMATE_QUANTILES returns a 6-element array of approximate percentiles, one for each quantile. Each quantile relates to the percentile by a factor of 100. For example, the second entry in the output indicates that 15 is the 0.2-quantile of the input column, so 15 is the 20th percentile of the input column.

=> SELECT APPROXIMATE_QUANTILES(sales USING PARAMETERS nquantiles=6) OVER() FROM allsales;
 Quantile | Value
----------+-------
        0 |    10
      0.2 |    15
      0.4 |    20
      0.6 |    40
      0.8 |    50
        1 |    60
(6 rows)