PERCENTILE_DISC [analytic]
An inverse distribution function where, for each row, PERCENTILE_DISC
returns the value that would fall into the specified percentile among a set of values in each partition within a window. PERCENTILE_DISC()
assumes a discrete distribution data model. NULL
values are ignored.
PERCENTILE_DISC
examines the cumulative distribution values in each group until it finds one that is greater than or equal to the specified percentile. Vertica computes the percentile where, for each row, PERCENTILE_DISC
outputs the first value of the WITHIN GROUP(ORDER BY)
column whose CUME_DIST
(cumulative distribution) value is >= the argument FLOAT
value—for example, 0.4
:
PERCENTILE_DIST(0.4) WITHIN GROUP (ORDER BY salary) OVER(PARTITION BY deptno)...
Given the following query:
SELECT CUME_DIST() OVER(ORDER BY salary) FROM table-name;
The smallest CUME_DIST
value that is greater than 0.4 is also the PERCENTILE_DISC
.
Behavior type
ImmutableSyntax
PERCENTILE_DISC ( percentile ) WITHIN GROUP (
ORDER BY expression [ ASC | DESC ] ) OVER (
[ window-partition-clause ] )
Parameters
percentile
- Percentile value, a
FLOAT
constant that ranges from 0 to 1 (inclusive). WITHIN GROUP(ORDER BY
expression
)
- Specifies how to sort data within each group.
ORDER BY
takes only one column/expression that must beINTEGER
,FLOAT
,INTERVAL
, orNUMERIC
data type.NULL
values are discarded.The
WITHIN GROUP(ORDER BY)
clause does not guarantee the order of the SQL result. To order the final result , use the SQLORDER BY
clause set. ASC | DESC
- Specifies the ordering sequence as ascending (default) or descending.
OVER()
- See Analytic Functions
Examples
This query computes the 20th percentile annual income by group for first 300 customers in Wisconsin and the District of Columbia.
=> SELECT customer_state, customer_key, annual_income,
PERCENTILE_DISC(.2) WITHIN GROUP(ORDER BY annual_income)
OVER (PARTITION BY customer_state) AS PERCENTILE_DISC
FROM customer_dimension
WHERE customer_state IN ('DC','WI')
AND customer_key < 300
ORDER BY customer_state, customer_key;
customer_state | customer_key | annual_income | PERCENTILE_DISC
----------------+--------------+---------------+-----------------
DC | 104 | 658383 | 417092
DC | 168 | 417092 | 417092
DC | 245 | 670205 | 417092
WI | 106 | 227279 | 227279
WI | 127 | 703889 | 227279
WI | 209 | 458607 | 227279
(6 rows)