PERCENTILE_CONT [analytic]

一个逆分布函数,在此函数中,对于每行,PERCENTILE_CONT 将返回 窗口内每个分区中的一组值中落入指定百分位的值。例如,如果函数的实参值为 0.5,则函数的结果为数据集的中间值(第 50 个百分位)。PERCENTILE_CONT 假设了连续分布的数据模型。忽略 NULL 值。

PERCENTILE_CONT 通过首先计算百分位存在的行的行数来计算百分位。例如:

row‑number = 1 + percentile‑value * (num‑partition‑rows -1)

如果 row‑number 为完整数(误差范围 0.00001),则百分位为行 row‑number 的值。

否则,Vertica 会在 CEILING(row‑number) 行的值和 FLOOR(row‑number) 行的值之间插入百分位值。换句话说,将按以下方式计算百分位:

  ( CEILING( row‑number) - row‑number ) * ( value of FLOOR(row‑number) row )
+ ( row‑number - FLOOR(row‑number) ) * ( value of CEILING(row‑number) row)

行为类型

不可变

语法

PERCENTILE_CONT ( percentile ) WITHIN GROUP ( ORDER BY expression [ ASC | DESC ] ) OVER ( [ window-partition-clause ] )

参数

percentile
百分位值,为 0 到 1(包含)范围内的 FLOAT 常数。
WITHIN GROUP (ORDER BY expression)
指定如何对每个组内的数据进行排序。ORDER BY 仅使用数据类型为 INTEGER、FLOAT、INTERVAL 或 NUMERIC 的列/表达式。NULL 值将被丢弃。

WITHIN GROUP(ORDER BY) 子句不能保证 SQL 结果的顺序。要对最终结果进行排序,请使用 SQL ORDER BY 子句集。

ASC | DESC
将排序顺序指定为升序(默认值)或降序。

只要 percentile 不是 0.5,在 WITHIN GROUP 子句中指定 ASC 或 DESC 会影响到结果。

OVER()
请参阅分析函数

示例

该查询会计算威斯康星和哥伦比亚区的前 300 位客户的每组的年收入中间值。

=> SELECT customer_state, customer_key, annual_income, PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY annual_income)
      OVER (PARTITION BY customer_state) AS PERCENTILE_CONT
   FROM customer_dimension WHERE customer_state IN ('DC','WI') AND customer_key < 300
   ORDER BY customer_state, customer_key;
 customer_state | customer_key | annual_income | PERCENTILE_CONT
----------------+--------------+---------------+-----------------
 DC             |           52 |        168312 |        483266.5
 DC             |          118 |        798221 |        483266.5
 WI             |           62 |        283043 |          377691
 WI             |          139 |        472339 |          377691
(4 rows)

该查询会计算威斯康星和哥伦比亚区全部客户的每组的年收入中间值。

=> SELECT customer_state, customer_key, annual_income, PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY annual_income)
      OVER (PARTITION BY customer_state) AS PERCENTILE_CONT
   FROM customer_dimension WHERE customer_state IN ('DC','WI') ORDER BY customer_state, customer_key;
 customer_state | customer_key | annual_income | PERCENTILE_CONT
----------------+--------------+---------------+-----------------
 DC             |           52 |        168312 |        483266.5
 DC             |          118 |        798221 |        483266.5
 DC             |          622 |        220782 |          555088
 DC             |          951 |        178453 |          555088
 DC             |          972 |        961582 |          555088
 DC             |         1286 |        760445 |          555088
 DC             |         1434 |         44836 |          555088
 ...

 WI             |           62 |        283043 |          377691
 WI             |          139 |        472339 |          377691
 WI             |          359 |         42242 |          517717
 WI             |          364 |        867543 |          517717
 WI             |          403 |        509031 |          517717
 WI             |          455 |         32000 |          517717
 WI             |          485 |        373129 |          517717
 ...

(1353 rows)

另请参阅