这是本节的多页打印视图。 点击此处打印.

返回本页常规视图.

分析查询示例

此部分中的主题演示了如何使用分析查询进行计算。

1 - 计算中间值

中间值是一种数值,它将样本分为上下两部分。例如,您可以将所有观察值从最小值递增排序到最大值,然后选择居于正中间的值,即可获得有限数值列表的中间值。

如果观察值的数目为偶数,则不存在单个中间值;此时中间值是两个中间值的平均值。

以下示例使用此表:

CREATE TABLE allsales(state VARCHAR(20), name VARCHAR(20), sales INT);
INSERT INTO allsales VALUES('MA', 'A', 60);
INSERT INTO allsales VALUES('NY', 'B', 20);
INSERT INTO allsales VALUES('NY', 'C', 15);
INSERT INTO allsales VALUES('MA', 'D', 20);
INSERT INTO allsales VALUES('MA', 'E', 50);
INSERT INTO allsales VALUES('NY', 'F', 40);
INSERT INTO allsales VALUES('MA', 'G', 10);
COMMIT;

您可以使用分析函数 MEDIAN 计算此表中所有销售的中间值。在下述查询中,函数的 OVER 子句为空,因此查询为结果集的每个行返回了相同的聚合值。

=> SELECT name, sales, MEDIAN(sales) OVER() AS median FROM allsales;
 name | sales | median
------+-------+--------
 G    |    10 |     20
 C    |    15 |     20
 D    |    20 |     20
 B    |    20 |     20
 F    |    40 |     20
 E    |    50 |     20
 A    |    60 |     20
(7 rows)

您可以修改此查询,按州对销售进行分组,然后获取每组的中间值。为此,请在 OVER 子句中包含一个窗口分区子句:

=> SELECT state, name, sales, MEDIAN(sales) OVER(partition by state) AS median FROM allsales;
 state | name | sales | median
-------+------+-------+--------
 MA    | G    |    10 |     35
 MA    | D    |    20 |     35
 MA    | E    |    50 |     35
 MA    | A    |    60 |     35
 NY    | C    |    15 |     20
 NY    | B    |    20 |     20
 NY    | F    |    40 |     20
(7 rows)

2 - 获取两支股票的价格差异

以下子查询选择了两支感兴趣的股票。外部查询使用分析函数的 LAST_VALUE() 和 OVER() 组件以及 IGNORE NULLS。

架构

DROP TABLE Ticks CASCADE;


CREATE TABLE Ticks (ts TIMESTAMP, Stock varchar(10), Bid float);
INSERT INTO Ticks VALUES('2011-07-12 10:23:54', 'abc', 10.12);
INSERT INTO Ticks VALUES('2011-07-12 10:23:58', 'abc', 10.34);
INSERT INTO Ticks VALUES('2011-07-12 10:23:59', 'abc', 10.75);
INSERT INTO Ticks VALUES('2011-07-12 10:25:15', 'abc', 11.98);
INSERT INTO Ticks VALUES('2011-07-12 10:25:16', 'abc');
INSERT INTO Ticks VALUES('2011-07-12 10:25:22', 'xyz', 45.16);
INSERT INTO Ticks VALUES('2011-07-12 10:25:27', 'xyz', 49.33);
INSERT INTO Ticks VALUES('2011-07-12 10:31:12', 'xyz', 65.25);
INSERT INTO Ticks VALUES('2011-07-12 10:31:15', 'xyz');

COMMIT;

Ticks 表

=> SELECT * FROM ticks;
         ts          | stock |  bid
---------------------+-------+-------
 2011-07-12 10:23:59 | abc   | 10.75
 2011-07-12 10:25:22 | xyz   | 45.16
 2011-07-12 10:23:58 | abc   | 10.34
 2011-07-12 10:25:27 | xyz   | 49.33
 2011-07-12 10:23:54 | abc   | 10.12
 2011-07-12 10:31:15 | xyz   |
 2011-07-12 10:25:15 | abc   | 11.98
 2011-07-12 10:25:16 | abc   |
 2011-07-12 10:31:12 | xyz   | 65.25
(9 rows)

查询

 => SELECT ts, stock, bid, last_value(price1 IGNORE NULLS)
   OVER(ORDER BY ts) - last_value(price2 IGNORE NULLS)
   OVER(ORDER BY ts)   as price_diff
 FROM
  (SELECT ts, stock, bid,
    CASE WHEN stock = 'abc' THEN bid ELSE NULL END AS price1,
    CASE WHEN stock = 'xyz' then bid ELSE NULL END AS price2
    FROM ticks
    WHERE stock IN ('abc','xyz')
  ) v1
 ORDER BY ts;
         ts          | stock |  bid  | price_diff
---------------------+-------+-------+------------
 2011-07-12 10:23:54 | abc   | 10.12 |
 2011-07-12 10:23:58 | abc   | 10.34 |
 2011-07-12 10:23:59 | abc   | 10.75 |
 2011-07-12 10:25:15 | abc   | 11.98 |
 2011-07-12 10:25:16 | abc   |       |
 2011-07-12 10:25:22 | xyz   | 45.16 |     -33.18
 2011-07-12 10:25:27 | xyz   | 49.33 |     -37.35
 2011-07-12 10:31:12 | xyz   | 65.25 |     -53.27
 2011-07-12 10:31:15 | xyz   |       |     -53.27
(9 rows)

3 - 计算移动平均数

计算移动平均数有助于估计数据集中的趋势。移动平均数是一段时间内任何数字子集的平均值。例如,如果您拥有跨十年的零售数据,您可以计算三年移动平均数、四年移动平均数等。此示例计算某支股票的出价在 40 秒内的移动平均数。此示例使用 ticks 表架构。

查询

=> SELECT ts, bid, AVG(bid)
   OVER(ORDER BY ts
       RANGE BETWEEN INTERVAL '40 seconds'
       PRECEDING AND CURRENT ROW)
FROM ticks
WHERE stock = 'abc'
GROUP BY bid, ts
ORDER BY ts;
         ts          |  bid  |     ?column?
---------------------+-------+------------------
 2011-07-12 10:23:54 | 10.12 |            10.12
 2011-07-12 10:23:58 | 10.34 |            10.23
 2011-07-12 10:23:59 | 10.75 | 10.4033333333333
 2011-07-12 10:25:15 | 11.98 |            11.98
 2011-07-12 10:25:16 |       |            11.98
(5 rows)
DROP TABLE Ticks CASCADE;


CREATE TABLE Ticks (ts TIMESTAMP, Stock varchar(10), Bid float);
INSERT INTO Ticks VALUES('2011-07-12 10:23:54', 'abc', 10.12);
INSERT INTO Ticks VALUES('2011-07-12 10:23:58', 'abc', 10.34);
INSERT INTO Ticks VALUES('2011-07-12 10:23:59', 'abc', 10.75);
INSERT INTO Ticks VALUES('2011-07-12 10:25:15', 'abc', 11.98);
INSERT INTO Ticks VALUES('2011-07-12 10:25:16', 'abc');
INSERT INTO Ticks VALUES('2011-07-12 10:25:22', 'xyz', 45.16);
INSERT INTO Ticks VALUES('2011-07-12 10:25:27', 'xyz', 49.33);
INSERT INTO Ticks VALUES('2011-07-12 10:31:12', 'xyz', 65.25);
INSERT INTO Ticks VALUES('2011-07-12 10:31:15', 'xyz');

COMMIT;

4 - 获取最新出价和询价结果

以下查询填充缺失的 NULL 值以创建一个完整的预订订单,按供应商 ID 显示最新的出价和询价以及大小。原始行(通常)具有一个价格和一个大小值,因此每输入一个 ID 条目,就会使用带有“忽略 null 值”的 last_value 来查找另一对的最新非 null 值。Sequenceno 提供唯一的全顺序。

架构:

=> CREATE TABLE bookorders(
    vendorid VARCHAR(100),
    date TIMESTAMP,
    sequenceno INT,
    askprice FLOAT,
    asksize INT,
    bidprice FLOAT,
    bidsize INT);
=> INSERT INTO bookorders VALUES('3325XPK','2011-07-12 10:23:54', 1, 10.12, 55, 10.23, 59);
=> INSERT INTO bookorders VALUES('3345XPZ','2011-07-12 10:23:55', 2, 10.55, 58, 10.75, 57);
=> INSERT INTO bookorders VALUES('445XPKF','2011-07-12 10:23:56', 3, 10.22, 43, 54);
=> INSERT INTO bookorders VALUES('445XPKF','2011-07-12 10:23:57', 3, 10.22, 59, 10.25, 61);
=> INSERT INTO bookorders VALUES('3425XPY','2011-07-12 10:23:58', 4, 11.87, 66, 11.90, 66);
=> INSERT INTO bookorders VALUES('3727XVK','2011-07-12 10:23:59', 5, 11.66, 51, 11.67, 62);
=> INSERT INTO bookorders VALUES('5325XYZ','2011-07-12 10:24:01', 6, 15.05, 44, 15.10, 59);
=> INSERT INTO bookorders VALUES('3675XVS','2011-07-12 10:24:05', 7, 15.43, 47, 58);
=> INSERT INTO bookorders VALUES('8972VUG','2011-07-12 10:25:15', 8, 14.95, 52, 15.11, 57);
COMMIT;

查询:

=> SELECT
    sequenceno Seq,
    date "Time",
    vendorid ID,
    LAST_VALUE (bidprice IGNORE NULLS)
     OVER (PARTITION BY vendorid ORDER BY sequenceno
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    AS "Bid Price",
    LAST_VALUE (bidsize IGNORE NULLS)
     OVER (PARTITION BY vendorid ORDER BY sequenceno
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    AS "Bid Size",
    LAST_VALUE (askprice IGNORE NULLS)
     OVER (PARTITION BY vendorid ORDER BY sequenceno
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    AS "Ask Price",
    LAST_VALUE (asksize IGNORE NULLS)
     OVER (PARTITION BY vendorid order by sequenceno
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW )
    AS  "Ask Size"
 FROM bookorders
ORDER BY sequenceno;
 Seq |        Time         |   ID    | Bid Price | Bid Size | Ask Price | Ask Size
-----+---------------------+---------+-----------+----------+-----------+----------
   1 | 2011-07-12 10:23:54 | 3325XPK |     10.23 |       59 |     10.12 |       55
   2 | 2011-07-12 10:23:55 | 3345XPZ |     10.75 |       57 |     10.55 |       58
   3 | 2011-07-12 10:23:57 | 445XPKF |     10.25 |       61 |     10.22 |       59
   3 | 2011-07-12 10:23:56 | 445XPKF |        54 |          |     10.22 |       43
   4 | 2011-07-12 10:23:58 | 3425XPY |      11.9 |       66 |     11.87 |       66
   5 | 2011-07-12 10:23:59 | 3727XVK |     11.67 |       62 |     11.66 |       51
   6 | 2011-07-12 10:24:01 | 5325XYZ |      15.1 |       59 |     15.05 |       44
   7 | 2011-07-12 10:24:05 | 3675XVS |        58 |          |     15.43 |       47
   8 | 2011-07-12 10:25:15 | 8972VUG |     15.11 |       57 |     14.95 |       52
(9 rows)