此部分中的主题演示了如何使用分析查询进行计算。
1 - 计算中间值
中间值是一种数值,它将样本分为上下两部分。例如,您可以将所有观察值从最小值递增排序到最大值,然后选择居于正中间的值,即可获得有限数值列表的中间值。
如果观察值的数目为偶数,则不存在单个中间值;此时中间值是两个中间值的平均值。
以下示例使用此表:
CREATE TABLE allsales(state VARCHAR(20), name VARCHAR(20), sales INT);
INSERT INTO allsales VALUES('MA', 'A', 60);
INSERT INTO allsales VALUES('NY', 'B', 20);
INSERT INTO allsales VALUES('NY', 'C', 15);
INSERT INTO allsales VALUES('MA', 'D', 20);
INSERT INTO allsales VALUES('MA', 'E', 50);
INSERT INTO allsales VALUES('NY', 'F', 40);
INSERT INTO allsales VALUES('MA', 'G', 10);
COMMIT;
您可以使用分析函数
MEDIAN
计算此表中所有销售的中间值。在下述查询中,函数的 OVER
子句为空,因此查询为结果集的每个行返回了相同的聚合值。
=> SELECT name, sales, MEDIAN(sales) OVER() AS median FROM allsales;
name | sales | median
------+-------+--------
G | 10 | 20
C | 15 | 20
D | 20 | 20
B | 20 | 20
F | 40 | 20
E | 50 | 20
A | 60 | 20
(7 rows)
您可以修改此查询,按州对销售进行分组,然后获取每组的中间值。为此,请在 OVER
子句中包含一个窗口分区子句:
=> SELECT state, name, sales, MEDIAN(sales) OVER(partition by state) AS median FROM allsales;
state | name | sales | median
-------+------+-------+--------
MA | G | 10 | 35
MA | D | 20 | 35
MA | E | 50 | 35
MA | A | 60 | 35
NY | C | 15 | 20
NY | B | 20 | 20
NY | F | 40 | 20
(7 rows)
2 - 获取两支股票的价格差异
以下子查询选择了两支感兴趣的股票。外部查询使用分析函数的 LAST_VALUE() 和 OVER() 组件以及 IGNORE NULLS。
架构
DROP TABLE Ticks CASCADE;
CREATE TABLE Ticks (ts TIMESTAMP, Stock varchar(10), Bid float);
INSERT INTO Ticks VALUES('2011-07-12 10:23:54', 'abc', 10.12);
INSERT INTO Ticks VALUES('2011-07-12 10:23:58', 'abc', 10.34);
INSERT INTO Ticks VALUES('2011-07-12 10:23:59', 'abc', 10.75);
INSERT INTO Ticks VALUES('2011-07-12 10:25:15', 'abc', 11.98);
INSERT INTO Ticks VALUES('2011-07-12 10:25:16', 'abc');
INSERT INTO Ticks VALUES('2011-07-12 10:25:22', 'xyz', 45.16);
INSERT INTO Ticks VALUES('2011-07-12 10:25:27', 'xyz', 49.33);
INSERT INTO Ticks VALUES('2011-07-12 10:31:12', 'xyz', 65.25);
INSERT INTO Ticks VALUES('2011-07-12 10:31:15', 'xyz');
COMMIT;
Ticks 表
=> SELECT * FROM ticks;
ts | stock | bid
---------------------+-------+-------
2011-07-12 10:23:59 | abc | 10.75
2011-07-12 10:25:22 | xyz | 45.16
2011-07-12 10:23:58 | abc | 10.34
2011-07-12 10:25:27 | xyz | 49.33
2011-07-12 10:23:54 | abc | 10.12
2011-07-12 10:31:15 | xyz |
2011-07-12 10:25:15 | abc | 11.98
2011-07-12 10:25:16 | abc |
2011-07-12 10:31:12 | xyz | 65.25
(9 rows)
查询
=> SELECT ts, stock, bid, last_value(price1 IGNORE NULLS)
OVER(ORDER BY ts) - last_value(price2 IGNORE NULLS)
OVER(ORDER BY ts) as price_diff
FROM
(SELECT ts, stock, bid,
CASE WHEN stock = 'abc' THEN bid ELSE NULL END AS price1,
CASE WHEN stock = 'xyz' then bid ELSE NULL END AS price2
FROM ticks
WHERE stock IN ('abc','xyz')
) v1
ORDER BY ts;
ts | stock | bid | price_diff
---------------------+-------+-------+------------
2011-07-12 10:23:54 | abc | 10.12 |
2011-07-12 10:23:58 | abc | 10.34 |
2011-07-12 10:23:59 | abc | 10.75 |
2011-07-12 10:25:15 | abc | 11.98 |
2011-07-12 10:25:16 | abc | |
2011-07-12 10:25:22 | xyz | 45.16 | -33.18
2011-07-12 10:25:27 | xyz | 49.33 | -37.35
2011-07-12 10:31:12 | xyz | 65.25 | -53.27
2011-07-12 10:31:15 | xyz | | -53.27
(9 rows)
3 - 计算移动平均数
计算移动平均数有助于估计数据集中的趋势。移动平均数是一段时间内任何数字子集的平均值。例如,如果您拥有跨十年的零售数据,您可以计算三年移动平均数、四年移动平均数等。此示例计算某支股票的出价在 40 秒内的移动平均数。此示例使用
ticks
表架构。
查询
=> SELECT ts, bid, AVG(bid)
OVER(ORDER BY ts
RANGE BETWEEN INTERVAL '40 seconds'
PRECEDING AND CURRENT ROW)
FROM ticks
WHERE stock = 'abc'
GROUP BY bid, ts
ORDER BY ts;
ts | bid | ?column?
---------------------+-------+------------------
2011-07-12 10:23:54 | 10.12 | 10.12
2011-07-12 10:23:58 | 10.34 | 10.23
2011-07-12 10:23:59 | 10.75 | 10.4033333333333
2011-07-12 10:25:15 | 11.98 | 11.98
2011-07-12 10:25:16 | | 11.98
(5 rows)
DROP TABLE Ticks CASCADE;
CREATE TABLE Ticks (ts TIMESTAMP, Stock varchar(10), Bid float);
INSERT INTO Ticks VALUES('2011-07-12 10:23:54', 'abc', 10.12);
INSERT INTO Ticks VALUES('2011-07-12 10:23:58', 'abc', 10.34);
INSERT INTO Ticks VALUES('2011-07-12 10:23:59', 'abc', 10.75);
INSERT INTO Ticks VALUES('2011-07-12 10:25:15', 'abc', 11.98);
INSERT INTO Ticks VALUES('2011-07-12 10:25:16', 'abc');
INSERT INTO Ticks VALUES('2011-07-12 10:25:22', 'xyz', 45.16);
INSERT INTO Ticks VALUES('2011-07-12 10:25:27', 'xyz', 49.33);
INSERT INTO Ticks VALUES('2011-07-12 10:31:12', 'xyz', 65.25);
INSERT INTO Ticks VALUES('2011-07-12 10:31:15', 'xyz');
COMMIT;
4 - 获取最新出价和询价结果
以下查询填充缺失的 NULL 值以创建一个完整的预订订单,按供应商 ID 显示最新的出价和询价以及大小。原始行(通常)具有一个价格和一个大小值,因此每输入一个 ID 条目,就会使用带有“忽略 null 值”的 last_value 来查找另一对的最新非 null 值。Sequenceno 提供唯一的全顺序。
架构:
=> CREATE TABLE bookorders(
vendorid VARCHAR(100),
date TIMESTAMP,
sequenceno INT,
askprice FLOAT,
asksize INT,
bidprice FLOAT,
bidsize INT);
=> INSERT INTO bookorders VALUES('3325XPK','2011-07-12 10:23:54', 1, 10.12, 55, 10.23, 59);
=> INSERT INTO bookorders VALUES('3345XPZ','2011-07-12 10:23:55', 2, 10.55, 58, 10.75, 57);
=> INSERT INTO bookorders VALUES('445XPKF','2011-07-12 10:23:56', 3, 10.22, 43, 54);
=> INSERT INTO bookorders VALUES('445XPKF','2011-07-12 10:23:57', 3, 10.22, 59, 10.25, 61);
=> INSERT INTO bookorders VALUES('3425XPY','2011-07-12 10:23:58', 4, 11.87, 66, 11.90, 66);
=> INSERT INTO bookorders VALUES('3727XVK','2011-07-12 10:23:59', 5, 11.66, 51, 11.67, 62);
=> INSERT INTO bookorders VALUES('5325XYZ','2011-07-12 10:24:01', 6, 15.05, 44, 15.10, 59);
=> INSERT INTO bookorders VALUES('3675XVS','2011-07-12 10:24:05', 7, 15.43, 47, 58);
=> INSERT INTO bookorders VALUES('8972VUG','2011-07-12 10:25:15', 8, 14.95, 52, 15.11, 57);
COMMIT;
查询:
=> SELECT
sequenceno Seq,
date "Time",
vendorid ID,
LAST_VALUE (bidprice IGNORE NULLS)
OVER (PARTITION BY vendorid ORDER BY sequenceno
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS "Bid Price",
LAST_VALUE (bidsize IGNORE NULLS)
OVER (PARTITION BY vendorid ORDER BY sequenceno
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS "Bid Size",
LAST_VALUE (askprice IGNORE NULLS)
OVER (PARTITION BY vendorid ORDER BY sequenceno
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS "Ask Price",
LAST_VALUE (asksize IGNORE NULLS)
OVER (PARTITION BY vendorid order by sequenceno
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW )
AS "Ask Size"
FROM bookorders
ORDER BY sequenceno;
Seq | Time | ID | Bid Price | Bid Size | Ask Price | Ask Size
-----+---------------------+---------+-----------+----------+-----------+----------
1 | 2011-07-12 10:23:54 | 3325XPK | 10.23 | 59 | 10.12 | 55
2 | 2011-07-12 10:23:55 | 3345XPZ | 10.75 | 57 | 10.55 | 58
3 | 2011-07-12 10:23:57 | 445XPKF | 10.25 | 61 | 10.22 | 59
3 | 2011-07-12 10:23:56 | 445XPKF | 54 | | 10.22 | 43
4 | 2011-07-12 10:23:58 | 3425XPY | 11.9 | 66 | 11.87 | 66
5 | 2011-07-12 10:23:59 | 3727XVK | 11.67 | 62 | 11.66 | 51
6 | 2011-07-12 10:24:01 | 5325XYZ | 15.1 | 59 | 15.05 | 44
7 | 2011-07-12 10:24:05 | 3675XVS | 58 | | 15.43 | 47
8 | 2011-07-12 10:25:15 | 8972VUG | 15.11 | 57 | 14.95 | 52
(9 rows)