INTERPOLATE
Joins two event series using some ordered attribute. Event series joins let you compare values from two series directly, rather than having to normalize the series to the same measurement interval.
An event series join is an extension of a regular outer join. The difference between expressing a regular outer join and an event series join is the INTERPOLATE predicate, which is used in the ON clause (see Examples below). Instead of padding the non-preserved side with null values when there is no match, the event series join pads the non-preserved side with the previous/next values from the table.
Interpolated values come from the table that contains the null, not from the other table.Vertica does not guarantee that the output contains no null values. If there is no previous/next value for a mismatched row, that row is padded with nulls.
Syntax
expression1 INTERPOLATE { PREVIOUS | NEXT } VALUE expression2
Arguments
expression1 ,expression2 |
A column reference from one of the tables specified in the FROM clause. The columns can be of any data type. Because event series are time-based, the type is typically |
{ PREVIOUS | NEXT } VALUE |
Pads the non-preserved side with the previous/next values when there is no match. If previous is called on the first row (or next on the last row), will pad with null values. Input rows are sorted in ascending logical order of the join column. NoteAn ORDER BY clause, if used, does not determine the input order but only determines query output order. |
Notes
-
Data is logically partitioned on the table in which it resides, based on other ON clause equality predicates.
-
Event series join requires that the joined tables are both sorted on columns in equality predicates, in any order, followed by the INTERPOLATED column. If data is already sorted in this order, then an explicit sort is avoided, which can improve query performance. For example, given the following tables:
ask: exchange, stock, ts, pricebid: exchange, stock, ts, price
In the query that follows:
-
ask
is sorted onexchange, stock
(or the reverse),ts
-
bid
is sorted onexchange, stock
(or the reverse),ts
SELECT ask.price - bid.price, ask.ts, ask.stock, ask.exchange FROM ask FULL OUTER JOIN bid ON ask.stock = bid.stock AND ask.exchange = bid.exchange AND ask.ts INTERPOLATE PREVIOUS VALUE bid.ts;
-
Restrictions
- Only one INTERPOLATE expression is allowed per join.
- INTERPOLATE expressions are used only with ANSI SQL-99 syntax (the ON clause), which is already true for full outer joins.
- INTERPOLATE can be used with equality predicates only.
- The AND operator is supported but not the OR and NOT operators.
- Expressions and implicit or explicit casts are not supported, but subqueries are allowed.
Semantics
When you write an event series join in place of normal join, values are evaluated as follows (using the schema in the examples below):
t
is the outer, preserved table.t1
is the inner, non-preserved table.- For each row in outer table
t
, the ON clause predicates are evaluated for each combination of each row in the inner tablet1
. - If the ON clause predicates evaluate to true for any combination of rows, those combination rows are produced at the output.
- If the ON clause is false for all combinations, a single output row is produced with the values of the row from
t
along with the columns oft1
chosen from the row int1
with the greatestt1.y
value such thatt1.y < t.x
; If no such row is found, pad with nulls.Note
t LEFT OUTER JOIN t1
is equivalent tot1 RIGHT OUTER JOIN t
.
In the case of a full outer join, all values from both tables are preserved.
Examples
The examples that follow use this simple schema.
CREATE TABLE t(x TIME);
CREATE TABLE t1(y TIME);
INSERT INTO t VALUES('12:40:23');
INSERT INTO t VALUES('13:40:25');
INSERT INTO t VALUES('13:45:00');
INSERT INTO t VALUES('14:49:55');
INSERT INTO t1 VALUES('12:40:23');
INSERT INTO t1 VALUES('14:00:00');
COMMIT;
Normal full outer join
=> SELECT * FROM t FULL OUTER JOIN t1 ON t.x = t1.y;
Notice the null rows from the non-preserved table:
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 |
13:45:00 |
14:49:55 |
| 14:00:00
(5 rows)
Full outer join with interpolation
=> SELECT * FROM t FULL OUTER JOIN t1 ON t.x INTERPOLATE PREVIOUS VALUE t1.y;
In this case, the rows with no entry point are padded with values from the previous row.
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 | 12:40:23
13:45:00 | 12:40:23
14:49:55 | 12:40:23
13:40:25 | 14:00:00
(5 rows)
Likewise, interpolate next is also supported:
=> SELECT * FROM t FULL OUTER JOIN t1 ON t.x INTERPOLATE NEXT VALUE t1.y;
In this case, the rows with no entry point are padded with values from the next row.
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 | 14:00:00
13:45:00 | 14:00:00
14:49:55 |
14:49:55 | 14:00:00
(5 rows)
Normal left outer join
=> SELECT * FROM t LEFT OUTER JOIN t1 ON t.x = t1.y;
Again, there are nulls in the non-preserved table
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 |
13:45:00 |
14:49:55 |
(4 rows)
Left outer join with interpolation
=> SELECT * FROM t LEFT OUTER JOIN t1 ON t.x INTERPOLATE PREVIOUS VALUE t1.y;
Nulls have been padded with interpolated values.
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 | 12:40:23
13:45:00 | 12:40:23
14:49:55 | 14:00:00
(4 rows)
Likewise, interpolate next is also supported:
=> SELECT * FROM t LEFT OUTER JOIN t1 ON t.x INTERPOLATE NEXT VALUE t1.y;
Nulls have been padded with interpolated values here as well.
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 | 14:00:00
13:45:00 | 14:00:00
14:49:55 |
(4 rows)
Inner joins
For inner joins, there is no difference between a regular inner join and an event series inner join. Since null values are eliminated from the result set, there is nothing to interpolate.
A regular inner join returns only the single matching row at 12:40:23:
=> SELECT * FROM t INNER JOIN t1 ON t.x = t1.y;
x | y
----------+----------
12:40:23 | 12:40:23
(1 row)
An event series inner join finds the same single-matching row at 12:40:23:
=> SELECT * FROM t INNER JOIN t1 ON t.x INTERPOLATE PREVIOUS VALUE t1.y;
x | y
----------+----------
12:40:23 | 12:40:23
(1 row)