<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OpenText Analytics Database 26.2.x – Time series analytics</title>
    <link>/en/data-analysis/time-series-analytics/</link>
    <description>Recent content in Time series analytics on OpenText Analytics Database 26.2.x</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/data-analysis/time-series-analytics/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Data-Analysis: Gap filling and interpolation (GFI)</title>
      <link>/en/data-analysis/time-series-analytics/gap-filling-and-interpolation-gfi/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-analysis/time-series-analytics/gap-filling-and-interpolation-gfi/</guid>
      <description>
        
        
        &lt;p&gt;The examples and graphics that explain the concepts in this topic use the following simple schema:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;CREATE TABLE TickStore (ts TIMESTAMP, symbol VARCHAR(8), bid FLOAT);
INSERT INTO TickStore VALUES (&amp;#39;2009-01-01 03:00:00&amp;#39;, &amp;#39;XYZ&amp;#39;, 10.0);
INSERT INTO TickStore VALUES (&amp;#39;2009-01-01 03:00:05&amp;#39;, &amp;#39;XYZ&amp;#39;, 10.5);
COMMIT;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In OpenText™ Analytics Database, time series data is represented by a sequence of rows that conforms to a particular table schema, where one of the columns stores the time information.&lt;/p&gt;
&lt;p&gt;Both time and the state of data within a time series are continuous. Thus, evaluating SQL queries over time can be challenging because input records usually occur at non-uniform intervals and can contain gaps.&lt;/p&gt;
&lt;p&gt;For example, the following table contains two input rows five seconds apart: 3:00:00 and 3:00:05.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT * FROM TickStore;
         ts          | symbol | bid
---------------------+--------+------
 2009-01-01 03:00:00 | XYZ    |   10
 2009-01-01 03:00:05 | XYZ    | 10.5
(2 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Given those two inputs, how can you determine a bid price that falls between the two points, such as at 3:00:03?

The 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/time-slice/#&#34;&gt;TIME_SLICE&lt;/a&gt;&lt;/code&gt; function normalizes timestamps into corresponding time slices; however, &lt;code&gt;TIME_SLICE&lt;/code&gt; does not solve the problem of missing inputs (time slices) in the data. To address this, the database provides gap-filling and interpolation (GFI) functionality, which fills in missing data points and adds new data points, within the range of known data points, to the output. It accomplishes these tasks with the time series aggregate functions (&lt;a href=&#34;../../../en/sql-reference/functions/aggregate-functions/ts-first-value/#&#34;&gt;TS_FIRST_VALUE&lt;/a&gt; and &lt;a href=&#34;../../../en/sql-reference/functions/aggregate-functions/ts-last-value/#&#34;&gt;TS_LAST_VALUE&lt;/a&gt;) and the SQL &lt;a href=&#34;../../../en/sql-reference/statements/select/timeseries-clause/#&#34;&gt;TIMESERIES clause&lt;/a&gt;.

But first, we&#39;ll illustrate the components that make up gap filling and interpolation in the database, starting with &lt;a href=&#34;../../../en/data-analysis/time-series-analytics/gap-filling-and-interpolation-gfi/constant-interpolation/#&#34;&gt;Constant interpolation&lt;/a&gt;.

The images in the following topics use the following legend:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The x-axis represents the timestamp (&lt;code&gt;ts&lt;/code&gt;) column.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The y-axis represents the &lt;code&gt;bid&lt;/code&gt; column.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The vertical blue lines delimit the time slices.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The red dots represent the input records in the table, $10.0 and $10.5.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The blue stars represent the output values, including interpolated values.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
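&lt;p&gt;As a rough sketch of how these pieces fit together (an illustration, not output from a real session), the first query below uses &lt;code&gt;TIME_SLICE&lt;/code&gt; to normalize each input timestamp to the start of its slice, and the second uses the &lt;code&gt;TIMESERIES&lt;/code&gt; clause with &lt;code&gt;TS_FIRST_VALUE&lt;/code&gt; to request one output row per slice. The 1-second slice length and the &lt;code&gt;slice_start&lt;/code&gt; and &lt;code&gt;first_bid&lt;/code&gt; aliases are assumptions chosen for this example:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- TIME_SLICE only relabels existing rows; slices that have no input rows stay missing.
=&amp;gt; SELECT TIME_SLICE(ts, 1, &amp;#39;SECOND&amp;#39;) AS slice_start, symbol, bid FROM TickStore;

-- TIMESERIES requests one row per 1-second slice, so the slices between the
-- two inputs should appear in the output with interpolated bid values.
=&amp;gt; SELECT slice_time, symbol, TS_FIRST_VALUE(bid) AS first_bid FROM TickStore
     TIMESERIES slice_time AS &amp;#39;1 second&amp;#39; OVER (PARTITION BY symbol ORDER BY ts);
&lt;/code&gt;&lt;/pre&gt;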

      </description>
    </item>
    
    <item>
      <title>Data-Analysis: Null values in time series data</title>
      <link>/en/data-analysis/time-series-analytics/null-values-time-series-data/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-analysis/time-series-analytics/null-values-time-series-data/</guid>
      <description>
        
        
        &lt;p&gt;Null values are uncommon inputs for gap-filling and interpolation (GFI) computation. When null values exist, you can use the time series aggregate (TSA) functions 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/functions/aggregate-functions/ts-first-value/#&#34;&gt;TS_FIRST_VALUE&lt;/a&gt;&lt;/code&gt; and 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/functions/aggregate-functions/ts-last-value/#&#34;&gt;TS_LAST_VALUE&lt;/a&gt;&lt;/code&gt; with &lt;code&gt;IGNORE NULLS&lt;/code&gt; to affect the output of the interpolated values. TSA functions are treated like their analytic counterparts 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/functions/analytic-functions/first-value-analytic/#&#34;&gt;FIRST_VALUE&lt;/a&gt;&lt;/code&gt; and 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/functions/analytic-functions/last-value-analytic/#&#34;&gt;LAST_VALUE&lt;/a&gt;&lt;/code&gt;: if the timestamp itself is null, the database filters out those rows before gap filling and interpolation occur.&lt;/p&gt;
&lt;h2 id=&#34;constant-interpolation-with-null-values&#34;&gt;Constant interpolation with null values&lt;/h2&gt;
&lt;p&gt;Figure 1 illustrates a default (constant) interpolation result on four input rows where none of the inputs contains a NULL value:

&lt;table class=&#34;table table-bordered&#34; &gt;



&lt;tr&gt; 

&lt;td &gt;
&lt;img src=&#34;../../../images/when-time-series-data-contains-nulls3.png&#34; alt=&#34;&#34;&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/p&gt;
&lt;p&gt;Figure 2 shows the same input rows with the addition of another input record whose bid value is NULL, and whose timestamp (ts) value is 3:00:03:

&lt;table class=&#34;table table-bordered&#34; &gt;



&lt;tr&gt; 

&lt;td &gt;
&lt;img src=&#34;../../../images/when-time-series-data-contains-nulls6.png&#34; alt=&#34;&#34;&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/p&gt;
&lt;p&gt;For constant interpolation, the bid value starting at 3:00:03 is null until the next non-null bid value appears in time. In Figure 2, the presence of the null row makes the interpolated bid value null in the time interval denoted by the shaded region. If &lt;code&gt;TS_FIRST_VALUE(bid)&lt;/code&gt; is evaluated with constant interpolation on the time slice that begins at 3:00:02, its output is non-null. However, &lt;code&gt;TS_FIRST_VALUE(bid)&lt;/code&gt; on the next time slice produces null. If the last value of the 3:00:02 time slice is null, the first value for the next time slice (3:00:04) is null. However, if you use a TSA function with &lt;code&gt;IGNORE NULLS&lt;/code&gt;, then the value at 3:00:04 is the same value as it was at 3:00:02.&lt;/p&gt;
&lt;p&gt;To illustrate, insert a new row into the TickStore table at 03:00:03 with a null bid value. The database outputs a row for the 03:00:02 time slice with a null value, but no row for the 03:00:03 input:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; INSERT INTO tickstore VALUES(&amp;#39;2009-01-01 03:00:03&amp;#39;, &amp;#39;XYZ&amp;#39;, NULL);
=&amp;gt; SELECT slice_time, symbol, TS_LAST_VALUE(bid) AS last_bid FROM TickStore
-&amp;gt; TIMESERIES slice_time AS &amp;#39;2 seconds&amp;#39; OVER (PARTITION BY symbol ORDER BY ts);
     slice_time      | symbol | last_bid
---------------------+--------+----------
 2009-01-01 03:00:00 | XYZ    |       10
 2009-01-01 03:00:02 | XYZ    |
 2009-01-01 03:00:04 | XYZ    |     10.5
(3 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If you specify &lt;code&gt;IGNORE NULLS&lt;/code&gt;, the database fills in the missing data point using a constant interpolation scheme. Here, the bid price at 03:00:02 takes the value of the last known non-null input record for bid, which was $10 at 03:00:00:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT slice_time, symbol, TS_LAST_VALUE(bid IGNORE NULLS) AS last_bid FROM TickStore
     TIMESERIES slice_time AS &amp;#39;2 seconds&amp;#39; OVER (PARTITION BY symbol ORDER BY ts);
     slice_time      | symbol | last_bid
---------------------+--------+----------
 2009-01-01 03:00:00 | XYZ    |       10
 2009-01-01 03:00:02 | XYZ    |       10
 2009-01-01 03:00:04 | XYZ    |     10.5
(3 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now, if you insert a row where the timestamp column contains a null value, the database filters out that row before gap filling and interpolation occur.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; INSERT INTO tickstore VALUES(NULL, &amp;#39;XYZ&amp;#39;, 11.2);
=&amp;gt; SELECT slice_time, symbol, TS_LAST_VALUE(bid) AS last_bid FROM TickStore
     TIMESERIES slice_time AS &amp;#39;2 seconds&amp;#39; OVER (PARTITION BY symbol ORDER BY ts);
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Notice there is no output for the 11.2 bid row:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;     slice_time      | symbol | last_bid
---------------------+--------+----------
 2009-01-01 03:00:00 | XYZ    |       10
 2009-01-01 03:00:02 | XYZ    |
 2009-01-01 03:00:04 | XYZ    |     10.5
(3 rows)
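
-- Sketch (not part of the original session): the same data queried with
-- TS_FIRST_VALUE and the default constant interpolation. Per the discussion
-- of Figure 2, the 03:00:02 slice should be non-null and the 03:00:04 slice
-- null. The first_bid alias is chosen for this example only.
=&amp;gt; SELECT slice_time, symbol, TS_FIRST_VALUE(bid) AS first_bid FROM TickStore
     TIMESERIES slice_time AS &amp;#39;2 seconds&amp;#39; OVER (PARTITION BY symbol ORDER BY ts);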
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;linear-interpolation-with-null-values&#34;&gt;Linear interpolation with null values&lt;/h2&gt;
&lt;p&gt;For linear interpolation, the interpolated bid value becomes null in the time interval represented by the shaded region in Figure 3:

&lt;table class=&#34;table table-bordered&#34; &gt;



&lt;tr&gt; 

&lt;td &gt;
&lt;img src=&#34;../../../images/when-time-series-data-contains-nulls4.png&#34; alt=&#34;&#34;&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/p&gt;
&lt;p&gt;In the presence of an input null value at 3:00:03, the database cannot linearly interpolate the bid value around that time point.&lt;/p&gt;
&lt;p&gt;The database takes the closest non-null value on either side of the time slice and uses that value. For example, if you use a linear interpolation scheme and do not specify &lt;code&gt;IGNORE NULLS&lt;/code&gt;, and your data has one real value and one null, the result is null. If the value on either side is null, the result is null. Therefore, if you evaluate &lt;code&gt;TS_FIRST_VALUE(bid)&lt;/code&gt; with linear interpolation on the time slice that begins at 3:00:02, its output is null. &lt;code&gt;TS_FIRST_VALUE(bid)&lt;/code&gt; on the next time slice remains null.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT slice_time, symbol, TS_FIRST_VALUE(bid, &amp;#39;linear&amp;#39;) AS fv_l FROM TickStore
     TIMESERIES slice_time AS &amp;#39;2 seconds&amp;#39; OVER (PARTITION BY symbol ORDER BY ts);
     slice_time      | symbol | fv_l
---------------------+--------+------
 2009-01-01 03:00:00 | XYZ    |   10
 2009-01-01 03:00:02 | XYZ    |
 2009-01-01 03:00:04 | XYZ    |
(3 rows)
&lt;/code&gt;&lt;/pre&gt;
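&lt;p&gt;One possible variation, sketched here on the assumption that skipping the null bid is acceptable for your analysis, combines &lt;code&gt;IGNORE NULLS&lt;/code&gt; with linear interpolation so that the null input at 03:00:03 does not take part in the calculation. The &lt;code&gt;fv_l_ignore&lt;/code&gt; alias is hypothetical; confirm the result on your own data before relying on it:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Sketch: linear interpolation that skips null bid values.
=&amp;gt; SELECT slice_time, symbol, TS_FIRST_VALUE(bid IGNORE NULLS, &amp;#39;linear&amp;#39;) AS fv_l_ignore FROM TickStore
     TIMESERIES slice_time AS &amp;#39;2 seconds&amp;#39; OVER (PARTITION BY symbol ORDER BY ts);
&lt;/code&gt;&lt;/pre&gt;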
      </description>
    </item>
    
  </channel>
</rss>
