This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Time series forecasting

Time series models are trained on stationary time series (that is, time series where the mean doesn't change over time) of stochastic processes with consistent time steps.

Time series models are trained on stationary time series (that is, time series where the mean doesn't change over time) of stochastic processes with consistent time steps. These algorithms forecast future values by taking into account the influence of values at some number of preceding timesteps (lags).

Examples of applicable datasets include those for temperature, stock prices, earthquakes, product sales, etc.

To normalize datasets with inconsistent timesteps, see Gap filling and interpolation (GFI).

1 - Autoregressive model example

Autoregressive models predict future values of a time series based on the preceding values. More specifically, the user-specified lag determines how many previous timesteps it takes into account during computation, and predicted values are linear combinations of the values at each lag.

Use the following functions when training and predicting with autoregressive models. Note that these functions require datasets with consistent timesteps.

To normalize datasets with inconsistent timesteps, see Gap filling and interpolation (GFI).

Example

  1. Load the datasets from the Machine-Learning-Examples repository.

    This example uses the daily-min-temperatures dataset, which contains data on the daily minimum temperature in Melbourne, Australia from 1981 through 1990:

    => SELECT * FROM temp_data;
            time         | Temperature
    ---------------------+-------------
     1981-01-01 00:00:00 |        20.7
     1981-01-02 00:00:00 |        17.9
     1981-01-03 00:00:00 |        18.8
     1981-01-04 00:00:00 |        14.6
     1981-01-05 00:00:00 |        15.8
    ...
     1990-12-27 00:00:00 |          14
     1990-12-28 00:00:00 |        13.6
     1990-12-29 00:00:00 |        13.5
     1990-12-30 00:00:00 |        15.7
     1990-12-31 00:00:00 |          13
    (3650 rows)
    
  2. Use AUTOREGRESSOR to create the autoregressive model AR_temperature from the temp_data dataset. In this case, the model is trained with a lag of p=3, taking the previous 3 entries into account for each estimation:

    => SELECT AUTOREGRESSOR('AR_temperature', 'temp_data', 'Temperature', 'time' USING PARAMETERS p=3);
                        AUTOREGRESSOR
    ---------------------------------------------------------
     Finished. 3650 elements accepted, 0 elements rejected.
    (1 row)
    

    You can view a summary of the model with GET_MODEL_SUMMARY:

    
    => SELECT GET_MODEL_SUMMARY(USING PARAMETERS model_name='AR_temperature');
    
     GET_MODEL_SUMMARY
    -------------------
    
    ============
    coefficients
    ============
    parameter| value
    ---------+--------
      alpha  | 1.88817
    phi_(t-1)| 0.70004
    phi_(t-2)|-0.05940
    phi_(t-3)| 0.19018
    
    
    ==================
    mean_squared_error
    ==================
    not evaluated
    
    ===========
    call_string
    ===========
    autoregressor('public.AR_temperature', 'temp_data', 'temperature', 'time'
    USING PARAMETERS p=3, missing=linear_interpolation, regularization='none', lambda=1, compute_mse=false);
    
    ===============
    Additional Info
    ===============
           Name       |Value
    ------------------+-----
        lag_order     |  3
    rejected_row_count|  0
    accepted_row_count|3650
    (1 row)
    
  3. Use PREDICT_AUTOREGRESSOR to predict future temperatures. The following query starts the prediction at the end of the dataset and returns 10 predictions.

    => SELECT PREDICT_AUTOREGRESSOR(Temperature USING PARAMETERS model_name='AR_temperature', npredictions=10) OVER(ORDER BY time) FROM temp_data;
    
        prediction
    ------------------
     12.6235419917807
     12.9387860506032
     12.6683380680058
     12.3886937385419
     12.2689506237424
     12.1503023330142
     12.0211734746741
     11.9150531529328
     11.825870404008
     11.7451846722395
    (10 rows)
    

2 - Moving-average model example

Moving average models use the errors of previous predictions to make future predictions. More specifically, the user-specified lag determines how many previous predictions and errors it takes into account during computation.

Use the following functions when training and predicting with moving-average models. Note that these functions require datasets with consistent timesteps.

To normalize datasets with inconsistent timesteps, see Gap filling and interpolation (GFI).

Example

  1. Load the datasets from the Machine-Learning-Examples repository.

    This example uses the daily-min-temperatures dataset, which contains data on the daily minimum temperature in Melbourne, Australia from 1981 through 1990:

    => SELECT * FROM temp_data;
            time         | Temperature
    ---------------------+-------------
     1981-01-01 00:00:00 |        20.7
     1981-01-02 00:00:00 |        17.9
     1981-01-03 00:00:00 |        18.8
     1981-01-04 00:00:00 |        14.6
     1981-01-05 00:00:00 |        15.8
    ...
     1990-12-27 00:00:00 |          14
     1990-12-28 00:00:00 |        13.6
     1990-12-29 00:00:00 |        13.5
     1990-12-30 00:00:00 |        15.7
     1990-12-31 00:00:00 |          13
    (3650 rows)
    
  2. Use MOVING_AVERAGE to create the moving-average model MA_temperature from the temp_data dataset. In this case, the model is trained with a lag of p=3, taking the error of 3 previous predictions into account for each estimation:

    => SELECT MOVING_AVERAGE('MA_temperature', 'temp_data', 'temperature', 'time' USING PARAMETERS q=3, missing='linear_interpolation', regularization='none', lambda=1);
                        MOVING_AVERAGE
    ---------------------------------------------------------
     Finished. 3650 elements accepted, 0 elements rejected.
    (1 row)
    

    You can view a summary of the model with GET_MODEL_SUMMARY:

    
    => SELECT GET_MODEL_SUMMARY(USING PARAMETERS model_name='MA_temperature');
    
     GET_MODEL_SUMMARY
    -------------------
    
    ============
    coefficients
    ============
    parameter| value
    ---------+--------
    phi_(t-0)|-0.90051
    phi_(t-1)|-0.10621
    phi_(t-2)| 0.07173
    
    
    ===============
    timeseries_name
    ===============
    temperature
    
    ==============
    timestamp_name
    ==============
    time
    
    ===========
    call_string
    ===========
    moving_average('public.MA_temperature', 'temp_data', 'temperature', 'time'
    USING PARAMETERS q=3, missing=linear_interpolation, regularization='none', lambda=1);
    
    ===============
    Additional Info
    ===============
           Name       | Value
    ------------------+--------
           mean       |11.17780
        lag_order     |   3
          lambda      | 1.00000
    rejected_row_count|   0
    accepted_row_count|  3650
    
    (1 row)
    
  3. Use PREDICT_MOVING_AVERAGE to predict future temperatures. The following query starts the prediction at the end of the dataset and returns 10 predictions.

    => SELECT PREDICT_MOVING_AVERAGE(Temperature USING PARAMETERS model_name='MA_temperature', npredictions=10) OVER(ORDER BY time) FROM temp_data;
    
        prediction
    ------------------
     13.1324365636272
     12.8071086272833
     12.7218966671721
     12.6011086656032
     12.506624729879
     12.4148247026733
     12.3307873804812
     12.2521385975133
     12.1789741993396
     12.1107640076638
    (10 rows)