PREDICT_ARIMA
Applies an autoregressive integrated moving average (ARIMA) model to an input relation or makes predictions using the in-sample data. ARIMA models make predictions based on preceding time series values and errors of previous predictions. The function, by default, returns the predicted values plus the mean of the model.
Behavior type
Immutable
Syntax
Apply to an input relation:
PREDICT_ARIMA ( 'timeseries-column'
USING PARAMETERS param=value[,...] )
OVER (ORDER BY 'timestamp-column')
FROM input-relation
Make predictions using the in-sample data:
PREDICT_ARIMA ( USING PARAMETERS model_name = 'ARIMA-model'
[, start = prediction-start ]
[, npredictions = num-predictions ]
[, output_standard_errors = boolean ] )
OVER ()
Arguments
timeseries-column
- Name of a NUMERIC column in input-relation used to make predictions.
timestamp-column
- Name of an INTEGER, FLOAT, or TIMESTAMP column in input-relation that represents the timestamp variable. The timestep between consecutive entries should be consistent throughout the timestamp-column.
input-relation
- Input relation containing timeseries-column and timestamp-column.
Parameters
model_name
- Name of a trained ARIMA model.
start
- The behavior of the start parameter and its range of accepted values depend on whether you provide a timeseries-column:
  - No provided timeseries-column: start must be an integer ≥0, where zero indicates to start prediction at the end of the in-sample data. If start is a positive value, the function predicts the values between the end of the in-sample data and the start index, and then uses the predicted values as time series inputs for the subsequent npredictions.
  - timeseries-column provided: start must be an integer ≥1 and identifies the index (row) of the timeseries-column at which to begin prediction. If the start index is greater than the number of rows, N, in the input data, the function predicts the values between N and start and uses the predicted values as time series inputs for the subsequent npredictions.
Default:
  - No provided timeseries-column: prediction begins from the end of the in-sample data.
  - timeseries-column provided: prediction begins from the end of the provided input data.
npredictions
- Integer ≥1, the number of predicted timesteps.
Default: 10
missing
- Methods for handling missing values, one of the following strings:
  - 'drop': Missing values are ignored.
  - 'error': Missing values raise an error.
  - 'zero': Missing values are replaced with 0.
  - 'linear_interpolation': Missing values are replaced by linearly-interpolated values based on the nearest valid entries before and after the missing value. If all values before or after a missing value in the prediction range are missing or invalid, interpolation is impossible and the function errors.
Default: Method used when training the model
add_mean
- Boolean, whether to add the model mean to the predicted value.
Default: True
output_standard_errors
- Boolean, whether to return estimates of the standard error of each prediction.
Default: False
Examples
The following example makes predictions using the in-sample data that the arima_temp model was trained on:
=> SELECT PREDICT_ARIMA(USING PARAMETERS model_name='arima_temp', npredictions=10) OVER();
prediction
------------------
12.9797364979873
13.3768377212635
13.460660717892
13.468204126011
13.4572461558472
13.4418721036084
13.425515187182
13.4090117135945
13.3925648829068
13.3762235523779
(10 rows)
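As a sketch of the start and output_standard_errors parameters described above, the in-sample call can be extended as follows. It reuses the arima_temp model from the previous example; the value start=5 is an arbitrary illustration, and the output is omitted because it depends on the trained model.

```sql
-- Sketch only: assumes the arima_temp model from the example above exists.
-- start=5 (an arbitrary illustrative value) makes the function first predict
-- the 5 timesteps after the end of the in-sample data, then use those values
-- as inputs for the 10 returned predictions.
-- output_standard_errors=true also requests a standard error estimate
-- for each prediction.
SELECT PREDICT_ARIMA(USING PARAMETERS model_name='arima_temp',
                     start=5,
                     npredictions=10,
                     output_standard_errors=true) OVER();
```

With start=0, or with start omitted, prediction would instead begin at the end of the in-sample data.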
You can also apply the model to an input relation:
=> SELECT PREDICT_ARIMA(temperature USING PARAMETERS model_name='arima_temp', start=100, npredictions=10) OVER(ORDER BY time) FROM temp_data;
prediction
------------------
15.0373229398431
13.4709102391534
10.5720766977885
13.1971253722069
13.5615497506689
13.1613971089657
13.4008120147841
12.612020423044
12.9026197179173
13.2392824099367
(10 rows)
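The missing and add_mean parameters can be combined with an input relation in the same way. The following is a minimal sketch that reuses the temp_data table, its temperature and time columns, and the arima_temp model from the example above; the parameter values are chosen only for illustration, and the output is omitted.

```sql
-- Sketch only: same table, columns, and model as the preceding example.
-- missing='linear_interpolation' replaces missing temperature values with
-- values interpolated from the nearest valid entries before and after them,
-- overriding the method used when the model was trained.
-- add_mean=false returns the predictions without the model mean added.
SELECT PREDICT_ARIMA(temperature USING PARAMETERS model_name='arima_temp',
                     start=100,
                     npredictions=10,
                     missing='linear_interpolation',
                     add_mean=false)
       OVER(ORDER BY time)
FROM temp_data;
```

If every value before or after a missing entry in the prediction range is itself missing or invalid, this call would error, as noted in the description of the missing parameter.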
For an in-depth example that trains and makes predictions with an ARIMA model, see ARIMA model example.