MOVING_AVERAGE

Creates a moving-average (MA) model from a stationary time series with consistent timesteps that can then be used for prediction via PREDICT_MOVING_AVERAGE.

Moving average models use the errors of previous predictions to make future predictions. More specifically, the user-specified lag determines how many previous predictions and errors it takes into account during computation.

Since its input data must be sorted by timestamp, this algorithm is single-threaded.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

MOVING_AVERAGE ('model-name', 'input-relation', 'data-column', 'timestamp-column'
        [ USING PARAMETERS
              [ q = lags ]
              [, missing = "imputation-method" ]
              [, regularization = "regularization-method" ]
              [, lambda = regularization-value ]
              [, compute_mse = boolean ]
        ] )

Arguments

model-name: Identifies the model to create, where model-name conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
input-relation: The table or view containing the timestamp-column.
This algorithm expects a stationary time series as input; using a time series with a mean that shifts over time may lead to weaker results.
data-column: An input column of type NUMERIC that contains the dependent variables or outcomes.
timestamp-column: One INTEGER, FLOAT, or TIMESTAMP column that represent the timestamp variable. Timesteps must be consistent.

Parameters

q

INTEGER in the range [1, 67), the number of lags to consider in the computation.

Note

The MOVING_AVERAGE and ARIMA models use different training techniques that produce distinct models when trained with matching parameter values on the same data. For example, if you train a moving-average model using the same data and q value as an ARIMA model trained with p and d parameters set to zero, those two models will not be identical.

Default: 1

missing

One of the following methods for handling missing values:

drop: Missing values are ignored.
error: Missing values raise an error.
zero: Missing values are replaced with 0.
linear_interpolation: Missing values are replaced by linearly interpolated values based on the nearest valid entries before and after the missing value. This means that in cases where the first or last values in a dataset are missing, they will simply be dropped.

Default: linear_interpolation

regularization

One of the following regularization methods used when fitting the data:

None
L2: weight regularization term which penalizes the squared weight value

Default: None

lambda

FLOAT in the range [0, 100000], the regularization value, lambda.

Default: 1.0

compute_mse

BOOLEAN, whether to calculate and output the mean squared error (MSE).

This parameter only accepts "true" or "false" rather than the standard literal equivalents for BOOLEANs like 1 or 0.

Default: False

Examples

See Moving-average model example.