Machine learning

New features related to machine learning.

Partial least squares (PLS) regression support

Vertica now supports PLS regression models.

Combining aspects of PCA (principal component analysis) and linear regression, the PLS regression algorithm extracts a set of latent components that explain as much covariance as possible between the predictor and response variables, and then performs a regression that predicts response values using the extracted components.

This technique is particularly useful when the number of predictor variables is greater than the number of observations or the predictor variables are highly collinear. If either of these conditions is true of the input relation, ordinary linear regression fails to converge to an accurate model.

The PLS_REG function creates and trains a PLS model, and the PREDICT_PLS_REG function makes predictions on an input relation using a PLS model. For an in-depth example, see PLS regression.

Vector autoregression (VAR) support

Vertica now supports VAR models.

VAR is a multivariate autoregressive time series algorithm that captures the relationship between multiple time series variables over time. Unlike AR, which only considers a single variable, VAR models incorporate feedback between different variables in the model, enabling the model to analyze how variables interact across lagged time steps. For example, with two variables—atmospheric pressure and rain accumulation—a VAR model could determine whether a drop in pressure tends to result in rain at a future date.

The AUTOREGRESSOR function automatically executes the algorithm that fits your input data:

  • One value column: the function executes autoregression and returns a trained AR model.
  • Multiple value columns: the function executes vector autoregression and returns a trained VAR model.

To make predictions with a VAR model, use the PREDICT_AUTOREGRESSOR function. See VAR model example for an extended example.