SVM_REGRESSOR
Trains the SVM model on an input relation.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
SVM_REGRESSOR ( 'model-name', input-relation, 'response-column', 'predictor-columns'
        [ USING PARAMETERS
              [exclude_columns = 'excluded-columns']
              [, error_tolerance = error-tolerance]
              [, C = cost]
              [, epsilon = epsilon-value]
              [, max_iterations = max-iterations]
              [, intercept_mode = 'mode']
              [, intercept_scaling = 'scale'] ] )
Arguments
- model-name
- Identifies the model to create, where model-name conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
- input-relation
- The table or view that contains the training data. If the input relation is defined in Hive, use SYNC_WITH_HCATALOG_SCHEMA to sync the hcatalog schema, and then run the machine learning function.
- response-column
- An input column that represents the dependent variable or outcome. The column must be a numeric data type.
- predictor-columns
- Comma-separated list of columns in the input relation that represent independent variables for the model, or asterisk (*) to select all columns. If you select all columns, the argument list for parameter exclude_columns must include response-column, and any columns that are invalid as predictor columns.
All predictor columns must be of type numeric or BOOLEAN; otherwise the model is invalid.
Note: All BOOLEAN predictor values are converted to FLOAT values before training: 0 for false, 1 for true. No type checking occurs during prediction, so you can use a BOOLEAN predictor column in training, and during prediction provide a FLOAT column of the same name. In this case, all FLOAT values must be either 0 or 1.
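The asterisk form described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the model name mySvmRegModelAll is hypothetical, and the faithful table is assumed to contain a non-predictor id column alongside eruptions and waiting.

```sql
-- Sketch: select all columns with '*', then exclude the response column
-- (eruptions) and any column that is not a valid predictor.
-- The id column is an assumption made for illustration.
SELECT SVM_REGRESSOR('mySvmRegModelAll', 'faithful', 'eruptions', '*'
                     USING PARAMETERS exclude_columns='id, eruptions');
```

With these exclusions, waiting would be the only remaining predictor, so under the stated assumptions the call is equivalent to naming 'waiting' explicitly.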
Parameters
- exclude_columns
- Comma-separated list of columns from predictor-columns to exclude from processing.
- error_tolerance
- Defines the acceptable error margin. Any data points outside this region add a penalty to the cost function.
Default: 0.1 
- C
- The weight for the misclassification cost, that is, the penalty on data points that fall outside the acceptable error margin. The algorithm minimizes the sum of the regularization cost and the misclassification cost.
Default: 1.0 
- epsilon
- Used to control accuracy.
Default: 1e-3 
- max_iterations
- The maximum number of iterations that the algorithm performs.
Default: 100 
- intercept_mode
- A string that specifies how to treat the intercept, one of the following:
  - regularized (default): Fits the intercept and applies a regularization on it.
  - unregularized: Fits the intercept but does not include it in regularization.
- intercept_scaling
- A FLOAT value that serves as the value of a dummy feature whose coefficient Vertica uses to calculate the model intercept. Because the dummy feature is not in the training data, its value is set to a constant, 1 by default.
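As a sketch of how several of these parameters combine in one call, the statement below overrides some defaults. The model name mySvmRegModelTuned is hypothetical and the parameter values are arbitrary illustrations, not tuning recommendations; the faithful table and its eruptions and waiting columns are taken from the Examples section.

```sql
-- Sketch: tighten the error margin, raise the cost weight, allow more
-- iterations, and fit the intercept without regularizing it.
-- All values here are illustrative assumptions.
SELECT SVM_REGRESSOR('mySvmRegModelTuned', 'faithful', 'eruptions', 'waiting'
                     USING PARAMETERS error_tolerance=0.05,
                                      C=2.0,
                                      max_iterations=200,
                                      intercept_mode='unregularized');
```

Any parameter that is omitted keeps the default listed above.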
Model attributes
- coeff
- Coefficients in the model:
  - colNames: Intercept, or predictor column name
  - coefficients: Coefficient value
- nAccepted
- Number of samples accepted for training from the data set
- nRejected
- Number of samples rejected when training
- nIteration
- Number of iterations used in training
- callStr
- SQL statement used to replicate the training
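One way to inspect these attributes is with Vertica's GET_MODEL_ATTRIBUTE and GET_MODEL_SUMMARY functions, which this page does not itself describe. The sketch below assumes a model named mySvmRegModel has already been trained, as in the Examples section.

```sql
-- Sketch: return the coeff attribute (colNames and coefficients)
-- of a previously trained model.
SELECT GET_MODEL_ATTRIBUTE(USING PARAMETERS model_name='mySvmRegModel',
                                            attr_name='coeff');

-- Sketch: print an overall summary of the same model.
SELECT GET_MODEL_SUMMARY(USING PARAMETERS model_name='mySvmRegModel');
```

Passing a different attr_name, such as 'nAccepted' or 'callStr', should return the corresponding attribute from the list above.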
Examples
=> SELECT SVM_REGRESSOR('mySvmRegModel', 'faithful', 'eruptions', 'waiting'
                          USING PARAMETERS error_tolerance=0.1, max_iterations=100);
SVM_REGRESSOR
----------------------------------------------------------------
Finished in 5 iterations.
Accepted Rows: 272  Rejected Rows: 0
(1 row)
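After training, the model would typically be applied with PREDICT_SVM_REGRESSOR, a companion function that this page does not cover. The following is a minimal sketch that scores the same faithful table used for training; the column alias predicted_eruptions is chosen here for illustration.

```sql
-- Sketch: apply the trained model to the waiting column and compare
-- the prediction with the observed eruptions value.
SELECT eruptions,
       waiting,
       PREDICT_SVM_REGRESSOR(waiting
                             USING PARAMETERS model_name='mySvmRegModel')
           AS predicted_eruptions
FROM faithful
LIMIT 5;
```

Scoring the training data only checks that the model runs; a held-out table would give a more meaningful picture of accuracy.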