SVM_CLASSIFIER

Trains the SVM model on an input relation.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SVM_CLASSIFIER ( 'model-name', input-relation, 'response-column', 'predictor-columns'
        [ USING PARAMETERS
              [exclude_columns = 'excluded-columns']
              [, C = 'cost']
              [, epsilon = 'epsilon-value']
              [, max_iterations = 'max-iterations']
              [, class_weights = 'weight']
              [, intercept_mode = 'intercept-mode']
              [, intercept_scaling = 'scale'] ] )

Arguments

model-name
Identifies the model to create, where model-name conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
input-relation
The table or view that contains the training data. If the input relation is defined in Hive, use SYNC_WITH_HCATALOG_SCHEMA to sync the hcatalog schema, and then run the machine learning function.
response-column
The input column that represents the dependent variable or outcome. The column value must be 0 or 1, and of type numeric or BOOLEAN; otherwise, the function returns an error.
predictor-columns

Comma-separated list of columns in the input relation that represent independent variables for the model, or asterisk (*) to select all columns. If you select all columns, the argument list for parameter exclude_columns must include response-column, and any columns that are invalid as predictor columns.

All predictor columns must be of type numeric or BOOLEAN; otherwise the model is invalid.
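
As a sketch of the asterisk form, the following call selects all columns and then excludes the response column plus one column that is invalid as a predictor. The model name mySvmAllColumns and the non-numeric car_model column are assumptions made for illustration; substitute the columns that are actually invalid in your own table.

```sql
-- Sketch: '*' selects every column as a predictor, so exclude_columns
-- must list the response column (am) and any non-numeric columns.
-- car_model is an assumed VARCHAR column; adjust to your table.
SELECT SVM_CLASSIFIER(
       'mySvmAllColumns', 'mtcars', 'am', '*'
       USING PARAMETERS exclude_columns = 'am,car_model');
```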

Parameters

exclude_columns
Comma-separated list of columns from predictor-columns to exclude from processing.
C
Weight for misclassification cost. The algorithm minimizes the regularization cost and the misclassification cost.

Default: 1.0
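
For orientation, the textbook soft-margin objective of a linear SVM shows how C trades off the two costs. This is the standard hinge-loss formulation, not a statement of the exact loss that this function uses internally. The symbols w, b, x_i, y_i, and n are introduced here for illustration, with the 0/1 response mapped to y_i in {-1, +1} and the intercept b shown unregularized for simplicity.

```latex
\min_{w,\,b} \;
\underbrace{\tfrac{1}{2}\lVert w \rVert^{2}}_{\text{regularization cost}}
\;+\;
C \, \underbrace{\sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\,(w^{\top} x_i + b)\bigr)}_{\text{misclassification cost}}
```

Under this formulation, a larger C penalizes misclassified training rows more heavily relative to the regularization term, while a smaller C favors a simpler model.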

epsilon
Used to control accuracy.

Default: 1e-3

max_iterations
Maximum number of iterations that the algorithm performs.

Default: 100

class_weights
Specifies how to determine weights of the two classes, one of the following:
  • None (default): No weights are used.

  • value0, value1: Two comma-delimited strings that specify two positive FLOAT values, where value0 assigns a weight to class 0, and value1 assigns a weight to class 1.

  • auto: Weights each class according to the number of samples.
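
The two weighted options might be used as follows. This is a minimal sketch: the model names are hypothetical, and the weight values 1.0 and 2.0 are arbitrary choices, not recommendations.

```sql
-- Sketch: weight class 1 twice as heavily as class 0 (value0,value1).
SELECT SVM_CLASSIFIER(
       'mySvmWeighted', 'mtcars', 'am', 'mpg,cyl,disp,wt,qsec,vs,gear,carb'
       USING PARAMETERS class_weights = '1.0,2.0');

-- Sketch: let the function derive the weights from the class sample counts.
SELECT SVM_CLASSIFIER(
       'mySvmAutoWeighted', 'mtcars', 'am', 'mpg,cyl,disp,wt,qsec,vs,gear,carb'
       USING PARAMETERS class_weights = 'auto');
```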

intercept_mode
Specifies how to treat the intercept, one of the following:
  • regularized (default): Fits the intercept and applies a regularization on it.

  • unregularized: Fits the intercept but does not include it in regularization.

intercept_scaling
Float value that serves as the value of a dummy feature whose coefficient Vertica uses to calculate the model intercept. Because the dummy feature is not in the training data, its values are set to a constant, by default 1.
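
The optional parameters can be combined in one USING PARAMETERS clause. The following sketch assumes that numeric parameters are passed as unquoted literals and string parameters as quoted values; the model name and every parameter value shown are arbitrary and chosen only to illustrate the syntax.

```sql
-- Sketch: override the defaults for cost, accuracy, iteration limit,
-- and intercept handling. All values here are illustrative.
SELECT SVM_CLASSIFIER(
       'mySvmTuned', 'mtcars', 'am', 'mpg,cyl,disp,wt,qsec,vs,gear,carb'
       USING PARAMETERS C = 0.5,
                        epsilon = 0.0001,
                        max_iterations = 200,
                        intercept_mode = 'unregularized',
                        intercept_scaling = 1.0);
```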

Model attributes

coeff
Coefficients in the model:
  • colNames: Intercept, or predictor column name

  • coefficients: Coefficient value

nAccepted
Number of samples accepted for training from the data set
nRejected
Number of samples rejected when training
nIteration
Number of iterations used in training
callStr
SQL statement used to replicate the training
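
To inspect these attributes after training, one option is to query the model with the GET_MODEL_SUMMARY and GET_MODEL_ATTRIBUTE functions. Neither function is described in this section, so treat the calls below as a sketch and check their reference pages for your version; the model name is the one trained in the Examples section.

```sql
-- Sketch: full summary of the trained model.
SELECT GET_MODEL_SUMMARY(USING PARAMETERS model_name = 'mySvmClassModel');

-- Sketch: only the coeff attribute (colNames and coefficients).
SELECT GET_MODEL_ATTRIBUTE(USING PARAMETERS model_name = 'mySvmClassModel',
                                            attr_name = 'coeff');
```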

Examples

The following example uses SVM_CLASSIFIER on the mtcars table, excluding hp and drat from the listed predictor columns:


=> SELECT SVM_CLASSIFIER(
       'mySvmClassModel', 'mtcars', 'am', 'mpg,cyl,disp,hp,drat,wt,qsec,vs,gear,carb'
       USING PARAMETERS exclude_columns = 'hp,drat');
SVM_CLASSIFIER
----------------------------------------------------------------
Finished in 15 iterations.
Accepted Rows: 32  Rejected Rows: 0
(1 row)
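
Once the model is trained, it can be applied to rows with a prediction function. The sketch below assumes the companion function PREDICT_SVM_CLASSIFIER, which is not described in this section, and an assumed car_model identifier column in mtcars. It passes the eight predictors that remain after hp and drat are excluded.

```sql
-- Sketch: compare the actual response (am) with the model's prediction.
-- car_model is an assumed column used only to label the rows.
SELECT car_model, am,
       PREDICT_SVM_CLASSIFIER(mpg, cyl, disp, wt, qsec, vs, gear, carb
                              USING PARAMETERS model_name = 'mySvmClassModel') AS predicted_am
FROM mtcars;
```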

See also