This is the multipage printable view of this section.
Click here to print.
Return to the regular view of this page.
SVM (support vector machine) for classification
Support Vector Machine (SVM) is a classification algorithm that assigns data to one category or the other based on the training data.
Support Vector Machine (SVM) is a classification algorithm that assigns data to one category or the other based on the training data. This algorithm implements linear SVM, which is highly scalable.
You can use the following functions to train the SVM model, and use the model to make predictions on a set of test data:
You can also use the following evaluation functions to gain further insights:
For a complete example of how to use the SVM algorithm in Vertica, see Classifying data using SVM (support vector machine).
The implementation of the SVM algorithm in Vertica is based on the paper Distributed Newton Methods for Regularized Logistic Regression.
1  Classifying data using SVM (support vector machine)
This SVM example uses a small data set named mtcars.
This SVM example uses a small data set named mtcars. The example shows how you can use the SVM_CLASSIFIER function to train the model to predict the value of am
(the transmission type, where 0 = automatic and 1 = manual) using the PREDICT_SVM_CLASSIFIER function.
Before you begin the example,
load the Machine Learning sample data.

Create the SVM model, named svm_class
, using the mtcars_train
training data.
=> SELECT SVM_CLASSIFIER('svm_class', 'mtcars_train', 'am', 'cyl, mpg, wt, hp, gear'
USING PARAMETERS exclude_columns='gear');
SVM_CLASSIFIER

Finished in 12 iterations.
Accepted Rows: 20 Rejected Rows: 0
(1 row)

View the summary output of
`svm_class`
.
=> SELECT GET_MODEL_SUMMARY(USING PARAMETERS model_name='svm_class');

=======
details
=======
predictorcoefficient
+
Intercept 0.02006
cyl  0.15367
mpg  0.15698
wt  1.78157
hp  0.00957
===========
call_string
===========
SELECT svm_classifier('public.svm_class', 'mtcars_train', '"am"', 'cyl, mpg, wt, hp, gear'
USING PARAMETERS exclude_columns='gear', C=1, max_iterations=100, epsilon=0.001);
===============
Additional Info
===============
Name Value
+
accepted_row_count 20
rejected_row_count 0
iteration_count  12
(1 row)

Create a new table, named svm_mtcars_predict
. Populate this table with the prediction outputs you obtain from running the PREDICT_SVM_CLASSIFIER
function on your test data.
=> CREATE TABLE svm_mtcars_predict AS
(SELECT car_model, am, PREDICT_SVM_CLASSIFIER(cyl, mpg, wt, hp
USING PARAMETERS model_name='svm_class')
AS Prediction FROM mtcars_test);
CREATE TABLE

View the results in the svm_mtcars_predict
table.
=> SELECT * FROM svm_mtcars_predict;
car_model  am  Prediction
 ++
Toyota Corona  0  1
Camaro Z28  0  0
Datsun 710  1  1
Valiant  0  0
Volvo 142E  1  1
AMC Javelin  0  0
Honda Civic  1  1
Hornet 4 Drive 0  0
Maserati Bora  1  1
Merc 280  0  0
Merc 450SL  0  0
Porsche 9142  1  1
(12 rows)

Evaluate the accuracy of the PREDICT_SVM_CLASSIFIER
function, using the CONFUSION_MATRIX evaluation function.
=> SELECT CONFUSION_MATRIX(obs::int, pred::int USING PARAMETERS num_classes=2) OVER()
FROM (SELECT am AS obs, Prediction AS pred FROM svm_mtcars_predict) AS prediction_output;
class  0  1  comment
+++
0  6  1 
1  0  5  Of 12 rows, 12 were used and 0 were ignored
(2 rows)
In this case, PREDICT_SVM_CLASSIFIER
correctly predicted that the cars with a value of 1
in the am
column have a value of 1
. No cars were incorrectly classified. Out of the seven cars which had a value of 0
in the am
column, six were correctly predicted to have the value 0
. One car was incorrectly classified as having the value 1
.
See also