Random forest for regression
The Random Forest for regression algorithm creates an ensemble model of regression trees. Each tree is trained on a randomly selected subset of the training data. The algorithm predicts the value that is the mean prediction of the individual trees.
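The idea described above can be sketched in a few lines of Python. This is a simplified illustration and not Vertica's implementation: each "tree" is only a depth-1 regression stump, the helper names (`fit_stump`, `fit_forest`, `predict_forest`) are hypothetical, and drawing each subset without replacement is an assumption. The `ntree` and `sampling_size` arguments merely mirror the parameter names of RF_REGRESSOR, and the toy data is made up.

```python
import random
from statistics import mean


def fit_stump(rows, targets):
    """Fit a depth-1 regression tree (a stump) and return a predict(row) function.

    Tries every feature/threshold pair and keeps the split with the
    lowest sum of squared errors around the two leaf means.
    """
    best = None  # (sse, feature_index, threshold, left_mean, right_mean)
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue  # a split must put rows on both sides
            lm, rm = mean(left), mean(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, threshold, lm, rm)
    if best is None:  # no valid split: always predict the sample mean
        fallback = mean(targets)
        return lambda row: fallback
    _, f, threshold, lm, rm = best
    return lambda row: lm if row[f] <= threshold else rm


def fit_forest(rows, targets, ntree=100, sampling_size=0.3, seed=0):
    """Train ntree stumps, each on a random subset of the training rows."""
    rng = random.Random(seed)
    n = max(1, int(len(rows) * sampling_size))
    forest = []
    for _ in range(ntree):
        # Assumption: each subset is drawn without replacement.
        idx = rng.sample(range(len(rows)), n)
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """The ensemble prediction is the mean of the individual tree predictions."""
    return mean(tree(row) for tree in forest)


# Made-up toy data: one predictor, two clearly separated target levels.
rows = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,)]
targets = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]

forest = fit_forest(rows, targets, ntree=25, sampling_size=0.5)
print(predict_forest(forest, (2.5,)))
```

Because every tree sees a different subset, individual trees can disagree. Averaging their outputs smooths those differences, which is why the ensemble prediction always lies between the smallest and largest tree predictions.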
You can use the following functions to train the Random Forest model and use the model to make predictions on a set of test data:
- RF_REGRESSOR
- PREDICT_RF_REGRESSOR
For a complete example of how to use the Random Forest for regression algorithm in Vertica, see Building a random forest regression model.
1 - Building a random forest regression model
This example uses the "mtcars" dataset to create a random forest model to predict the value of carb (the number of carburetors).
Before you begin the example, load the Machine Learning sample data.
- Use RF_REGRESSOR to create the random forest model myRFRegressorModel using the mtcars training data. View the summary output of the model with GET_MODEL_SUMMARY:
=> SELECT RF_REGRESSOR ('myRFRegressorModel', 'mtcars', 'carb', 'mpg, cyl, hp, drat, wt' USING PARAMETERS
ntree=100, sampling_size=0.3);
RF_REGRESSOR
--------------
Finished
(1 row)
=> SELECT GET_MODEL_SUMMARY(USING PARAMETERS model_name='myRFRegressorModel');
--------------------------------------------------------------------------------
===========
call_string
===========
SELECT rf_regressor('public.myRFRegressorModel', 'mtcars', '"carb"', 'mpg, cyl, hp, drat, wt'
USING PARAMETERS exclude_columns='', ntree=100, mtry=1, sampling_size=0.3, max_depth=5, max_breadth=32,
min_leaf_size=5, min_info_gain=0, nbins=32);
=======
details
=======
predictor|type
---------+-----
mpg      |float
cyl      | int
hp       | int
drat     |float
wt       |float
===============
Additional Info
===============
Name |Value
------------------+-----
tree_count | 100
rejected_row_count| 0
accepted_row_count| 32
(1 row)
- Use PREDICT_RF_REGRESSOR to predict the number of carburetors:
=> SELECT PREDICT_RF_REGRESSOR (mpg,cyl,hp,drat,wt
USING PARAMETERS model_name='myRFRegressorModel') FROM mtcars;
PREDICT_RF_REGRESSOR
----------------------
2.94774203574204
2.6954087024087
2.6954087024087
2.89906346431346
2.97688489288489
2.97688489288489
2.7086587024087
2.92078965478965
2.97688489288489
2.7086587024087
2.95621822621823
2.82255155955156
2.7086587024087
2.7086587024087
2.85650394050394
2.85650394050394
2.97688489288489
2.95621822621823
2.6954087024087
2.6954087024087
2.84493251193251
2.97688489288489
2.97688489288489
2.8856467976468
2.6954087024087
2.92078965478965
2.97688489288489
2.97688489288489
2.7934087024087
2.7934087024087
2.7086587024087
2.72469441669442
(32 rows)