ERROR_RATE
Takes an input table and returns a table that reports the rate of incorrect classifications, displayed as FLOAT values. ERROR_RATE returns a table with the following dimensions:
- Rows: Number of classes, plus one row that contains the total error rate across classes
- Columns: 3 (class, error_rate, and comment)
Syntax
ERROR_RATE ( targets, predictions [ USING PARAMETERS num_classes = num-classes ] ) OVER()
Arguments
targets
- An input column that contains the true values of the response variable.
predictions
- An input column that contains the predicted class labels.
Arguments targets and predictions must be set to input columns of the same data type, one of the following: INTEGER, BOOLEAN, or CHAR/VARCHAR. Depending on their data type, these columns identify classes as follows:
- INTEGER: Zero-based consecutive integers between 0 and (num-classes - 1) inclusive, where num-classes is the number of classes. For example, given the input column values {0, 1, 2, 3, 4}, Vertica assumes five classes.
  Note: If input column values are not consecutive, Vertica interpolates the missing values. Thus, given the input values {0, 1, 3, 5, 6}, Vertica assumes seven classes.
- BOOLEAN: Yes or No
- CHAR/VARCHAR: Class names. If the input columns are of type CHAR/VARCHAR, you must also set parameter num_classes to the number of classes.
  Note: Vertica computes the number of classes as the union of values in both input columns. For example, given the following sets of values in the targets and predictions input columns, Vertica counts four classes:
  {'milk', 'soy milk', 'cream'}
  {'soy milk', 'almond milk'}
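As a sketch of the CHAR/VARCHAR case, the following queries assume a hypothetical table named dairy_predictions with VARCHAR columns actual and predicted (neither the table nor the column names come from Vertica), and assume that neither column contains NULLs. The first query counts the distinct labels that appear in either column, which mirrors the union described in the note. The second passes that count to ERROR_RATE through num_classes.

```sql
-- Hypothetical table and columns, for illustration only.
-- Count the distinct class labels that appear in either column.
-- UNION (without ALL) removes duplicate labels.
SELECT COUNT(*) AS label_count
FROM (SELECT actual AS label FROM dairy_predictions
      UNION
      SELECT predicted AS label FROM dairy_predictions) AS labels;

-- With the two sets of values shown above, label_count is 4,
-- so num_classes is set to 4.
SELECT ERROR_RATE(actual, predicted USING PARAMETERS num_classes=4) OVER()
FROM dairy_predictions;
```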
Parameters
num_classes
An integer greater than 1 that specifies the number of classes to pass to the function.
You must set this parameter if the specified input columns are of type CHAR/VARCHAR. Otherwise, the function processes this parameter according to the column data types:
- INTEGER: Set to 2 by default. You must set this parameter explicitly if the number of classes is any other value.
- BOOLEAN: Set to 2 by default. Cannot be set to any other value.
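For INTEGER columns, one way to check which value num_classes needs is to look at the largest label in either column. This sketch assumes a hypothetical table named digit_predictions with INTEGER columns obs and pred. Because labels are zero-based and gaps are filled in, the class count should work out to the largest label plus one.

```sql
-- Hypothetical table and columns, for illustration only.
-- Zero-based labels: the number of classes is the largest label plus one.
SELECT GREATEST(MAX(obs), MAX(pred)) + 1 AS class_count
FROM digit_predictions;

-- If class_count is, for example, 5, override the default of 2.
SELECT ERROR_RATE(obs, pred USING PARAMETERS num_classes=5) OVER()
FROM digit_predictions;
```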
Privileges
Non-superusers: model owner, or USAGE privileges on the model
Examples
This example shows how to execute the ERROR_RATE function on an input table named mtcars. The response variables appear in the column obs, while the prediction variables appear in the column pred. Because this example is a binary classification problem, all response variable values and prediction variable values are either 0 or 1.
In the table returned by the function, the first column displays the class ID. The second column displays the corresponding error rate for that class. The third column indicates how many rows were successfully used by the function and whether any rows were ignored.
=> SELECT ERROR_RATE(obs::int, pred::int USING PARAMETERS num_classes=2) OVER()
    FROM (SELECT am AS obs, PREDICT_LOGISTIC_REG (mpg, cyl, disp, drat, wt, qsec, vs, gear, carb
                 USING PARAMETERS model_name='myLogisticRegModel', type='response') AS pred
          FROM mtcars) AS prediction_output;
 class |     error_rate     |                   comment
-------+--------------------+---------------------------------------------
 0     |                  0 |
 1     | 0.0769230797886848 |
       |            0.03125 | Of 32 rows, 32 were used and 0 were ignored
(3 rows)
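The figures in this output can be cross-checked with ordinary aggregates. The query below is a sketch, not a description of how ERROR_RATE is implemented. It assumes that the per-class rate is the fraction of rows with a given true class whose prediction differs, which is consistent with the values shown: 0.0769 is approximately 1/13, and 0.03125 is exactly 1/32.

```sql
-- Assumption: per-class error rate = share of rows in each true class
-- (obs) whose predicted class (pred) does not match.
SELECT obs AS class,
       AVG(CASE WHEN obs <> pred THEN 1.0 ELSE 0.0 END) AS error_rate,
       COUNT(*) AS rows_in_class
FROM (SELECT am::int AS obs,
             PREDICT_LOGISTIC_REG(mpg, cyl, disp, drat, wt, qsec, vs, gear, carb
                 USING PARAMETERS model_name='myLogisticRegModel', type='response')::int AS pred
      FROM mtcars) AS prediction_output
GROUP BY obs
ORDER BY obs;
```

Under the same assumption, running the AVG expression without the GROUP BY clause gives the total error rate reported in the last row.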