<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OpenText Analytics Database 26.2.x – Model evaluation</title>
    <link>/en/sql-reference/functions/ml-functions/model-evaluation/</link>
    <description>Recent content in Model evaluation on OpenText Analytics Database 26.2.x</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/sql-reference/functions/ml-functions/model-evaluation/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Sql-Reference: CONFUSION_MATRIX</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/confusion-matrix/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/confusion-matrix/</guid>
      <description>
        
        
        &lt;p&gt;Computes the confusion matrix of a table with observed and predicted values of a response variable. &lt;code&gt;CONFUSION_MATRIX&lt;/code&gt; produces a table with the following dimensions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Rows: Number of classes&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Columns: Number of classes + 2&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;CONFUSION_MATRIX ( &lt;span class=&#34;code-variable&#34;&gt;targets&lt;/span&gt;, &lt;span class=&#34;code-variable&#34;&gt;predictions&lt;/span&gt; [ USING PARAMETERS num_classes = &lt;span class=&#34;code-variable&#34;&gt;num-classes&lt;/span&gt; ] ) OVER()
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;arguments&#34;&gt;Arguments&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;An input column that contains the true values of the response variable.&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;predictions&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;An input column that contains the predicted class labels.&lt;/dd&gt;
&lt;/dl&gt;
&lt;p&gt;Arguments &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code&gt;predictions&lt;/code&gt;&lt;/em&gt; must be set to input columns of the same data type, one of the following: INTEGER, BOOLEAN, or CHAR/VARCHAR. Depending on their data type, these columns identify classes as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;INTEGER: Zero-based consecutive integers between 0 and (&lt;em&gt;&lt;code&gt;num-classes&lt;/code&gt;&lt;/em&gt;-1) inclusive, where &lt;em&gt;&lt;code&gt;num-classes&lt;/code&gt;&lt;/em&gt; is the number of classes. For example, given the input column values &lt;code&gt;{0, 1, 2, 3, 4}&lt;/code&gt;, Vertica assumes five classes.&lt;/p&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

If input column values are not consecutive, Vertica interpolates the missing values. Thus, given the input values &lt;code&gt;{0, 1, 3, 5, 6}&lt;/code&gt;, Vertica assumes seven classes.

&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;BOOLEAN: Yes or No&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;CHAR/VARCHAR: Class names. If the input columns are of type CHAR/VARCHAR, you must also set parameter &lt;code&gt;num_classes&lt;/code&gt; to the number of classes.&lt;/p&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

&lt;p&gt;Vertica computes the number of classes as the number of distinct values in the union of both input columns. For example, given the following sets of values in the &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code&gt;predictions&lt;/code&gt;&lt;/em&gt; input columns, Vertica counts four classes:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;{&amp;#39;milk&amp;#39;, &amp;#39;soy milk&amp;#39;, &amp;#39;cream&amp;#39;}
{&amp;#39;soy milk&amp;#39;, &amp;#39;almond milk&amp;#39;}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;
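&lt;p&gt;The class-counting rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the database implementation; the helper names &lt;code&gt;count_integer_classes&lt;/code&gt; and &lt;code&gt;count_label_classes&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
def count_integer_classes(targets, predictions):
    """Integer classes are assumed to run from 0 to the largest value seen,
    so gaps in the data still count as classes."""
    return max(list(targets) + list(predictions)) + 1


def count_label_classes(targets, predictions):
    """For string labels, the class set is the union of both columns."""
    return len(set(targets) | set(predictions))


# Values taken from the notes above, split arbitrarily across two columns.
integer_classes = count_integer_classes([0, 1, 3], [5, 6, 1])
label_classes = count_label_classes(
    ["milk", "soy milk", "cream"], ["soy milk", "almond milk"]
)
print(integer_classes)  # 7
print(label_classes)    # 4
```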

&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;num_classes&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;p&gt;An integer greater than 1 that specifies the number of classes to pass to the function.&lt;/p&gt;
&lt;p&gt;You must set this parameter if the specified input columns are of type CHAR/VARCHAR. Otherwise, the function processes this parameter according to the column data types:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;INTEGER: By default set to 2. You must set this parameter correctly if the number of classes is any other value.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;BOOLEAN: By default set to 2; cannot be set to any other value.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;This example computes the confusion matrix for a logistic regression model that classifies cars in the &lt;code&gt;mtcars&lt;/code&gt; data set as automatic or manual transmission. Observed values are in input column &lt;code&gt;obs&lt;/code&gt;, while predicted values are in input column &lt;code&gt;pred&lt;/code&gt;. Because this is a binary classification problem, all values are either 0 or 1.&lt;/p&gt;
&lt;p&gt;In the table returned, all 19 cars with a value of 0 in column &lt;code&gt;am&lt;/code&gt; are correctly predicted by &lt;code&gt;PREDICT_LOGISTIC_REG&lt;/code&gt; as having a value of 0. Likewise, all 13 cars with a value of 1 in column &lt;code&gt;am&lt;/code&gt; are correctly predicted to have a value of 1:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT CONFUSION_MATRIX(obs::int, pred::int USING PARAMETERS num_classes=2) OVER()
    FROM (SELECT am AS obs, PREDICT_LOGISTIC_REG(mpg, cyl, disp, drat, wt, qsec, vs, gear, carb
             USING PARAMETERS model_name=&amp;#39;myLogisticRegModel&amp;#39;) AS pred
             FROM mtcars) AS prediction_output;

actual_class | predicted_0 | predicted_1 |        comment
-------------+-------------+-------------+------------------------------------------
0            |          19 |           0 |
1            |           0 |          13 | Of 32 rows, 32 were used and 0 were ignored
(2 rows)
&lt;/code&gt;&lt;/pre&gt;
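&lt;p&gt;To make the layout of the returned table concrete, the following Python sketch tallies a confusion matrix from two lists of integer class labels. It is a minimal illustration, not the function&#39;s implementation, and the input lists are made-up labels chosen to reproduce the counts shown above rather than the real &lt;code&gt;mtcars&lt;/code&gt; rows.&lt;/p&gt;

```python
def confusion_matrix(observed, predicted, num_classes=2):
    """Return a num_classes x num_classes grid of counts.

    matrix[a][p] is the number of rows whose actual class is a and whose
    predicted class is p.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for actual, guess in zip(observed, predicted):
        matrix[actual][guess] += 1
    return matrix


# Illustrative labels only: 19 rows of class 0 and 13 rows of class 1,
# every one of them predicted correctly.
observed = [0] * 19 + [1] * 13
predicted = list(observed)
matrix = confusion_matrix(observed, predicted)
for actual_class, row in enumerate(matrix):
    print(actual_class, row)
```

&lt;p&gt;Each printed row corresponds to one &lt;code&gt;actual_class&lt;/code&gt; row of the table, and each position in the row to one &lt;code&gt;predicted_&lt;/code&gt; column.&lt;/p&gt;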
      </description>
    </item>
    
    <item>
      <title>Sql-Reference: CROSS_VALIDATE</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/cross-validate/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/cross-validate/</guid>
      <description>
        
        
&lt;p&gt;Performs k-fold cross validation on a learning algorithm using an input relation, and grid search for hyper parameters. The output is an average performance indicator of the selected algorithm. This function supports SVM classification, Naive Bayes, and logistic regression.&lt;/p&gt;
&lt;p&gt;This is a meta-function. You must call meta-functions in a top-level &lt;a href=&#34;../../../../../en/sql-reference/statements/select/#&#34;&gt;SELECT&lt;/a&gt; statement.&lt;/p&gt;

&lt;h2 id=&#34;behavior-type&#34;&gt;Behavior type&lt;/h2&gt;
&lt;a class=&#34;glosslink&#34; href=&#34;../../../../../en/glossary/volatile-functions/&#34; title=&#34;&#34;&gt;Volatile&lt;/a&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;CROSS_VALIDATE ( &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;algorithm&lt;/span&gt;&amp;#39;, &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;input-relation&lt;/span&gt;&amp;#39;, &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;response-column&lt;/span&gt;&amp;#39;, &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;predictor-columns&lt;/span&gt;&amp;#39;
        [ USING PARAMETERS
              [exclude_columns = &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;excluded-columns&lt;/span&gt;&amp;#39;]
           [, cv_model_name = &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;model&lt;/span&gt;&amp;#39;]
           [, cv_metrics = &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;metrics&lt;/span&gt;&amp;#39;]
           [, cv_fold_count = &lt;span class=&#34;code-variable&#34;&gt;num-folds&lt;/span&gt;]
           [, cv_hyperparams = &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;hyperparams&lt;/span&gt;&amp;#39;]
           [, cv_prediction_cutoff = &lt;span class=&#34;code-variable&#34;&gt;prediction-cutoff&lt;/span&gt;] ] )
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;arguments&#34;&gt;Arguments&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;algorithm&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;Name of the algorithm training function, one of the following:
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/linear-reg/#&#34;&gt;LINEAR_REG&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/logistic-reg/#&#34;&gt;LOGISTIC_REG&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/naive-bayes/#&#34;&gt;NAIVE_BAYES&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/svm-classifier/#&#34;&gt;SVM_CLASSIFIER&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/svm-regressor/#&#34;&gt;SVM_REGRESSOR&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;input-relation&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;The table or view that contains data used for training and testing. If the input relation is defined in Hive, use 
&lt;code&gt;&lt;a href=&#34;../../../../../en/sql-reference/functions/hadoop-functions/sync-with-hcatalog-schema/#&#34;&gt;SYNC_WITH_HCATALOG_SCHEMA&lt;/a&gt;&lt;/code&gt; to sync the &lt;code&gt;hcatalog&lt;/code&gt; schema, and then run the machine learning function.&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;response-column&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;Name of the input column that contains the response.&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;predictor-columns&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;p&gt;Comma-separated list of columns in the input relation that represent independent variables for the model, or asterisk (*) to select all columns. If you select all columns, the argument list for parameter &lt;code&gt;exclude_columns&lt;/code&gt; must include &lt;em&gt;&lt;code&gt;response-column&lt;/code&gt;&lt;/em&gt;, and any columns that are invalid as predictor columns.&lt;/p&gt;
&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;exclude_columns&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Comma-separated list of columns from &lt;em&gt;&lt;code&gt;predictor-columns&lt;/code&gt;&lt;/em&gt; to exclude from processing.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;cv_model_name&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The name of a model that lets you retrieve results of the cross validation process. If you omit this parameter, results are displayed but not saved. If you set this parameter to a model name, you can retrieve the results with summary functions 
&lt;code&gt;&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/model-management/get-model-attribute/#&#34;&gt;GET_MODEL_ATTRIBUTE&lt;/a&gt;&lt;/code&gt; and 
&lt;code&gt;&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/model-management/get-model-summary/#&#34;&gt;GET_MODEL_SUMMARY&lt;/a&gt;&lt;/code&gt;.&lt;/dd&gt;
&lt;dt&gt;
&lt;code&gt;&lt;a name=&#34;metricNames&#34;&gt;&lt;/a&gt;cv_metrics&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The metrics used to assess the algorithm, specified either as a comma-separated list of metric names or in a &lt;a href=&#34;#Specifyi&#34;&gt;JSON array&lt;/a&gt;. In both cases, you specify one or more of the following metric names:
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;accuracy&lt;/code&gt; (default)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;error_rate&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;TP&lt;/code&gt;: True positive, the number of cases of class 1 predicted as class 1&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;FP&lt;/code&gt;: False positive, the number of cases of class 0 predicted as class 1&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;TN&lt;/code&gt;: True negative, the number of cases of class 0 predicted as class 0&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;FN&lt;/code&gt;: False negative, the number of cases of class 1 predicted as class 0&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;TPR&lt;/code&gt; or &lt;code&gt;recall&lt;/code&gt;: True positive rate, the correct predictions among class 1&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;FPR&lt;/code&gt;: False positive rate, the wrong predictions among class 0&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;TNR&lt;/code&gt;: True negative rate, the correct predictions among class 0&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;FNR&lt;/code&gt;: False negative rate, the wrong predictions among class 1&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;PPV&lt;/code&gt; or &lt;code&gt;precision&lt;/code&gt;: The positive predictive value, the correct predictions among cases predicted as class 1&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;NPV&lt;/code&gt;: Negative predictive value, the correct predictions among cases predicted as class 0&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;MSE&lt;/code&gt;: Mean squared error&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;../../../../../images/machine-learning/mean-squared-error.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;MAE&lt;/code&gt;: Mean absolute error&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;../../../../../images/machine-learning/mean-absolute-error.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;rsquared&lt;/code&gt;: coefficient of determination&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;../../../../../images/machine-learning/rsquared.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;explained_variance&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;../../../../../images/machine-learning/explained-variance.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;fscore&lt;/code&gt;&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;(1 + beta^2) * precision * recall / (beta^2 * precision + recall)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;beta equals 1 by default&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;auc_roc&lt;/code&gt;: AUC of ROC using the specified number of bins, by default 100&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;auc_prc&lt;/code&gt;: AUC of PRC using the specified number of bins, by default 100&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;counts&lt;/code&gt;: Shortcut that resolves to four other metrics: &lt;code&gt;TP&lt;/code&gt;, &lt;code&gt;FP&lt;/code&gt;, &lt;code&gt;TN&lt;/code&gt;, and &lt;code&gt;FN&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;count&lt;/code&gt;: Valid only in JSON syntax, counts the number of cases labeled by one class (&lt;em&gt;&lt;code&gt;case-class-label&lt;/code&gt;&lt;/em&gt;) but predicted as another class (&lt;em&gt;&lt;code&gt;predicted-class-label&lt;/code&gt;&lt;/em&gt;):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;cv_metrics=&amp;#39;[{&amp;#34;count&amp;#34;:[&lt;span class=&#34;code-variable&#34;&gt;case-class-label&lt;/span&gt;, &lt;span class=&#34;code-variable&#34;&gt;predicted-class-label&lt;/span&gt;]}]&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;cv_fold_count&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The number of folds into which the data is split.
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; 5&lt;/p&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;cv_hyperparams&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;A JSON string that describes the combination of parameters for use in grid search of hyper parameters. The JSON string contains pairs of hyper parameter names and values. The value of each hyper parameter can be specified as an array or as a sequence. For example:
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;{&amp;#34;param1&amp;#34;:[value1,value2,...], &amp;#34;param2&amp;#34;:{&amp;#34;first&amp;#34;:first_value, &amp;#34;step&amp;#34;:step_size, &amp;#34;count&amp;#34;:number_of_values} }
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Hyper parameter names and string values should be quoted using the JSON standard. These parameters are passed to the training function.&lt;/p&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;cv_prediction_cutoff&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The cutoff threshold that is passed to the prediction stage of logistic regression, a FLOAT between 0 and 1, exclusive.
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; 0.5&lt;/p&gt;
&lt;/dd&gt;
&lt;/dl&gt;
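&lt;p&gt;As a rough illustration of how a &lt;code&gt;cv_hyperparams&lt;/code&gt; string describes a search grid, the Python sketch below expands both value forms into explicit parameter combinations. It assumes that the sequence form means &lt;code&gt;count&lt;/code&gt; values starting at &lt;code&gt;first&lt;/code&gt; and increasing by &lt;code&gt;step&lt;/code&gt;; the helper names and the &lt;code&gt;param2&lt;/code&gt; hyper parameter are hypothetical.&lt;/p&gt;

```python
import itertools
import json


def expand_values(spec):
    """Turn one hyper parameter spec into an explicit list of values."""
    if isinstance(spec, dict):
        # Sequence form (assumed meaning): first, first + step, ...,
        # for a total of `count` values.
        return [spec["first"] + i * spec["step"] for i in range(spec["count"])]
    return list(spec)  # Array form: the values are already listed.


def expand_grid(hyperparams_json):
    """Return every combination of hyper parameter values as a list of dicts."""
    specs = json.loads(hyperparams_json)
    names = list(specs)
    value_lists = [expand_values(specs[name]) for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


grid = expand_grid('{"C": [1, 5], "param2": {"first": 1, "step": 2, "count": 3}}')
for combination in grid:
    print(combination)
```

&lt;p&gt;Under these assumptions the grid holds six combinations (two values of &lt;code&gt;C&lt;/code&gt; times three values of &lt;code&gt;param2&lt;/code&gt;), each of which would be cross validated separately.&lt;/p&gt;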
&lt;p&gt;&lt;a name=&#34;ModelAttributes&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;model-attributes&#34;&gt;Model attributes&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;call_string&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The value of all input arguments that were specified at the time &lt;code&gt;CROSS_VALIDATE&lt;/code&gt; was called.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;run_average&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The average across all folds of all metrics specified in parameter &lt;code&gt;cv_metrics&lt;/code&gt;, if specified; otherwise, average accuracy.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;fold_info&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The number of rows in each fold:
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;fold_id&lt;/code&gt;: The index of the fold.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;row_count&lt;/code&gt;: The number of rows held out for testing in the fold.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;counters&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;All counters for the function, including:
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;accepted_row_count&lt;/code&gt;: The total number of rows in the &lt;code&gt;input_relation&lt;/code&gt;, minus the number of rejected rows.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;rejected_row_count&lt;/code&gt;: The number of rows of the &lt;code&gt;input_relation&lt;/code&gt; that were skipped because they contained an invalid value.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;feature_count&lt;/code&gt;: The number of features input to the machine learning model.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;run_details&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Information about each run, where a run means training a single model, and then testing that model on the one held-out fold:
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;fold_id&lt;/code&gt;: The index of the fold held out for testing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;iteration_count&lt;/code&gt;: The number of iterations used in model training on non-held-out folds.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;accuracy&lt;/code&gt;: All metrics specified in parameter &lt;code&gt;cv_metrics&lt;/code&gt;, or accuracy if &lt;code&gt;cv_metrics&lt;/code&gt; is not provided.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;error_rate&lt;/code&gt;: All metrics specified in parameter &lt;code&gt;cv_metrics&lt;/code&gt;, or accuracy if the parameter is omitted.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;/dl&gt;
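&lt;p&gt;The relationship between &lt;code&gt;fold_info&lt;/code&gt;, &lt;code&gt;run_details&lt;/code&gt;, and &lt;code&gt;run_average&lt;/code&gt; can be pictured with a small Python sketch. The round-robin fold assignment and the per-fold accuracy values are assumptions made purely for illustration; the documentation does not state how rows are assigned to folds.&lt;/p&gt;

```python
def assign_folds(row_count, fold_count=5):
    """Assign each row index to a fold in round-robin order (an assumption)."""
    return [row % fold_count for row in range(row_count)]


def run_average(per_fold_metric):
    """Average one metric across all runs, one run per held-out fold."""
    return sum(per_fold_metric) / len(per_fold_metric)


# 32 rows split into 6 folds; each fold is held out once for testing.
folds = assign_folds(row_count=32, fold_count=6)
fold_info = {fold_id: folds.count(fold_id) for fold_id in range(6)}
print(fold_info)

# Made-up accuracy values, one per run.
average = run_average([0.8, 0.6, 1.0, 0.8, 0.6, 0.8])
print(average)
```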
&lt;h2 id=&#34;privileges&#34;&gt;Privileges&lt;/h2&gt;
&lt;p&gt;Non-superusers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;SELECT privileges on the input relation&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;CREATE and USAGE privileges on the default schema where machine learning algorithms generate models. If &lt;code&gt;cv_model_name&lt;/code&gt; is provided, the cross validation results are saved as a model in the same schema.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a name=&#34;Specifyi&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;specifying-metrics-in-json&#34;&gt;Specifying metrics in JSON&lt;/h2&gt;
&lt;p&gt;Parameter &lt;code&gt;cv_metrics&lt;/code&gt; can specify metrics as an array of &lt;a href=&#34;https://www.w3schools.com/js/js_json_arrays.asp&#34;&gt;JSON objects&lt;/a&gt;, where each object specifies a metric name. For example, the following expression sets &lt;code&gt;cv_metrics&lt;/code&gt; to two metrics specified as JSON objects, &lt;code&gt;accuracy&lt;/code&gt; and &lt;code&gt;error_rate&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;cv_metrics=&amp;#39;[&amp;#34;accuracy&amp;#34;, &amp;#34;error_rate&amp;#34;]&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In the next example, &lt;code&gt;cv_metrics&lt;/code&gt; is set to two metrics, &lt;code&gt;accuracy&lt;/code&gt; and &lt;code&gt;TPR&lt;/code&gt; (true positive rate). Here, the &lt;code&gt;TPR&lt;/code&gt; metric is specified as a JSON object that takes an array of two class label arguments, 2 and 3:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;cv_metrics=&amp;#39;[ &amp;#34;accuracy&amp;#34;, {&amp;#34;TPR&amp;#34;:[2,3] } ]&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Metrics specified as JSON objects can accept parameters. In the following example, the &lt;code&gt;fscore&lt;/code&gt; metric specifies parameter &lt;code&gt;beta&lt;/code&gt;, which is set to 0.5:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;cv_metrics=&amp;#39;[ {&amp;#34;fscore&amp;#34;:{&amp;#34;beta&amp;#34;:0.5} } ]&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Parameter support can be especially useful for certain metrics. For example, metrics &lt;code&gt;auc_roc&lt;/code&gt; and &lt;code&gt;auc_prc&lt;/code&gt; build a curve, and then compute the area under that curve. For &lt;code&gt;ROC&lt;/code&gt;, the curve is formed by plotting metrics &lt;code&gt;TPR&lt;/code&gt; against &lt;code&gt;FPR&lt;/code&gt;; for &lt;code&gt;PRC&lt;/code&gt;, &lt;code&gt;PPV&lt;/code&gt; (&lt;code&gt;precision&lt;/code&gt;) against &lt;code&gt;TPR&lt;/code&gt; (&lt;code&gt;recall&lt;/code&gt;). The accuracy of such curves can be increased by setting parameter &lt;code&gt;num_bins&lt;/code&gt; to a value greater than the default value of 100. For example, the following expression computes AUC for an ROC curve built with 1000 bins:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;cv_metrics=&amp;#39;[{&amp;#34;auc_roc&amp;#34;:{&amp;#34;num_bins&amp;#34;:1000}}]&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;using-metrics-with-multi-class-classifier-functions&#34;&gt;Using metrics with multi-class classifier functions&lt;/h2&gt;
&lt;p&gt;All supported metrics are defined for binary classifier functions 
&lt;code&gt;&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/logistic-reg/#&#34;&gt;LOGISTIC_REG&lt;/a&gt;&lt;/code&gt; and 
&lt;code&gt;&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/svm-classifier/#&#34;&gt;SVM_CLASSIFIER&lt;/a&gt;&lt;/code&gt;. For multi-class classifier functions such as 
&lt;code&gt;&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/naive-bayes/#&#34;&gt;NAIVE_BAYES&lt;/a&gt;&lt;/code&gt;, these metrics can be calculated for each &lt;em&gt;one-versus-the-rest&lt;/em&gt; binary classifier. Use arguments to request the metrics for each classifier. For example, if training data has integer class labels, you can set &lt;code&gt;cv_metrics&lt;/code&gt; with the &lt;code&gt;precision&lt;/code&gt; (&lt;code&gt;PPV&lt;/code&gt;) metric as follows:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;cv_metrics=&amp;#39;[{&amp;#34;precision&amp;#34;:[0,4]}]&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This setting specifies to return two columns with precision computed for two classifiers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Column 1: classifies 0 versus not 0&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Column 2: classifies 4 versus not 4&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
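&lt;p&gt;A one-versus-the-rest metric treats the chosen class label as class 1 and every other label as class 0. The Python sketch below shows that idea for precision with labels 0 and 4. It is an illustration only: the label lists are made up, and returning 0.0 when a class is never predicted is an assumption, not documented behavior.&lt;/p&gt;

```python
def one_vs_rest_counts(observed, predicted, label):
    """Count TP, FP, TN, FN when `label` is class 1 and all others are class 0."""
    tp = fp = tn = fn = 0
    for actual, guess in zip(observed, predicted):
        if guess == label and actual == label:
            tp += 1
        elif guess == label:
            fp += 1
        elif actual == label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def precision(observed, predicted, label):
    """PPV: the correct predictions among cases predicted as `label`."""
    tp, fp, _, _ = one_vs_rest_counts(observed, predicted, label)
    # Assumption: report 0.0 when the class is never predicted.
    return tp / (tp + fp) if tp + fp else 0.0


# Made-up labels for five classes (0 through 4).
observed = [0, 0, 1, 2, 3, 4, 4, 4]
predicted = [0, 1, 1, 2, 4, 4, 4, 0]
precisions = [precision(observed, predicted, label) for label in (0, 4)]
print(precisions)
```

&lt;p&gt;The two values in the printed list play the role of the two returned columns: the first for 0 versus not 0, the second for 4 versus not 4.&lt;/p&gt;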
&lt;p&gt;If you omit class label arguments, the class with index 1 is used. Instead of computing metrics for individual &lt;em&gt;&lt;code&gt;one-versus-the-rest&lt;/code&gt;&lt;/em&gt; classifiers, you can compute their average in one of the following styles: &lt;code&gt;macro&lt;/code&gt;, &lt;code&gt;micro&lt;/code&gt;, or &lt;code&gt;weighted&lt;/code&gt; (default). For example, the following &lt;code&gt;cv_metrics&lt;/code&gt; setting returns the average weighted by class sizes:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;cv_metrics=&amp;#39;[{&amp;#34;precision&amp;#34;:{&amp;#34;avg&amp;#34;:&amp;#34;weighted&amp;#34;}}]&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;AUC-type metrics can be similarly defined for multi-class classifiers. For example, the following &lt;code&gt;cv_metrics&lt;/code&gt; setting computes the area under the ROC curve for each &lt;em&gt;&lt;code&gt;one-versus-the-rest&lt;/code&gt;&lt;/em&gt; classifier, and then returns the average weighted by class sizes.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;cv_metrics=&amp;#39;[{&amp;#34;auc_roc&amp;#34;:{&amp;#34;avg&amp;#34;:&amp;#34;weighted&amp;#34;, &amp;#34;num_bins&amp;#34;:1000}}]&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT CROSS_VALIDATE(&amp;#39;svm_classifier&amp;#39;, &amp;#39;mtcars&amp;#39;, &amp;#39;am&amp;#39;, &amp;#39;mpg&amp;#39;
      USING PARAMETERS cv_fold_count= 6,
                       cv_hyperparams=&amp;#39;{&amp;#34;C&amp;#34;:[1,5]}&amp;#39;,
                       cv_model_name=&amp;#39;cv_svm&amp;#39;,
                       cv_metrics=&amp;#39;accuracy, error_rate&amp;#39;);
         CROSS_VALIDATE
----------------------------
 Finished

===========
run_average
===========
C  |accuracy      |error_rate
---+--------------+----------
1 | 0.75556       |  0.24444
5 | 0.78333       |  0.21667
(1 row)
&lt;/code&gt;&lt;/pre&gt;
      </description>
    </item>
    
    <item>
      <title>Sql-Reference: ERROR_RATE</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/error-rate/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/error-rate/</guid>
      <description>
        
        
        &lt;p&gt;Using an input table, returns a table that calculates the rate of incorrect classifications and displays them as FLOAT values. &lt;code&gt;ERROR_RATE&lt;/code&gt; returns a table with the following dimensions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Rows: Number of classes plus one row that contains the total error rate across classes&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Columns: 3 (class, error rate, and comment)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;ERROR_RATE ( &lt;span class=&#34;code-variable&#34;&gt;targets&lt;/span&gt;, &lt;span class=&#34;code-variable&#34;&gt;predictions&lt;/span&gt; [ USING PARAMETERS num_classes = &lt;span class=&#34;code-variable&#34;&gt;num-classes&lt;/span&gt; ] ) OVER()
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;arguments&#34;&gt;Arguments&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;An input column that contains the true values of the response variable.&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;predictions&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;An input column that contains the predicted class labels.&lt;/dd&gt;
&lt;/dl&gt;
&lt;p&gt;Arguments &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code&gt;predictions&lt;/code&gt;&lt;/em&gt; must be set to input columns of the same data type, one of the following: INTEGER, BOOLEAN, or CHAR/VARCHAR. Depending on their data type, these columns identify classes as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;INTEGER: Zero-based consecutive integers between 0 and (&lt;em&gt;&lt;code&gt;num-classes&lt;/code&gt;&lt;/em&gt;-1) inclusive, where &lt;em&gt;&lt;code&gt;num-classes&lt;/code&gt;&lt;/em&gt; is the number of classes. For example, given the input column values &lt;code&gt;{0, 1, 2, 3, 4}&lt;/code&gt;, Vertica assumes five classes.&lt;/p&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

If input column values are not consecutive, Vertica interpolates the missing values. Thus, given the input values &lt;code&gt;{0, 1, 3, 5, 6}&lt;/code&gt;, Vertica assumes seven classes.

&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;BOOLEAN: Yes or No&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;CHAR/VARCHAR: Class names. If the input columns are of type CHAR/VARCHAR, you must also set parameter &lt;code&gt;num_classes&lt;/code&gt; to the number of classes.&lt;/p&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

&lt;p&gt;Vertica computes the number of classes as the number of distinct values in the union of both input columns. For example, given the following sets of values in the &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&lt;code&gt;predictions&lt;/code&gt;&lt;/em&gt; input columns, Vertica counts four classes:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;{&amp;#39;milk&amp;#39;, &amp;#39;soy milk&amp;#39;, &amp;#39;cream&amp;#39;}
{&amp;#39;soy milk&amp;#39;, &amp;#39;almond milk&amp;#39;}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;num_classes&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;p&gt;An integer greater than 1 that specifies the number of classes to pass to the function.&lt;/p&gt;
&lt;p&gt;You must set this parameter if the specified input columns are of type CHAR/VARCHAR. Otherwise, the function processes this parameter according to the column data types:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;INTEGER: By default set to 2. You must set this parameter correctly if the number of classes is any other value.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;BOOLEAN: By default set to 2; cannot be set to any other value.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;privileges&#34;&gt;Privileges&lt;/h2&gt;
&lt;p&gt;Non-superusers: model owner, or USAGE privileges on the model&lt;/p&gt;

&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;This example shows how to execute the &lt;code&gt;ERROR_RATE&lt;/code&gt; function on an input table named &lt;code&gt;mtcars&lt;/code&gt;. The response variables appear in the column &lt;code&gt;obs&lt;/code&gt;, while the prediction variables appear in the column &lt;code&gt;pred&lt;/code&gt;. Because this example is a binary classification problem, all response variable values and prediction variable values are either 0 or 1.&lt;/p&gt;
&lt;p&gt;In the table returned by the function, the first column displays the class id. The second column displays the corresponding error rate for that class. The third column indicates how many rows were successfully used by the function and whether any rows were ignored.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ERROR_RATE(obs::int, pred::int USING PARAMETERS num_classes=2) OVER()
    FROM (SELECT am AS obs, PREDICT_LOGISTIC_REG (mpg, cyl, disp, drat, wt, qsec, vs, gear, carb
                USING PARAMETERS model_name=&amp;#39;myLogisticRegModel&amp;#39;, type=&amp;#39;response&amp;#39;) AS pred
             FROM mtcars) AS prediction_output;
 class |     error_rate     |                   comment
-------+--------------------+---------------------------------------------
     0 |                  0 |
     1 | 0.0769230797886848 |
       |            0.03125 | Of 32 rows, 32 were used and 0 were ignored
(3 rows)
&lt;/code&gt;&lt;/pre&gt;
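&lt;p&gt;As a rough cross-check, per-class rates like the ones above can be approximated in plain SQL. This is a minimal sketch, not the function&#39;s implementation. It assumes a hypothetical table &lt;code&gt;class_predictions&lt;/code&gt; with INTEGER columns &lt;code&gt;obs&lt;/code&gt; and &lt;code&gt;pred&lt;/code&gt;, and it assumes that the error rate of a class is the fraction of rows observed in that class whose prediction differs from the observed value.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- class_predictions is a hypothetical table, not part of the example above
=&amp;gt; SELECT obs AS class,
          AVG(CASE WHEN obs &amp;lt;&amp;gt; pred THEN 1.0 ELSE 0.0 END) AS error_rate
   FROM class_predictions
   GROUP BY obs
   ORDER BY obs;
&lt;/code&gt;&lt;/pre&gt;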
      </description>
    </item>
    
    <item>
      <title>Sql-Reference: LIFT_TABLE</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/lift-table/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/lift-table/</guid>
      <description>
        
        
        &lt;p&gt;Returns a table that measures the predictive quality of a machine learning model. This table is also known as a &lt;em&gt;&lt;code&gt;lift chart&lt;/code&gt;&lt;/em&gt;.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;LIFT_TABLE ( &lt;span class=&#34;code-variable&#34;&gt;targets&lt;/span&gt;, &lt;span class=&#34;code-variable&#34;&gt;probabilities&lt;/span&gt;
        [ USING PARAMETERS [num_bins = &lt;span class=&#34;code-variable&#34;&gt;num-bins&lt;/span&gt;] [, main_class = &lt;span class=&#34;code-variable&#34;&gt;class-name&lt;/span&gt; ] ] )
OVER()
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;arguments&#34;&gt;Arguments&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;An input column that contains the true values of the response variable, one of the following data types: INTEGER, BOOLEAN, or CHAR/VARCHAR. Depending on the column data type, the function processes column data as follows:
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;INTEGER: Uses the column values directly as the true values of the response variable.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;BOOLEAN: Resolves Yes to 1, No to 0.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;CHAR/VARCHAR: Resolves the value specified by parameter &lt;code&gt;main_class&lt;/code&gt; to 1, all other values to 0.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

If the input column is of data type INTEGER or BOOLEAN, the function ignores parameter &lt;code&gt;main_class&lt;/code&gt;.

&lt;/div&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;probabilities&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;A FLOAT input column that contains the predicted probability that the response belongs to the main class. If &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; is of type INTEGER, the main class is 1.&lt;/dd&gt;
&lt;/dl&gt;

&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;num_bins&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;p&gt;An integer value that determines the number of decision boundaries. Decision boundaries are set at equally spaced intervals between 0 and 1, inclusive. The function computes the table at each of the &lt;em&gt;&lt;code&gt;num-bins&lt;/code&gt;&lt;/em&gt; + 1 points.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default&lt;/strong&gt;: 100&lt;/p&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;main_class&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;p&gt;Used only if &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; is of type CHAR/VARCHAR, specifies the class to associate with the &lt;em&gt;&lt;code&gt;probabilities&lt;/code&gt;&lt;/em&gt; argument.&lt;/p&gt;
&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;Execute &lt;code&gt;LIFT_TABLE&lt;/code&gt; on an input table &lt;code&gt;mtcars&lt;/code&gt;.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT LIFT_TABLE(obs::int, prob::float USING PARAMETERS num_bins=2) OVER()
    FROM (SELECT am AS obs, PREDICT_LOGISTIC_REG(mpg, cyl, disp, drat, wt, qsec, vs, gear, carb
                                                    USING PARAMETERS model_name=&amp;#39;myLogisticRegModel&amp;#39;,
                                                    type=&amp;#39;probability&amp;#39;) AS prob
             FROM mtcars) AS prediction_output;
 decision_boundary | positive_prediction_ratio |       lift       |                   comment
-------------------+---------------------------+------------------+---------------------------------------------
                 1 |                         0 |              NaN |
               0.5 |                   0.40625 | 2.46153846153846 |
                 0 |                         1 |                1 | Of 32 rows, 32 were used and 0 were ignored
(3 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The first column, &lt;code&gt;decision_boundary&lt;/code&gt;, indicates the cut-off point for whether to classify a response as 0 or 1. For instance, for each row, if &lt;code&gt;prob&lt;/code&gt; is greater than or equal to &lt;code&gt;decision_boundary&lt;/code&gt;, the response is classified as 1. If &lt;code&gt;prob&lt;/code&gt; is less than &lt;code&gt;decision_boundary&lt;/code&gt;, the response is classified as 0.&lt;/p&gt;
&lt;p&gt;The second column, &lt;code&gt;positive_prediction_ratio&lt;/code&gt;, shows the fraction of all rows that the function classified as 1 using the corresponding &lt;code&gt;decision_boundary&lt;/code&gt; value. In this example, 0.40625 corresponds to 13 of the 32 rows.&lt;/p&gt;
&lt;p&gt;For the third column, &lt;code&gt;lift&lt;/code&gt;, the function divides the fraction of class 1 rows that were correctly classified as 1 by the &lt;code&gt;positive_prediction_ratio&lt;/code&gt;. When no rows are classified as 1, as at a &lt;code&gt;decision_boundary&lt;/code&gt; of 1 in this example, the result is NaN.&lt;/p&gt;
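&lt;p&gt;When &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; is a CHAR/VARCHAR column, &lt;code&gt;main_class&lt;/code&gt; names the class that the probabilities refer to. The following is a minimal sketch that follows the syntax shown earlier. The table &lt;code&gt;churn_predictions&lt;/code&gt;, its VARCHAR column &lt;code&gt;outcome&lt;/code&gt;, its FLOAT column &lt;code&gt;prob_yes&lt;/code&gt;, and the class label &lt;code&gt;yes&lt;/code&gt; are all hypothetical:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- churn_predictions, outcome, prob_yes, and the label &amp;#39;yes&amp;#39; are hypothetical names
=&amp;gt; SELECT LIFT_TABLE(outcome, prob_yes USING PARAMETERS num_bins=4, main_class=&amp;#39;yes&amp;#39;) OVER()
   FROM churn_predictions;
&lt;/code&gt;&lt;/pre&gt;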

      </description>
    </item>
    
    <item>
      <title>Sql-Reference: MSE</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/mse/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/mse/</guid>
      <description>
        
        
        &lt;p&gt;Returns a table that displays the mean squared error of the prediction and response columns in a machine learning model.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;MSE ( &lt;span class=&#34;code-variable&#34;&gt;targets&lt;/span&gt;, &lt;span class=&#34;code-variable&#34;&gt;predictions&lt;/span&gt; ) OVER()
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;arguments&#34;&gt;Arguments&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;The model response variable, of type FLOAT.&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;predictions&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;A FLOAT input column that contains predicted values for the response variable.&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;Execute the MSE function on input table &lt;code&gt;faithful_testing&lt;/code&gt;. The observed values of the response variable appear in the column &lt;code&gt;obs&lt;/code&gt;, while the predicted values appear in the column &lt;code&gt;prediction&lt;/code&gt;.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT MSE(obs, prediction) OVER()
   FROM (SELECT eruptions AS obs,
                PREDICT_LINEAR_REG (waiting USING PARAMETERS model_name=&amp;#39;myLinearRegModel&amp;#39;) AS prediction
         FROM faithful_testing) AS prediction_output;
        mse        |                   Comments
-------------------+-----------------------------------------------
 0.252925741352641 | Of 110 rows, 110 were used and 0 were ignored
(1 row)
&lt;/code&gt;&lt;/pre&gt;
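&lt;p&gt;The reported value can be cross-checked with ordinary aggregate functions. The following is a minimal sketch rather than the function&#39;s implementation. It reuses the subquery from the example above and assumes that no rows are ignored, in which case the result should agree with &lt;code&gt;mse&lt;/code&gt; up to floating-point rounding:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- mse_check is an illustrative alias: the mean of the squared differences
=&amp;gt; SELECT AVG(POWER(obs - prediction, 2)) AS mse_check
   FROM (SELECT eruptions AS obs,
                PREDICT_LINEAR_REG (waiting USING PARAMETERS model_name=&amp;#39;myLinearRegModel&amp;#39;) AS prediction
         FROM faithful_testing) AS prediction_output;
&lt;/code&gt;&lt;/pre&gt;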
      </description>
    </item>
    
    <item>
      <title>Sql-Reference: PRC</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/prc/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/prc/</guid>
      <description>
        
        
        &lt;p&gt;Returns a table that displays the points on a precision-recall (PR) curve.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;PRC ( &lt;span class=&#34;code-variable&#34;&gt;targets&lt;/span&gt;, &lt;span class=&#34;code-variable&#34;&gt;probabilities&lt;/span&gt;
       [ USING PARAMETERS
             [num_bins = &lt;span class=&#34;code-variable&#34;&gt;num-bins&lt;/span&gt;]
             [, f1_score = &lt;span class=&#34;code-variable&#34;&gt;return-score&lt;/span&gt; ]
             [, main_class = &lt;span class=&#34;code-variable&#34;&gt;class-name&lt;/span&gt; ] ] )
OVER()
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;arguments&#34;&gt;Arguments&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;An input column that contains the true values of the response variable, one of the following data types: INTEGER, BOOLEAN, or CHAR/VARCHAR. Depending on the column data type, the function processes column data as follows:
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;INTEGER: Uses the column values directly as the true values of the response variable.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;BOOLEAN: Resolves Yes to 1, No to 0.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;CHAR/VARCHAR: Resolves the value specified by parameter &lt;code&gt;main_class&lt;/code&gt; to 1, all other values to 0.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

If the input column is of data type INTEGER or BOOLEAN, the function ignores parameter &lt;code&gt;main_class&lt;/code&gt;.

&lt;/div&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;probabilities&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;A FLOAT input column that contains the predicted probability that the response belongs to the main class. If &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; is of type INTEGER, the main class is 1.&lt;/dd&gt;
&lt;/dl&gt;

&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;num_bins&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;p&gt;An integer value that determines the number of decision boundaries. Decision boundaries are set at equally spaced intervals between 0 and 1, inclusive. The function computes the table at each of the &lt;em&gt;&lt;code&gt;num-bins&lt;/code&gt;&lt;/em&gt; + 1 points.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default&lt;/strong&gt;: 100&lt;/p&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;f1_score&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;A Boolean that specifies whether to return a column that contains the F1 score, the harmonic mean of the precision and recall measures. An F1 score reaches its best value at 1 (perfect precision and recall) and its worst at 0.
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; false&lt;/p&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;main_class&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;p&gt;Used only if &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; is of type CHAR/VARCHAR, specifies the class to associate with the &lt;em&gt;&lt;code&gt;probabilities&lt;/code&gt;&lt;/em&gt; argument.&lt;/p&gt;
&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;Execute the PRC function on an input table named &lt;code&gt;mtcars&lt;/code&gt;. The observed values of the response variable appear in the column &lt;code&gt;obs&lt;/code&gt;, while the predicted probabilities appear in the column &lt;code&gt;prob&lt;/code&gt;.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT PRC(obs::int, prob::float USING PARAMETERS num_bins=2, f1_score=true) OVER()
    FROM (SELECT am AS obs,
                    PREDICT_LOGISTIC_REG (mpg, cyl, disp, drat, wt, qsec, vs, gear, carb
                          USING PARAMETERS model_name=&amp;#39;myLogisticRegModel&amp;#39;,
                                           type=&amp;#39;probability&amp;#39;) AS prob
             FROM mtcars) AS prediction_output;
decision_boundary | recall | precision |     f1_score      |     comment
------------------+--------+-----------+-------------------+--------------------------------------------
0                 |      1 |   0.40625 | 0.577777777777778 |
0.5               |      1 |         1 |                 1 | Of 32 rows, 32 were used and 0 were ignored
(2 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The first column, &lt;code&gt;decision_boundary&lt;/code&gt;, indicates the cut-off point for whether to classify a response as 0 or 1. For example, in each row, if the probability is equal to or greater than &lt;code&gt;decision_boundary&lt;/code&gt;, the response is classified as 1. If the probability is less than &lt;code&gt;decision_boundary&lt;/code&gt;, the response is classified as 0.&lt;/p&gt;
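&lt;p&gt;The &lt;code&gt;f1_score&lt;/code&gt; column is the harmonic mean of &lt;code&gt;precision&lt;/code&gt; and &lt;code&gt;recall&lt;/code&gt;, that is, 2 * precision * recall / (precision + recall). As an illustrative check, the values from the first row of the output above (recall 1, precision 0.40625) can be substituted by hand. The result is approximately 0.578, in line with the reported &lt;code&gt;f1_score&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Literal values copied from the first output row; f1_check is an illustrative alias
=&amp;gt; SELECT 2 * 0.40625 * 1 / (0.40625 + 1) AS f1_check;
&lt;/code&gt;&lt;/pre&gt;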

      </description>
    </item>
    
    <item>
      <title>Sql-Reference: READ_TREE</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/read-tree/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/read-tree/</guid>
      <description>
        
        
        &lt;p&gt;Reads the contents of trees within the random forest or XGBoost model.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;READ_TREE ( USING PARAMETERS model_name = &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;model-name&lt;/span&gt;&amp;#39; [, tree_id = &lt;span class=&#34;code-variable&#34;&gt;tree-id&lt;/span&gt;] [, format = &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;format&lt;/span&gt;&amp;#39;] )
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;model_name&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Identifies the model that is stored as a result of training, where &lt;em&gt;&lt;code&gt;model-name&lt;/code&gt;&lt;/em&gt; conforms to conventions described in &lt;a href=&#34;../../../../../en/sql-reference/language-elements/identifiers/#&#34;&gt;Identifiers&lt;/a&gt;. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;tree_id&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The tree identifier, an integer between 0 and &lt;em&gt;&lt;code&gt;n&lt;/code&gt;&lt;/em&gt;-1, where &lt;em&gt;&lt;code&gt;n&lt;/code&gt;&lt;/em&gt; is the number of trees in the random forest or XGBoost model. If you omit this parameter, all trees are returned.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;format&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Output format of the returned tree, one of the following:
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;tabular&lt;/code&gt;: Returns a table with twelve output columns.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;graphviz&lt;/code&gt;: Returns DOT language source that can be passed to a Graphviz tool to render a graphical visualization of the tree.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;privileges&#34;&gt;Privileges&lt;/h2&gt;
&lt;p&gt;Non-superusers: USAGE privileges on the model&lt;/p&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;Get tabular output from READ_TREE for a random forest model:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT READ_TREE ( USING PARAMETERS model_name=&amp;#39;myRFModel&amp;#39;, tree_id=1,
                      format=&amp;#39;tabular&amp;#39;) LIMIT 2;
-[ RECORD 1 ]-------------+-------------------
tree_id                   | 1
node_id                   | 1
node_depth                | 0
is_leaf                   | f
is_categorical_split      | f
split_predictor           | petal_length
split_value               | 1.921875
weighted_information_gain | 0.111242236024845
left_child_id             | 2
right_child_id            | 3
prediction                |
probability/variance      |

-[ RECORD 2 ]-------------+-------------------
tree_id                   | 1
node_id                   | 2
node_depth                | 1
is_leaf                   | t
is_categorical_split      |
split_predictor           |
split_value               |
weighted_information_gain |
left_child_id             |
right_child_id            |
prediction                | setosa
probability/variance      | 1
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Get &lt;a href=&#34;http://graphviz.org/&#34;&gt;graphviz&lt;/a&gt;-formatted output from READ_TREE:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT READ_TREE ( USING PARAMETERS model_name=&amp;#39;myRFModel&amp;#39;, tree_id=1,
                      format=&amp;#39;graphviz&amp;#39;) LIMIT 1;

-[ RECORD 1 ]+-------------------------------------------------------------------
---------------------------------------------------------------------------------
tree_id      | 1
tree_digraph | digraph Tree{
1 [label=&amp;#34;petal_length &amp;lt; 1.921875 ?&amp;#34;, color=&amp;#34;blue&amp;#34;];
1 -&amp;gt; 2 [label=&amp;#34;yes&amp;#34;, color=&amp;#34;black&amp;#34;];
1 -&amp;gt; 3 [label=&amp;#34;no&amp;#34;, color=&amp;#34;black&amp;#34;];
2 [label=&amp;#34;prediction: setosa, probability: 1&amp;#34;, color=&amp;#34;red&amp;#34;];
3 [label=&amp;#34;petal_length &amp;lt; 4.871875 ?&amp;#34;, color=&amp;#34;blue&amp;#34;];
3 -&amp;gt; 6 [label=&amp;#34;yes&amp;#34;, color=&amp;#34;black&amp;#34;];
3 -&amp;gt; 7 [label=&amp;#34;no&amp;#34;, color=&amp;#34;black&amp;#34;];
6 [label=&amp;#34;prediction: versicolor, probability: 1&amp;#34;, color=&amp;#34;red&amp;#34;];
7 [label=&amp;#34;prediction: virginica, probability: 1&amp;#34;, color=&amp;#34;red&amp;#34;];
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This renders as follows:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;../../../../../images/machine-learning/xgb-cls-graphviz.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
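&lt;p&gt;Because the tabular format returns one row per node, its output can in principle be filtered like any other result set. The following is a minimal sketch that lists only the leaf nodes of tree 1, using the column names from the tabular output above. It assumes that READ_TREE can be wrapped in a subquery and that &lt;code&gt;is_leaf&lt;/code&gt; is a Boolean column. Neither assumption is stated on this page:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Assumes READ_TREE can be nested in a subquery and that is_leaf is Boolean
=&amp;gt; SELECT node_id, node_depth, prediction
   FROM (SELECT READ_TREE ( USING PARAMETERS model_name=&amp;#39;myRFModel&amp;#39;, tree_id=1,
                            format=&amp;#39;tabular&amp;#39;)) AS tree_nodes
   WHERE is_leaf;
&lt;/code&gt;&lt;/pre&gt;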
&lt;h2 id=&#34;see-also&#34;&gt;See also&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/rf-classifier/#&#34;&gt;RF_CLASSIFIER&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/rf-regressor/#&#34;&gt;RF_REGRESSOR&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/xgb-classifier/#&#34;&gt;XGB_CLASSIFIER&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/xgb-regressor/#&#34;&gt;XGB_REGRESSOR&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Sql-Reference: RF_PREDICTOR_IMPORTANCE</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/rf-predictor-importance/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/rf-predictor-importance/</guid>
      <description>
        
        
        &lt;p&gt;Measures the importance of the predictors in a random forest model using the Mean Decrease Impurity (MDI) approach. The importance vector is normalized to sum to 1.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;RF_PREDICTOR_IMPORTANCE ( USING PARAMETERS model_name = &amp;#39;&lt;span class=&#34;code-variable&#34;&gt;model-name&lt;/span&gt;&amp;#39; [, tree_id = &lt;span class=&#34;code-variable&#34;&gt;tree-id&lt;/span&gt;] )
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;model_name&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Identifies the model that is stored as a result of the training, where &lt;em&gt;&lt;code&gt;model-name&lt;/code&gt;&lt;/em&gt; must be of type &lt;code&gt;rf_classifier&lt;/code&gt; or &lt;code&gt;rf_regressor&lt;/code&gt;.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;tree_id&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Identifies the tree to process, an integer between 0 and &lt;em&gt;&lt;code&gt;n&lt;/code&gt;&lt;/em&gt;-1, where &lt;em&gt;&lt;code&gt;n&lt;/code&gt;&lt;/em&gt; is the number of trees in the forest. If you omit this parameter, the function uses all trees to measure importance values.&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;privileges&#34;&gt;Privileges&lt;/h2&gt;
&lt;p&gt;Non-superusers: USAGE privileges on the model&lt;/p&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;This example shows how you can use the RF_PREDICTOR_IMPORTANCE function.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT RF_PREDICTOR_IMPORTANCE ( USING PARAMETERS model_name = &amp;#39;myRFModel&amp;#39;);
 predictor_index | predictor_name | importance_value
-----------------+----------------+--------------------
               0 | sepal.length   | 0.106763318092655
               1 | sepal.width    | 0.0279536658041994
               2 | petal.length   | 0.499198722346586
               3 | petal.width    | 0.366084293756561
(4 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;see-also&#34;&gt;See also&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/rf-classifier/#&#34;&gt;RF_CLASSIFIER&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/rf-regressor/#&#34;&gt;RF_REGRESSOR&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Sql-Reference: ROC</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/roc/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/roc/</guid>
      <description>
        
        
        &lt;p&gt;Returns a table that displays the points on a receiver operating characteristic curve. The &lt;code&gt;ROC&lt;/code&gt; function tells you the accuracy of a classification model as you raise the discrimination threshold for the model.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;ROC ( &lt;span class=&#34;code-variable&#34;&gt;targets&lt;/span&gt;, &lt;span class=&#34;code-variable&#34;&gt;probabilities&lt;/span&gt;
        [ USING PARAMETERS
              [num_bins = &lt;span class=&#34;code-variable&#34;&gt;num-bins&lt;/span&gt;]
              [, AUC = &lt;span class=&#34;code-variable&#34;&gt;output&lt;/span&gt;]
              [, main_class = &lt;span class=&#34;code-variable&#34;&gt;class-name&lt;/span&gt; ] ] )
OVER()
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;arguments&#34;&gt;Arguments&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;An input column that contains the true values of the response variable, one of the following data types: INTEGER, BOOLEAN, or CHAR/VARCHAR. Depending on the column data type, the function processes column data as follows:
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;INTEGER: Uses the column values directly as the true values of the response variable.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;BOOLEAN: Resolves Yes to 1, No to 0.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;CHAR/VARCHAR: Resolves the value specified by parameter &lt;code&gt;main_class&lt;/code&gt; to 1, all other values to 0.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

If the input column is of data type INTEGER or BOOLEAN, the function ignores parameter &lt;code&gt;main_class&lt;/code&gt;.

&lt;/div&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;probabilities&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;A FLOAT input column that contains the predicted probability that the response belongs to the main class. If &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; is of type INTEGER, the main class is 1.&lt;/dd&gt;
&lt;/dl&gt;

&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;num_bins&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;p&gt;An integer value that determines the number of decision boundaries. Decision boundaries are set at equally spaced intervals between 0 and 1, inclusive. The function computes the table at each of the &lt;em&gt;&lt;code&gt;num-bins&lt;/code&gt;&lt;/em&gt; + 1 points.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default&lt;/strong&gt;: 100&lt;/p&gt;

&lt;p&gt;Greater values result in more precise approximations of the AUC.&lt;/p&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;AUC&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;A Boolean value that specifies whether to output the area under the curve (AUC) value.
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; True&lt;/p&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;main_class&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;p&gt;Used only if &lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt; is of type CHAR/VARCHAR, specifies the class to associate with the &lt;em&gt;&lt;code&gt;probabilities&lt;/code&gt;&lt;/em&gt; argument.&lt;/p&gt;
&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;Execute &lt;code&gt;ROC&lt;/code&gt; on input table &lt;code&gt;mtcars&lt;/code&gt;. Observed class labels are in column &lt;code&gt;obs&lt;/code&gt;, and predicted probabilities are in column &lt;code&gt;prob&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ROC(obs::int, prob::float USING PARAMETERS num_bins=5, AUC = True) OVER()
    FROM (SELECT am AS obs,
          PREDICT_LOGISTIC_REG (mpg, cyl, disp, drat, wt, qsec, vs, gear, carb
               USING PARAMETERS
                  model_name=&amp;#39;myLogisticRegModel&amp;#39;, type=&amp;#39;probability&amp;#39;) AS prob
   FROM mtcars) AS prediction_output;
 decision_boundary | false_positive_rate | true_positive_rate | AUC |comment
-------------------+---------------------+--------------------+-----+-----------------------------------
0                  |                   1 |                  1 |     |
0.5                |                   0 |                  1 |     |
1                  |                   0 |                  0 |   1 | Of 32 rows, 32 were used and 0 were ignored
(3 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The function returns a table with the following results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;decision_boundary&lt;/code&gt; indicates the cut-off point for whether to classify a response as 0 or 1. In each row, if &lt;code&gt;prob&lt;/code&gt; is equal to or greater than &lt;code&gt;decision_boundary&lt;/code&gt;, the response is classified as 1. If &lt;code&gt;prob&lt;/code&gt; is less than &lt;code&gt;decision_boundary&lt;/code&gt;, the response is classified as 0.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;false_positive_rate&lt;/code&gt; shows the fraction of class 0 rows that were incorrectly classified as 1 (false positives) at the corresponding &lt;code&gt;decision_boundary&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;true_positive_rate&lt;/code&gt; shows the fraction of class 1 rows that were correctly classified as 1 at the corresponding &lt;code&gt;decision_boundary&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
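&lt;p&gt;The two rates at a single decision boundary can be sketched in plain SQL. This is an illustration under the usual definitions of these rates, not the function&#39;s implementation. It assumes a hypothetical table &lt;code&gt;roc_input&lt;/code&gt; with an INTEGER column &lt;code&gt;obs&lt;/code&gt; (0 or 1) and a FLOAT column &lt;code&gt;prob&lt;/code&gt;, and it uses a boundary of 0.5. In the result, the row for &lt;code&gt;obs&lt;/code&gt; = 0 corresponds to the false positive rate and the row for &lt;code&gt;obs&lt;/code&gt; = 1 to the true positive rate:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- roc_input is a hypothetical table; 0.5 is the decision boundary being checked
=&amp;gt; SELECT obs,
          AVG(CASE WHEN prob &amp;gt;= 0.5 THEN 1.0 ELSE 0.0 END) AS classified_as_1
   FROM roc_input
   GROUP BY obs
   ORDER BY obs;
&lt;/code&gt;&lt;/pre&gt;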

      </description>
    </item>
    
    <item>
      <title>Sql-Reference: RSQUARED</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/rsquared/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/rsquared/</guid>
      <description>
        
        
        &lt;p&gt;Returns a table with the R-squared value of the predictions in a regression model.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;RSQUARED ( &lt;span class=&#34;code-variable&#34;&gt;targets&lt;/span&gt;, &lt;span class=&#34;code-variable&#34;&gt;predictions&lt;/span&gt; ) OVER()
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;admonition important&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Important&lt;/h4&gt;
The &lt;code&gt;OVER()&lt;/code&gt; clause must be empty.
&lt;/div&gt;
&lt;h2 id=&#34;arguments&#34;&gt;Arguments&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;targets&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;A FLOAT response variable for the model.&lt;/dd&gt;
&lt;dt&gt;&lt;em&gt;&lt;code&gt;predictions&lt;/code&gt;&lt;/em&gt;&lt;/dt&gt;
&lt;dd&gt;A FLOAT input column that contains the predicted values for the response variable.&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;This example shows how to execute the &lt;code&gt;RSQUARED&lt;/code&gt; function on an input table named &lt;code&gt;faithful_testing&lt;/code&gt;. The observed values of the response variable appear in the column &lt;code&gt;obs&lt;/code&gt;, while the predicted values of the response variable appear in the column &lt;code&gt;prediction&lt;/code&gt;.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT RSQUARED(obs, prediction) OVER()
     FROM (SELECT eruptions AS obs,
                  PREDICT_LINEAR_REG (waiting
                                       USING PARAMETERS model_name=&amp;#39;myLinearRegModel&amp;#39;) AS prediction
           FROM faithful_testing) AS prediction_output;
        rsq        |                    comment
-------------------+-----------------------------------------------
 0.801392981147911 | Of 110 rows, 110 were used and 0 were ignored
(1 row)
&lt;/code&gt;&lt;/pre&gt;
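&lt;p&gt;As a cross-check, R-squared can be computed by hand from the same subquery. The following is a minimal sketch, assuming that the function follows the conventional definition (1 minus the ratio of the residual sum of squares to the total sum of squares) and that no rows are ignored. The aliases &lt;code&gt;mean_obs&lt;/code&gt;, &lt;code&gt;with_mean&lt;/code&gt;, and &lt;code&gt;rsq_check&lt;/code&gt; are illustrative:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Assumes the conventional definition: 1 - SS_residual / SS_total
=&amp;gt; SELECT 1 - SUM(POWER(obs - prediction, 2)) / SUM(POWER(obs - mean_obs, 2)) AS rsq_check
   FROM (SELECT obs, prediction, AVG(obs) OVER() AS mean_obs
         FROM (SELECT eruptions AS obs,
                      PREDICT_LINEAR_REG (waiting
                                           USING PARAMETERS model_name=&amp;#39;myLinearRegModel&amp;#39;) AS prediction
               FROM faithful_testing) AS prediction_output) AS with_mean;
&lt;/code&gt;&lt;/pre&gt;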
      </description>
    </item>
    
    <item>
      <title>Sql-Reference: XGB_PREDICTOR_IMPORTANCE</title>
      <link>/en/sql-reference/functions/ml-functions/model-evaluation/xgb-predictor-importance/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/ml-functions/model-evaluation/xgb-predictor-importance/</guid>
      <description>
        
        
        &lt;p&gt;Measures the importance of the predictors in an XGBoost model. The function outputs three measures of importance for each predictor:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;frequency&lt;/code&gt;: relative number of times the model uses a predictor to split the data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;total_gain&lt;/code&gt;: relative contribution of a predictor to the model based on the total &lt;a href=&#34;https://en.wikipedia.org/wiki/Information_gain_(decision_tree)&#34;&gt;information gain&lt;/a&gt; across a predictor&#39;s splits. A higher value means more predictive importance.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;avg_gain&lt;/code&gt;: relative contribution of a predictor to the model based on the average information gain across a predictor&#39;s splits.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The sum of each importance measure is normalized to one across all predictors.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;XGB_PREDICTOR_IMPORTANCE ( USING PARAMETERS &lt;span class=&#34;code-variable&#34;&gt;param&lt;/span&gt;=&lt;span class=&#34;code-variable&#34;&gt;value&lt;/span&gt;[,...] )
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;model_name&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Name of the model, which must be of type &lt;code&gt;xgb_classifier&lt;/code&gt; or &lt;code&gt;xgb_regressor&lt;/code&gt;.&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;tree_id&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Integer in the range [0, &lt;em&gt;&lt;code&gt;n&lt;/code&gt;&lt;/em&gt;-1], where &lt;em&gt;&lt;code&gt;n&lt;/code&gt;&lt;/em&gt; is the number of trees in &lt;code&gt;model_name&lt;/code&gt;, that specifies the tree to process. If you omit this parameter, the function uses all trees in the model to measure predictor importance values.&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;privileges&#34;&gt;Privileges&lt;/h2&gt;
&lt;p&gt;Non-superusers: USAGE privileges on the model&lt;/p&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;The following example measures the importance of the predictors in the model &#39;xgb_iris&#39;, an XGBoost classifier model, across all trees:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT XGB_PREDICTOR_IMPORTANCE( USING PARAMETERS model_name = &amp;#39;xgb_iris&amp;#39; );
 predictor_index | predictor_name |     frequency     |     total_gain     |      avg_gain
-----------------+----------------+-------------------+--------------------+--------------------
               0 | sepal_length   |  0.15384615957737 |    0.0183021749937 | 0.0370849960701401
               1 | sepal_width    | 0.215384617447853 | 0.0154729501420881 | 0.0223944615251752
               2 | petal_length   | 0.369230777025223 |  0.607349886817728 |  0.512770753876444
               3 | petal_width    | 0.261538475751877 |  0.358874988046484 |  0.427749788528241
(4 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;To sort the predictors by importance, you can use a nested query with an ORDER BY clause. The following query sorts the model predictors by descending &lt;code&gt;avg_gain&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT * FROM (SELECT XGB_PREDICTOR_IMPORTANCE( USING PARAMETERS model_name = &amp;#39;xgb_iris&amp;#39; )) AS importances ORDER BY avg_gain DESC;
 predictor_index | predictor_name |     frequency     |     total_gain     |      avg_gain
-----------------+----------------+-------------------+--------------------+--------------------
               2 | petal_length   | 0.369230777025223 |  0.607349886817728 |  0.512770753876444
               3 | petal_width    | 0.261538475751877 |  0.358874988046484 |  0.427749788528241
               0 | sepal_length   |  0.15384615957737 |    0.0183021749937 | 0.0370849960701401
               1 | sepal_width    | 0.215384617447853 | 0.0154729501420881 | 0.0223944615251752
(4 rows)
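
-- Illustrative sketch, not output from a real session: restrict the measurement to one tree
-- with the tree_id parameter. The value 0 assumes the model has at least one tree.
=&amp;gt; SELECT XGB_PREDICTOR_IMPORTANCE( USING PARAMETERS model_name = &amp;#39;xgb_iris&amp;#39;, tree_id = 0 );

-- Illustrative sketch: check that each measure sums to approximately one across predictors,
-- assuming aggregates can be applied to the nested query in the same way as ORDER BY above.
=&amp;gt; SELECT SUM(frequency), SUM(total_gain), SUM(avg_gain) FROM (SELECT XGB_PREDICTOR_IMPORTANCE( USING PARAMETERS model_name = &amp;#39;xgb_iris&amp;#39; )) AS importances;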
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;see-also&#34;&gt;See also&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/xgb-classifier/#&#34;&gt;XGB_CLASSIFIER&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/xgb-regressor/#&#34;&gt;XGB_REGRESSOR&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
  </channel>
</rss>
