<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OpenText Analytics Database 26.2.x – Naive Bayes</title>
    <link>/en/data-analysis/ml-predictive-analytics/classification-algorithms/naive-bayes/</link>
    <description>Recent content in Naive Bayes on OpenText Analytics Database 26.2.x</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/data-analysis/ml-predictive-analytics/classification-algorithms/naive-bayes/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Data-Analysis: Classifying data using Naive Bayes</title>
      <link>/en/data-analysis/ml-predictive-analytics/classification-algorithms/naive-bayes/classifying-data-using-naive-bayes/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-analysis/ml-predictive-analytics/classification-algorithms/naive-bayes/classifying-data-using-naive-bayes/</guid>
      <description>
        
        
        &lt;p&gt;This Naive Bayes example uses the HouseVotes84 data set to show you how to build a model. With this model, you can predict which party a member of the United States Congress is affiliated with, based on their voting record. To aid classification, the data has been cleaned: each missed vote has been replaced with the majority vote of the member&#39;s party. For example, suppose a Democrat has a missing value for vote1 and the majority of Democrats voted in favor. The cleaned data then replaces every missing Democratic vote for vote1 with a vote in favor.&lt;/p&gt;
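&lt;p&gt;The majority-vote replacement described above can be sketched for a single column as follows. This is a minimal illustration, not the preparation script itself: the table name &lt;code&gt;house84_raw&lt;/code&gt; is hypothetical, and the sketch assumes that missed votes are stored as NULL. If a party has a tie for the most frequent vote, this sketch picks one of the tied values arbitrarily.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Sketch only: house84_raw is a hypothetical table in which missed votes are NULL.
=&amp;gt; WITH vote_counts AS (
       -- Count how often each party cast each recorded vote1 value.
       SELECT party, vote1, COUNT(*) AS num_votes
         FROM house84_raw
        WHERE vote1 IS NOT NULL
        GROUP BY party, vote1
   ),
   ranked AS (
       -- Rank the vote1 values within each party, most frequent first.
       SELECT party, vote1,
              ROW_NUMBER() OVER (PARTITION BY party ORDER BY num_votes DESC) AS vote_rank
         FROM vote_counts
   )
   -- Keep recorded votes and fill each missed vote with the party majority.
   SELECT h.id, h.party, COALESCE(h.vote1, r.vote1) AS vote1
     FROM house84_raw h
     JOIN ranked r ON h.party = r.party AND r.vote_rank = 1;
&lt;/code&gt;&lt;/pre&gt;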
&lt;p&gt;In this example, approximately 75% of the cleaned HouseVotes84 data is randomly selected and copied to a training table, &lt;code&gt;house84_train&lt;/code&gt;. The remaining cleaned HouseVotes84 data is used as a testing table, &lt;code&gt;house84_test&lt;/code&gt;.&lt;/p&gt;
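&lt;p&gt;A random split of roughly 75% to 25% can be sketched as shown below. The preparation script used in this example already creates the training and testing tables, so this is only an illustration under assumptions: the source table &lt;code&gt;house84_clean&lt;/code&gt; and the three &lt;code&gt;_sketch&lt;/code&gt; table names are hypothetical. The random value is stored once per row so that the two subsets cannot overlap.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Sketch only: house84_clean and the *_sketch tables are hypothetical names.
-- Store one random value in [0, 1) per row so that both subsets use the same draw.
=&amp;gt; CREATE TABLE house84_split_sketch AS
     SELECT *, RANDOM() AS split_key FROM house84_clean;

-- Rows with a value below 0.75 (about 75% of the data) form the training set.
=&amp;gt; CREATE TABLE house84_train_sketch AS
     SELECT * FROM house84_split_sketch WHERE split_key &amp;lt; 0.75;

-- The remaining rows form the testing set.
=&amp;gt; CREATE TABLE house84_test_sketch AS
     SELECT * FROM house84_split_sketch WHERE split_key &amp;gt;= 0.75;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you train a model on tables built this way, add &lt;code&gt;split_key&lt;/code&gt; to &lt;code&gt;exclude_columns&lt;/code&gt; so that the helper column is not treated as a predictor.&lt;/p&gt;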
&lt;p&gt;Before you begin the example, &lt;a href=&#34;../../../../../en/data-analysis/ml-predictive-analytics/download-ml-example-data/&#34;&gt;load the Machine Learning sample data&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You must also run the &lt;code&gt;naive_bayes_data_preparation.sql&lt;/code&gt; script:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ /opt/vertica/bin/vsql -d &amp;lt;name of your database&amp;gt; -f naive_bayes_data_preparation.sql
&lt;/code&gt;&lt;/pre&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Create the Naive Bayes model, named &lt;code&gt;naive_house84_model&lt;/code&gt;, using the &lt;code&gt;house84_train&lt;/code&gt; training data.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT NAIVE_BAYES(&amp;#39;naive_house84_model&amp;#39;, &amp;#39;house84_train&amp;#39;, &amp;#39;party&amp;#39;,
                      &amp;#39;*&amp;#39; USING PARAMETERS exclude_columns=&amp;#39;party, id&amp;#39;);
                  NAIVE_BAYES
------------------------------------------------
 Finished. Accepted Rows: 315  Rejected Rows: 0
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create a new table, named &lt;code&gt;predicted_party_naive&lt;/code&gt;. Populate this table with the prediction outputs you obtain from the PREDICT_NAIVE_BAYES function on your test data.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE TABLE predicted_party_naive
     AS SELECT party,
          PREDICT_NAIVE_BAYES (vote1, vote2, vote3, vote4, vote5,
                               vote6, vote7, vote8, vote9, vote10,
                               vote11, vote12, vote13, vote14,
                               vote15, vote16
                                 USING PARAMETERS model_name = &amp;#39;naive_house84_model&amp;#39;,
                                                  type = &amp;#39;response&amp;#39;) AS Predicted_Party
       FROM house84_test;
CREATE TABLE
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Calculate the accuracy of the model&#39;s predictions.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;
=&amp;gt; SELECT  (Predictions.Num_Correct_Predictions / Count.Total_Count) AS Percent_Accuracy
    FROM (  SELECT COUNT(Predicted_Party) AS Num_Correct_Predictions
        FROM predicted_party_naive
        WHERE party = Predicted_Party
         ) AS Predictions,
         (  SELECT COUNT(party) AS Total_Count
               FROM predicted_party_naive
            ) AS Count;
   Percent_Accuracy
----------------------
 0.933333333333333333
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Based on their voting patterns, the model correctly predicted the party of approximately 93% of the members of Congress in the test table.&lt;/p&gt;
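&lt;p&gt;A single accuracy figure does not show which party the model misclassifies more often. As a minimal sketch, the following query counts each combination of actual and predicted party in the &lt;code&gt;predicted_party_naive&lt;/code&gt; table created in step 2. The &lt;code&gt;num_members&lt;/code&gt; alias is introduced here for illustration only.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Sketch only: counts the rows for each (actual, predicted) pair.
-- Rows where party and Predicted_Party match are correct predictions.
=&amp;gt; SELECT party, Predicted_Party, COUNT(*) AS num_members
     FROM predicted_party_naive
    GROUP BY party, Predicted_Party
    ORDER BY party, Predicted_Party;
&lt;/code&gt;&lt;/pre&gt;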
&lt;h2 id=&#34;viewing-the-probability-of-each-class&#34;&gt;Viewing the probability of each class&lt;/h2&gt;
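&lt;p&gt;To view the probability of each class for every row, use PREDICT_NAIVE_BAYES_CLASSES. The output lists the predicted class, the probability of that class, and the probability of each class named in the &lt;code&gt;classes&lt;/code&gt; parameter.&lt;/p&gt;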
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT PREDICT_NAIVE_BAYES_CLASSES (id, vote1, vote2, vote3, vote4, vote5,
                                       vote6, vote7, vote8, vote9, vote10,
                                       vote11, vote12, vote13, vote14,
                                       vote15, vote16
                                       USING PARAMETERS model_name = &amp;#39;naive_house84_model&amp;#39;,
                                                        key_columns = &amp;#39;id&amp;#39;, exclude_columns = &amp;#39;id&amp;#39;,
                                                        classes = &amp;#39;democrat, republican&amp;#39;)
        OVER() FROM house84_test;
 id  | Predicted  |    Probability    |       democrat       |      republican
-----+------------+-------------------+----------------------+----------------------
 368 | democrat   |                 1 |                    1 |                    0
 372 | democrat   |                 1 |                    1 |                    0
 374 | democrat   |                 1 |                    1 |                    0
 378 | republican | 0.999999962214987 | 3.77850125111219e-08 |    0.999999962214987
 384 | democrat   |                 1 |                    1 |                    0
 387 | democrat   |                 1 |                    1 |                    0
 406 | republican | 0.999999945980143 | 5.40198564592332e-08 |    0.999999945980143
 419 | democrat   |                 1 |                    1 |                    0
 421 | republican | 0.922808855631005 |   0.0771911443689949 |    0.922808855631005
.
.
.
(109 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;see-also&#34;&gt;See also&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/ml-algorithms/naive-bayes/#&#34;&gt;NAIVE_BAYES&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/transformation-functions/predict-naive-bayes/#&#34;&gt;PREDICT_NAIVE_BAYES&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../../../en/sql-reference/functions/ml-functions/transformation-functions/predict-naive-bayes-classes/#&#34;&gt;PREDICT_NAIVE_BAYES_CLASSES&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
  </channel>
</rss>
