<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Vertica Documentation – Handling messy data</title>
    <link>/en/data-load/handling-messy-data/</link>
    <description>Recent content in Handling messy data on Vertica Documentation</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/data-load/handling-messy-data/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Data-Load: Saving load rejections (REJECTED DATA)</title>
      <link>/en/data-load/handling-messy-data/saving-load-rejections-rejected-data/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-load/handling-messy-data/saving-load-rejections-rejected-data/</guid>
      <description>
        
        
        &lt;p&gt;&lt;code&gt;COPY&lt;/code&gt; load rejections are data rows that did not load due to a parser exception or, optionally, transformation error. By default, if you do not specify a rejected data file, &lt;code&gt;COPY&lt;/code&gt; saves rejected data files to this location:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;&lt;span class=&#34;code-variable&#34;&gt;catalog_dir&lt;/span&gt;/CopyErrorLogs/&lt;span class=&#34;code-variable&#34;&gt;target_table&lt;/span&gt;-&lt;span class=&#34;code-variable&#34;&gt;source&lt;/span&gt;-copy-from-rejected-data.&lt;span class=&#34;code-variable&#34;&gt;n&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;table table-bordered&#34; &gt;



&lt;tr&gt; 

&lt;td &gt;
&lt;em&gt;catalog_dir&lt;/em&gt;/&lt;/td&gt; 

&lt;td &gt;




&lt;p&gt;The database catalog files directory, for example:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;/home/dbadmin/VMart/v_vmart_node0001_catalog&lt;/code&gt;&lt;/p&gt;
&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;
&lt;em&gt;target_table&lt;/em&gt;&lt;/td&gt; 

&lt;td &gt;


The table into which data was loaded (&lt;em&gt;target_table&lt;/em&gt;).&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;
&lt;em&gt;source&lt;/em&gt;&lt;/td&gt; 

&lt;td &gt;


The source of the load data, which can be STDIN, or a file name, such as &lt;code&gt;baseball.csv&lt;/code&gt;. &lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;
&lt;code&gt;copy-from-rejected-data.&lt;/code&gt;&lt;em&gt;n&lt;/em&gt;&lt;/td&gt; 

&lt;td &gt;




&lt;p&gt;The default name for a rejected data file, followed by an &lt;em&gt;n&lt;/em&gt; suffix that numbers the file, such as &lt;code&gt;.1&lt;/code&gt;, &lt;code&gt;.2&lt;/code&gt;, or &lt;code&gt;.3&lt;/code&gt;. For example, this default file name indicates file 3 after loading from STDIN:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;fw-STDIN-copy-from-rejected-data.3&lt;/code&gt;.&lt;/p&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;Saving rejected data to the default location, or to a location of your choice, lets you review the file contents, resolve problems, and reload the data from the rejected data files. Saving rejected data to a table lets you query the table to see the rejected rows and the reasons (exceptions) why they could not be parsed. Vertica recommends saving rejected data to a table.&lt;/p&gt;
&lt;h2 id=&#34;multiple-rejected-data-files&#34;&gt;Multiple rejected data files&lt;/h2&gt;
&lt;p&gt;Unless a load is very small (&amp;lt; 10MB), COPY creates more than one file to hold rejected rows. Several factors determine how many files COPY creates for rejected data. Here are some of the factors:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Number of sources being loaded&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Total number of rejected rows&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Size of the source file (or files)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Cooperative parsing and number of threads being used&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;UDLs that support apportioned loads&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;For your own COPY parser, the number of objects returned from &lt;code&gt;prepareUDSources()&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;naming-conventions-for-rejected-files&#34;&gt;Naming conventions for rejected files&lt;/h2&gt;
&lt;p&gt;You can specify one or more rejected data files with the files you are loading. Use the REJECTED DATA parameter to specify a file location and name, and separate consecutive rejected data file names with a comma (,). Do not use the &lt;code&gt;ON ANY NODE&lt;/code&gt; option because it is applicable only to load files.&lt;/p&gt;
&lt;p&gt;If you specify one or more files, and COPY requires multiple files for rejected data, COPY uses the rejected data file names you supply as a prefix, and appends a numeric suffix to each rejected data file. For example, if you specify the name &lt;code&gt;my_rejects&lt;/code&gt; for the &lt;code&gt;REJECTED DATA&lt;/code&gt; parameter, and the file you are loading is large enough (&amp;gt; 10MB), COPY creates several files such as the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;my_rejects-1&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;my_rejects-2&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;my_rejects-3&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
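&lt;p&gt;As a minimal sketch of this syntax, the following statement names a single rejected data file. The table &lt;code&gt;my_table&lt;/code&gt;, the input path, and the &lt;code&gt;/home/dbadmin/my_rejects&lt;/code&gt; location are hypothetical; substitute names and paths that exist in your environment:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; COPY my_table FROM &amp;#39;/home/dbadmin/data/my_data.dat&amp;#39; DELIMITER &amp;#39;|&amp;#39;
   REJECTED DATA &amp;#39;/home/dbadmin/my_rejects&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the load needs more than one rejected data file, &lt;code&gt;my_rejects&lt;/code&gt; serves as the prefix of each file name, as described above.&lt;/p&gt;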
&lt;p&gt;COPY uses cooperative parsing by default, having the nodes parse a specific part of the file contents. Depending on the file or portion size, each thread generates at least one rejected data file per source file or portion, and returns load results to the initiator node. The file suffix is a thread index when COPY uses multiple threads (.1, .2, .3, and so on).&lt;/p&gt;
&lt;p&gt;The maximum number of rejected data files cannot be greater than the number of sources being loaded, per thread to parse any portion. The resource pool determines the maximum number of threads. For cooperative parse, use all available threads.&lt;/p&gt;
&lt;p&gt;If you use COPY with a UDL that supports apportioned load, the file suffix is an offset value. UDLs that support apportioned loading render cooperative parsing unnecessary. For apportioned loads, COPY creates at least one rejected file per data portion, and more files depending on the size of the load and number of rejected rows.&lt;/p&gt;
&lt;p&gt;For all data loads except &lt;code&gt;COPY LOCAL&lt;/code&gt;, &lt;code&gt;COPY&lt;/code&gt; behaves as follows:

&lt;table class=&#34;table table-bordered&#34; &gt;



&lt;tr&gt; 

&lt;th &gt;
No rejected data file specified...&lt;/th&gt; 

&lt;th &gt;
Rejected data file specified...&lt;/th&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


For a single data file (&lt;strong&gt;&lt;code&gt;pathToData&lt;/code&gt;&lt;/strong&gt; or &lt;code&gt;STDIN&lt;/code&gt;), COPY stores one or more rejected data files in the default location.&lt;/td&gt; 

&lt;td &gt;


For one data file, &lt;code&gt;COPY&lt;/code&gt; interprets the rejected data path as a file, and stores all rejected data at the location. If parallel processing requires more than one file, COPY appends a numeric suffix. If the path is not a file, &lt;code&gt;COPY&lt;/code&gt; returns an error.&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


For multiple source files, COPY stores all rejected data in separate files in the default directory, using the source file as a file name prefix, as noted.&lt;/td&gt; 

&lt;td &gt;




&lt;p&gt;For multiple source files, &lt;code&gt;COPY&lt;/code&gt; interprets the rejected data path as a directory. &lt;code&gt;COPY&lt;/code&gt; stores all information in separate files, one for each source. If the path is not a directory, &lt;code&gt;COPY&lt;/code&gt; returns an error.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;COPY&lt;/code&gt; accepts only one path per node. For example, if you specify the rejected data path as my_rejected_data, &lt;code&gt;COPY&lt;/code&gt; creates a directory of that name on each node. If you provide more than one rejected data path, &lt;code&gt;COPY&lt;/code&gt; returns an error.&lt;/p&gt;
&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;
Rejected data files are returned to the initiator node.&lt;/td&gt; 

&lt;td &gt;


Rejected data files are not shipped to the initiator node.&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/p&gt;
&lt;h2 id=&#34;maximum-length-of-file-names&#34;&gt;Maximum length of file names&lt;/h2&gt;
&lt;p&gt;Loading multiple input files in one statement requires specifying full path names for each file. Keep in mind that long input file names, combined with rejected data file names, can exceed the operating system&#39;s maximum length (typically 255 characters). To work around file names that exceed the maximum length, use a path for the rejected data file that differs from the default path; for example, &lt;code&gt;/tmp/&amp;lt;shorter-file-name&amp;gt;&lt;/code&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Data-Load: Saving rejected data to a table</title>
      <link>/en/data-load/handling-messy-data/saving-rejected-data-to-table/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-load/handling-messy-data/saving-rejected-data-to-table/</guid>
      <description>
        
        
        &lt;p&gt;Use the &lt;code&gt;REJECTED DATA&lt;/code&gt; parameter with the &lt;code&gt;AS TABLE&lt;/code&gt; clause to specify a table in which to save rejected data. Saving rejected data to a file is mutually exclusive with using the &lt;code&gt;AS TABLE&lt;/code&gt; clause.&lt;/p&gt;
&lt;p&gt;When you use the &lt;code&gt;AS TABLE&lt;/code&gt; clause, Vertica creates a new table if one does not exist, or appends to an existing table. If no parsing rejections occur during a load, the table exists but is empty. The next time you load data, Vertica inserts any rejected rows into the existing table.&lt;/p&gt;
&lt;p&gt;The load rejection tables are a special type of table with the following capabilities and limitations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Support &lt;code&gt;SELECT&lt;/code&gt; statements&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Can use &lt;code&gt;DROP TABLE&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Cannot be created outside of a &lt;code&gt;COPY&lt;/code&gt; statement&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Do not support DML and DDL activities&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Are not &lt;a class=&#34;glosslink&#34; href=&#34;../../../en/glossary/k-safety/&#34; title=&#34;For more information, see Designing for K-Safety.&#34;&gt;K-safe&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To make the data in a rejected table K-safe, you can do one of the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Write a &lt;code&gt;CREATE TABLE..AS&lt;/code&gt; statement, such as this example:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE TABLE new_table AS SELECT * FROM rejected_table;
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create a table to store rejected records, and run &lt;code&gt;INSERT..SELECT&lt;/code&gt; operations into the new table&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
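&lt;p&gt;The second option might look like the following sketch. The table name &lt;code&gt;saved_rejects&lt;/code&gt;, the choice of columns, and the column lengths are illustrative assumptions, and &lt;code&gt;loader_rejects&lt;/code&gt; stands for an existing rejected data table. Size the columns to fit your own data, because values longer than the declared lengths can cause the insert to fail:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE TABLE saved_rejects (
       file_name       VARCHAR(1024),
       rejected_data   LONG VARCHAR(1000000),
       rejected_reason VARCHAR(4000)
   );
=&amp;gt; INSERT INTO saved_rejects
       SELECT file_name, rejected_data, rejected_reason FROM loader_rejects;
=&amp;gt; COMMIT;
&lt;/code&gt;&lt;/pre&gt;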
&lt;h2 id=&#34;using-copy-no-commit&#34;&gt;Using COPY NO COMMIT&lt;/h2&gt;
&lt;p&gt;If the &lt;code&gt;COPY&lt;/code&gt; statement includes options &lt;code&gt;NO COMMIT&lt;/code&gt; and &lt;code&gt;REJECTED DATA AS TABLE&lt;/code&gt;, and the &lt;em&gt;&lt;code&gt;reject-table&lt;/code&gt;&lt;/em&gt; does not already exist, Vertica Analytic Database saves the rejected data table as a LOCAL TEMP table and returns a message that a LOCAL TEMP table is being created.&lt;/p&gt;
&lt;p&gt;Rejected-data tables are useful for Extract-Load-Transform workflows, where you will likely use temporary tables more frequently. The rejected-data tables let you quickly load data and identify which records failed to load. If you load data into a temporary table that you created using the &lt;code&gt;ON COMMIT DELETE&lt;/code&gt; clause, the &lt;code&gt;COPY&lt;/code&gt; operation will not commit.&lt;/p&gt;
&lt;h2 id=&#34;location-of-rejected-data-table-records&#34;&gt;Location of rejected data table records&lt;/h2&gt;
&lt;p&gt;When you save rejected records to a table, using the &lt;code&gt;REJECTED DATA AS TABLE &lt;/code&gt;&lt;em&gt;&lt;code&gt;table_name&lt;/code&gt;&lt;/em&gt; option, the data for the table is saved in a database data subdirectory, &lt;code&gt;RejectionTableData&lt;/code&gt;. For example, for a &lt;code&gt;VMart&lt;/code&gt; database, table data files reside here:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;/home/dbadmin/VMart/v_vmart_node0001_data/RejectionTableData
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Rejected data tables include both rejected data and the reason for the rejection (exceptions), along with other data columns, described next. Vertica suggests that you periodically drop any rejected data tables that you no longer require.&lt;/p&gt;
&lt;h2 id=&#34;querying-a-rejected-data-table&#34;&gt;Querying a rejected data table&lt;/h2&gt;
&lt;p&gt;When you specify a rejected data table when loading data with &lt;code&gt;COPY&lt;/code&gt;, you can query that table for information about rejected data after the load operation is complete. For example:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Create the &lt;code&gt;loader&lt;/code&gt; table:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE TABLE loader(x INT);
CREATE TABLE
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use &lt;code&gt;COPY&lt;/code&gt; to load values, saving rejected data to a table, &lt;code&gt;loader_rejects&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; COPY loader FROM STDIN REJECTED DATA AS TABLE loader_rejects;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
&amp;gt;&amp;gt; 1
&amp;gt;&amp;gt; 2
&amp;gt;&amp;gt; 3
&amp;gt;&amp;gt; a
&amp;gt;&amp;gt; \.
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Query the &lt;code&gt;loader&lt;/code&gt; table after loading data:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT * FROM loader;
 x
---
 1
 2
 3
(3 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Query the &lt;code&gt;loader_rejects&lt;/code&gt; table to see the rejected row and its columns:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT * FROM loader_rejects;
-[ RECORD 1 ]-------------+--------------------------------------------
node_name                 | v_vmart_node0001
file_name                 | STDIN
session_id                | v_vmart_node0001.example.-24016:0x3439
transaction_id            | 45035996274080923
statement_id              | 1
batch_number              | 0
row_number                | 4
rejected_data             | a
rejected_data_orig_length | 1
rejected_reason           | Invalid integer format &amp;#39;a&amp;#39; for column 1 (x)
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The rejected data table has the following columns:

&lt;table class=&#34;table table-bordered&#34; &gt;



&lt;tr&gt; 

&lt;th &gt;
Column&lt;/th&gt; 

&lt;th &gt;
Data Type&lt;/th&gt; 

&lt;th &gt;
Description&lt;/th&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


&lt;code&gt;node_name&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;VARCHAR&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
The name of the Vertica node on which the input load file was located.&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


&lt;code&gt;file_name&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;VARCHAR&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
The name of the file being loaded, which applies if you loaded a file (as opposed to using STDIN).&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


&lt;code&gt;session_id&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;VARCHAR&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
The session ID number in which the COPY statement occurred.&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


&lt;code&gt;transaction_id&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;INTEGER&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
Identifier for the transaction within the session, if any; otherwise NULL.&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;



&lt;code&gt;statement_id&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;INTEGER&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;









&lt;p&gt;The unique identification number of the statement within the transaction that included the rejected data.&lt;/p&gt;
&lt;div class=&#34;alert admonition tip&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Tip&lt;/h4&gt;
&lt;p&gt;You can use the &lt;code&gt;session_id&lt;/code&gt;, &lt;code&gt;transaction_id&lt;/code&gt;, and &lt;code&gt;statement_id&lt;/code&gt; columns to create joins with many system tables. For example, if you join against the &lt;code&gt;QUERY_REQUESTS&lt;/code&gt; table using those three columns, the &lt;code&gt;QUERY_REQUESTS.REQUEST&lt;/code&gt; column contains the actual &lt;code&gt;COPY&lt;/code&gt; statement (as a string) used to load this data.&lt;/p&gt;
&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


&lt;code&gt;batch_number&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;INTEGER&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;


INTERNAL USE. Represents which batch (chunk) the data comes from.&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


&lt;code&gt;row_number&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;INTEGER&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;




&lt;p&gt;The rejected row number from the input file, or -1 if it could not be determined. The value can be -1 when using cooperative parse.&lt;/p&gt;
&lt;p&gt;Each parse operation resets the row number, so in an apportioned load, there can be several entries with the same row number but different rows.&lt;/p&gt;
&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


&lt;code&gt;rejected_data&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;LONG VARCHAR&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
The data that was not loaded.&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


&lt;code&gt;rejected_data_orig_length&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;INTEGER&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
The length of the rejected data.&lt;/td&gt;&lt;/tr&gt;

&lt;tr&gt; 

&lt;td &gt;


&lt;code&gt;rejected_reason&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
&lt;code&gt;VARCHAR&lt;/code&gt;&lt;/td&gt; 

&lt;td &gt;
The error that caused the rejected row. This column returns the same message that exists in a load exceptions file when you do not save to a table.&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/p&gt;
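&lt;p&gt;The join described in the tip for &lt;code&gt;statement_id&lt;/code&gt; can be sketched as follows. This assumes the &lt;code&gt;loader_rejects&lt;/code&gt; table from the example above and that the matching rows are still retained in &lt;code&gt;QUERY_REQUESTS&lt;/code&gt;; the table aliases are arbitrary:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT qr.request, r.rejected_data, r.rejected_reason
   FROM loader_rejects r
   JOIN query_requests qr
     ON  qr.session_id     = r.session_id
     AND qr.transaction_id = r.transaction_id
     AND qr.statement_id   = r.statement_id;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Each result row pairs a rejected value and its reason with the text of the &lt;code&gt;COPY&lt;/code&gt; statement that rejected it.&lt;/p&gt;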
&lt;h2 id=&#34;exporting-the-rejected-records-table&#34;&gt;Exporting the rejected records table&lt;/h2&gt;
&lt;p&gt;You can export the contents of the column &lt;code&gt;rejected_data&lt;/code&gt; to a file to capture only the data rejected during the first COPY statement. Then, correct the data in the file, save it, and load the updated file.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;To export rejected records:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Create a sample table:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE TABLE t (i int);
CREATE TABLE
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Copy data directly into the table, using a table to store rejected data:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; COPY t FROM STDIN REJECTED DATA AS TABLE t_rejects;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
&amp;gt;&amp;gt; 1
&amp;gt;&amp;gt; 2
&amp;gt;&amp;gt; 3
&amp;gt;&amp;gt; 4
&amp;gt;&amp;gt; a
&amp;gt;&amp;gt; b
&amp;gt;&amp;gt; c
&amp;gt;&amp;gt; \.
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Show only tuples and set the output format:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; \t
Showing only tuples.
=&amp;gt; \a
Output format is unaligned.
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Output to a file:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; \o rejected.txt
=&amp;gt; select rejected_data from t_rejects;
=&amp;gt; \o
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use the &lt;code&gt;cat&lt;/code&gt; command on the saved file:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;
=&amp;gt; \! cat rejected.txt
a
b
c
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;After a file exists, you can fix load errors and use the corrected file as load input to the &lt;code&gt;COPY&lt;/code&gt; statement.&lt;/p&gt;
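&lt;p&gt;As a sketch of that last step, suppose you have edited &lt;code&gt;rejected.txt&lt;/code&gt; so that it contains valid integers. Because &lt;code&gt;\o&lt;/code&gt; writes the file on the machine running vsql, this example assumes the file is still in the client&#39;s working directory and loads it with &lt;code&gt;COPY LOCAL&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; COPY t FROM LOCAL &amp;#39;rejected.txt&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you copy the corrected file to a database node instead, a plain &lt;code&gt;COPY&lt;/code&gt; with the file&#39;s full path on that node is the equivalent form.&lt;/p&gt;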

      </description>
    </item>
    
    <item>
      <title>Data-Load: Saving load exceptions (EXCEPTIONS)</title>
      <link>/en/data-load/handling-messy-data/saving-load-exceptions-exceptions/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-load/handling-messy-data/saving-load-exceptions-exceptions/</guid>
      <description>
        
        
        &lt;p&gt;COPY exceptions consist of informational messages describing why a row of data could not be parsed. The &lt;span class=&#34;sql&#34;&gt;EXCEPTIONS&lt;/span&gt; option lets you specify a file to which &lt;span class=&#34;sql&#34;&gt;COPY&lt;/span&gt; writes exceptions. If you omit this option, &lt;span class=&#34;sql&#34;&gt;COPY&lt;/span&gt; saves exception files to the following path: &lt;em&gt;&lt;code&gt;catalog-dir&lt;/code&gt;&lt;/em&gt;&lt;code&gt;/CopyErrorLogs/&lt;/code&gt;&lt;em&gt;&lt;code&gt;tablename&lt;/code&gt;&lt;/em&gt;&lt;code&gt;-&lt;/code&gt;&lt;em&gt;&lt;code&gt;sourcefilename&lt;/code&gt;&lt;/em&gt;&lt;code&gt;-copy-from-exceptions&lt;/code&gt;, where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;&lt;code&gt;catalog-dir&lt;/code&gt;&lt;/em&gt; is the directory holding the database catalog files&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;&lt;code&gt;tablename&lt;/code&gt;&lt;/em&gt; is the name of the table being loaded into&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;&lt;code&gt;sourcefilename&lt;/code&gt;&lt;/em&gt; is the name of the file being loaded&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

&lt;span class=&#34;sql&#34;&gt;REJECTED DATA AS TABLE&lt;/span&gt; is mutually exclusive with &lt;span class=&#34;sql&#34;&gt;EXCEPTIONS&lt;/span&gt;.

&lt;/div&gt;
&lt;p&gt;The file produced by the &lt;span class=&#34;sql&#34;&gt;EXCEPTIONS&lt;/span&gt; option indicates the line number and the reason for each exception.&lt;/p&gt;
&lt;p&gt;If copying from STDIN, the source file name is &lt;span class=&#34;sql&#34;&gt;STDIN&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;You can specify rejected data and exceptions files for individual files in a data load. Separate rejected data and exception file names with commas in the &lt;span class=&#34;sql&#34;&gt;COPY&lt;/span&gt; statement.&lt;/p&gt;
&lt;p&gt;You must specify a filename in the path to load multiple input files. Keep in mind that long table names combined with long data file names can exceed the operating system&#39;s maximum length (typically 255 characters). To work around file names exceeding the maximum length, use a path for the exceptions file that differs from the default path; for example, &lt;code&gt;/tmp/&amp;lt;shorter-file-name&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;If you specify an &lt;span class=&#34;sql&#34;&gt;EXCEPTIONS&lt;/span&gt; path:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;For one data file, the path must be a file, and COPY stores all information in this file.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;For multiple data files, the path must be a directory. COPY creates one file in this directory for each data file.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Exceptions files are not stored on the initiator node.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You can specify only one path per node.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you do not specify the &lt;span class=&#34;sql&#34;&gt;EXCEPTIONS&lt;/span&gt; path, &lt;span class=&#34;sql&#34;&gt;COPY&lt;/span&gt; stores exception files in the default directory.&lt;/p&gt;
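&lt;p&gt;For a single data file, the options might be combined as in the following sketch. The table name, the input path, and both &lt;code&gt;/tmp&lt;/code&gt; file names are hypothetical:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; COPY my_table FROM &amp;#39;/home/dbadmin/data/my_data.dat&amp;#39; DELIMITER &amp;#39;|&amp;#39;
   REJECTED DATA &amp;#39;/tmp/my_table_rejects.txt&amp;#39;
   EXCEPTIONS &amp;#39;/tmp/my_table_exceptions.txt&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Short paths such as these also help keep file names under the operating system&#39;s length limit mentioned above.&lt;/p&gt;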

      </description>
    </item>
    
    <item>
      <title>Data-Load: COPY rejected data and exception files</title>
      <link>/en/data-load/handling-messy-data/copy-rejected-data-and-exception-files/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-load/handling-messy-data/copy-rejected-data-and-exception-files/</guid>
      <description>
        
        
        &lt;p&gt;When executing a &lt;code&gt;COPY&lt;/code&gt; statement, and parallel processing is ON (the default setting), COPY creates separate threads to process load files. Typically, the number of threads depends on the number of node cores in the system. Each node processes part of the load data. If the load succeeds overall, any parser rejections that occur during load processing are written to that node&#39;s specific rejected data and exceptions files. If the load fails, the rejected data file contents can be incomplete, or empty. If you do not specify a file name explicitly, COPY uses a default name and location for rejected data files. See the next topic for specifying your own rejected data and exception files.&lt;/p&gt;
&lt;p&gt;Both rejected data and exceptions files are saved and stored on a per-node basis. This example uses multiple files as &lt;code&gt;COPY&lt;/code&gt; inputs. Since the statement does not include either the &lt;code&gt;REJECTED DATA&lt;/code&gt; or &lt;code&gt;EXCEPTIONS&lt;/code&gt; parameters, rejected data and exceptions files are written to the default location, the database catalog subdirectory, &lt;code&gt;CopyErrorLogs&lt;/code&gt;, on each node:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;\set dir `pwd`/data/
\set remote_dir /vertica/test_dev/tmp_ms/
\set file1 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:dir&amp;#39;C1_large_tbl.dat&amp;#39;&amp;#39;&amp;#39;
\set file2 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:dir&amp;#39;C2_large_tbl.dat&amp;#39;&amp;#39;&amp;#39;
\set file3 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:remote_dir&amp;#39;C3_large_tbl.dat&amp;#39;&amp;#39;&amp;#39;
\set file4 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:remote_dir&amp;#39;C4_large_tbl.dat&amp;#39;&amp;#39;&amp;#39;
=&amp;gt; COPY large_tbl FROM :file1 ON site01,:file2 ON site01,
               :file3 ON site02,
               :file4 ON site02
               DELIMITER &amp;#39;|&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;specifying-rejected-data-and-exceptions-files&#34;&gt;Specifying rejected data and exceptions files&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;&#39;path&#39;&lt;/code&gt; element of the optional &lt;code&gt;COPY&lt;/code&gt; parameters &lt;code&gt;REJECTED DATA&lt;/code&gt; and &lt;code&gt;EXCEPTIONS&lt;/code&gt; lets you specify a non-default path in which to store the files.&lt;/p&gt;
&lt;p&gt;If &lt;em&gt;path&lt;/em&gt; resolves to a storage location, and the user invoking COPY is not a superuser, these are the required permissions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The storage location must have been created (or altered) with the USER option (see &lt;a href=&#34;../../../en/sql-reference/statements/create-statements/create-location/&#34;&gt;CREATE LOCATION&lt;/a&gt; and &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/storage-functions/alter-location-use/&#34;&gt;ALTER_LOCATION_USE&lt;/a&gt;)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The user must already have been granted READ access to the storage location where the file(s) exist, as described in &lt;a href=&#34;../../../en/sql-reference/statements/grant-statements/grant-storage-location/&#34;&gt;GRANT (storage location)&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Both parameters also have an optional &lt;code&gt;ON&lt;/code&gt; &lt;em&gt;nodename&lt;/em&gt; clause that uses the specified path:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;...[ EXCEPTIONS &lt;span class=&#34;code-variable&#34;&gt;&#39;path&#39;&lt;/span&gt; [ ON &lt;span class=&#34;code-variable&#34;&gt;nodename&lt;/span&gt; ] [, ...] ]...[ REJECTED DATA &lt;span class=&#34;code-variable&#34;&gt;&#39;path&#39;&lt;/span&gt; [ ON &lt;span class=&#34;code-variable&#34;&gt;nodename&lt;/span&gt; ] [, ...] ]
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;While &lt;em&gt;&#39;path&#39;&lt;/em&gt; specifies the location of the rejected data and exceptions files (with their corresponding parameters), the optional &lt;code&gt;ON&lt;/code&gt; &lt;em&gt;nodename&lt;/em&gt; clause moves any existing rejected data and exception files on the node to the specified path on the same node.&lt;/p&gt;
&lt;h2 id=&#34;saving-rejected-data-and-exceptions-files-to-a-single-server&#34;&gt;Saving rejected data and exceptions files to a single server&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;COPY&lt;/code&gt; statement does not have a facility to merge exception and rejected data files after &lt;code&gt;COPY&lt;/code&gt; processing is complete. To see the contents of exception and rejected data files requires accessing each node&#39;s specific files.

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

To save all exceptions and rejected data files on a network host, be sure to give each node&#39;s files unique names, so that different cluster nodes do not overwrite other nodes&#39; files. For instance, if you set up a server with two directories (&lt;code&gt;/vertica/exceptions&lt;/code&gt; and &lt;code&gt;/vertica/rejections&lt;/code&gt;), specify file names for each Vertica cluster node to identify each node, such as &lt;code&gt;node01_exceptions.txt&lt;/code&gt; and &lt;code&gt;node02_exceptions.txt&lt;/code&gt;. This way, each cluster node&#39;s files are easily distinguishable in the exceptions and rejections directories.

&lt;/div&gt;&lt;/p&gt;
&lt;h2 id=&#34;using-vsql-variables-for-rejected-data-and-exceptions-files&#34;&gt;Using VSQL variables for rejected data and exceptions files&lt;/h2&gt;
&lt;p&gt;This example uses &lt;code&gt;vsql&lt;/code&gt; variables to specify the path and file names to use with the &lt;code&gt;EXCEPTIONS&lt;/code&gt; and &lt;code&gt;REJECTED DATA&lt;/code&gt; parameters (&lt;code&gt;except_s1&lt;/code&gt; and &lt;code&gt;reject_s1&lt;/code&gt;). The &lt;code&gt;COPY&lt;/code&gt; statement loads a single input file (&lt;code&gt;C1_large_tbl.dat&lt;/code&gt;) into the &lt;code&gt;large_tbl&lt;/code&gt; table on the initiator node:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;\set dir `pwd`/data/
\set file1 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:dir&amp;#39;C1_large_tbl.dat&amp;#39;&amp;#39;&amp;#39;
\set except_s1 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:dir&amp;#39;exceptions&amp;#39;&amp;#39;&amp;#39;
\set reject_s1 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:dir&amp;#39;rejections&amp;#39;&amp;#39;&amp;#39;

COPY large_tbl FROM :file1 ON site01 DELIMITER &amp;#39;|&amp;#39;
REJECTED DATA :reject_s1 ON site01
EXCEPTIONS :except_s1 ON site01;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This example uses variables to specify exception and rejected data files (&lt;code&gt;except_s2&lt;/code&gt; and &lt;code&gt;reject_s2&lt;/code&gt;) on a remote node. The COPY statement consists of a single input file on a remote node (&lt;code&gt;site02&lt;/code&gt;):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;\set remote_dir /vertica/test_dev/tmp_ms/
\set except_s2 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:remote_dir&amp;#39;exceptions&amp;#39;&amp;#39;&amp;#39;
\set reject_s2 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:remote_dir&amp;#39;rejections&amp;#39;&amp;#39;&amp;#39;

COPY large_tbl FROM :file1 ON site02 DELIMITER &amp;#39;|&amp;#39;
REJECTED DATA :reject_s2 ON site02
EXCEPTIONS :except_s2 ON site02;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This example uses variables to specify that the exception and rejected data files are on a remote node (indicated by &lt;code&gt;:remote_dir&lt;/code&gt;). The inputs to the COPY statement consist of multiple data files on two nodes (&lt;code&gt;site01&lt;/code&gt; and &lt;code&gt;site02&lt;/code&gt;). The &lt;code&gt;exceptions&lt;/code&gt; and &lt;code&gt;rejected data&lt;/code&gt; options use the &lt;code&gt;ON &lt;/code&gt;&lt;em&gt;&lt;code&gt;nodename&lt;/code&gt;&lt;/em&gt; clause with the variables to indicate where the files reside (&lt;code&gt;site01&lt;/code&gt; and &lt;code&gt;site02&lt;/code&gt;):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;\set dir `pwd`/data/
\set remote_dir /vertica/test_dev/tmp_ms/
\set except_s1 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:dir&amp;#39;&amp;#39;&amp;#39;&amp;#39;
\set reject_s1 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:dir&amp;#39;&amp;#39;&amp;#39;&amp;#39;
\set except_s2 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:remote_dir&amp;#39;&amp;#39;&amp;#39;&amp;#39;
\set reject_s2 &amp;#39;&amp;#39;&amp;#39;&amp;#39;:remote_dir&amp;#39;&amp;#39;&amp;#39;&amp;#39;
COPY large_tbl FROM :file1 ON site01,
               :file2 ON site01,
               :file3 ON site02,
               :file4 ON site02
               DELIMITER &amp;#39;|&amp;#39;
               REJECTED DATA :reject_s1 ON site01, :reject_s2 ON site02
               EXCEPTIONS :except_s1 ON site01, :except_s2 ON site02;
&lt;/code&gt;&lt;/pre&gt;
      </description>
    </item>
    
    <item>
      <title>Data-Load: COPY LOCAL rejection and exception files</title>
      <link>/en/data-load/handling-messy-data/copy-local-rejection-and-exception-files/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/data-load/handling-messy-data/copy-local-rejection-and-exception-files/</guid>
      <description>
        
        
        &lt;p&gt;Invoking COPY LOCAL (or COPY LOCAL FROM STDIN) does not automatically create rejected data and exceptions files. This behavior differs from using COPY, which saves both files automatically, regardless of whether you use the optional &lt;code&gt;REJECTED DATA&lt;/code&gt; and &lt;code&gt;EXCEPTIONS&lt;/code&gt; parameters to specify either file explicitly.&lt;/p&gt;
&lt;p&gt;Use the &lt;code&gt;REJECTED DATA&lt;/code&gt; and &lt;code&gt;EXCEPTIONS&lt;/code&gt; parameters with COPY LOCAL and COPY LOCAL FROM STDIN to save the corresponding output files on the client. If you do &lt;em&gt;not&lt;/em&gt; use these options, rejected data parsing events (and the exceptions that describe them) are not retained, even if they occur.&lt;/p&gt;
&lt;p&gt;You can load multiple input files using COPY LOCAL (or COPY LOCAL FROM STDIN). If you also use the &lt;code&gt;REJECTED DATA&lt;/code&gt; and &lt;code&gt;EXCEPTIONS&lt;/code&gt; options, the statement writes rejected rows and exceptions to separate files. The rejected data file contains all rejected rows, and the exceptions file contains all corresponding exceptions, regardless of how many input files were loaded.&lt;/p&gt;
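&lt;p&gt;As a minimal sketch of such a multi-file load, the following statement assumes two hypothetical client-side input files (&lt;code&gt;part1.dat&lt;/code&gt; and &lt;code&gt;part2.dat&lt;/code&gt;) and illustrative output paths; only the &lt;code&gt;large_tbl&lt;/code&gt; table name is taken from the other examples in this section. Under these assumptions, rows rejected from either input file would be written to the one rejected data file, and their descriptions to the one exceptions file:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Input file names and output paths are illustrative assumptions.
COPY large_tbl FROM LOCAL &amp;#39;data/part1.dat&amp;#39;, &amp;#39;data/part2.dat&amp;#39;
DELIMITER &amp;#39;|&amp;#39;
REJECTED DATA &amp;#39;except_reject/copyLocal.rejected&amp;#39;
EXCEPTIONS &amp;#39;except_reject/copyLocal.exceptions&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;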
&lt;p&gt;If COPY LOCAL does not reject any rows, it does not create either file.&lt;/p&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

Because COPY LOCAL (and COPY LOCAL FROM STDIN) must write any rejected rows and exceptions to the client, you cannot use the &lt;code&gt;[ON nodename ]&lt;/code&gt; clause with either the &lt;code&gt;rejected data&lt;/code&gt; or &lt;code&gt;exceptions&lt;/code&gt; options.

&lt;/div&gt;
&lt;h2 id=&#34;specifying-rejected-data-and-exceptions-files&#34;&gt;Specifying rejected data and exceptions files&lt;/h2&gt;
&lt;p&gt;To save any rejected data and their exceptions to files:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;In the COPY LOCAL (or COPY LOCAL FROM STDIN) statement, use the &lt;code&gt;REJECTED DATA &#39;path&#39;&lt;/code&gt; and &lt;code&gt;EXCEPTIONS &#39;path&#39;&lt;/code&gt; parameters.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Specify two different file names for the two options. You cannot use one file for both the &lt;code&gt;REJECTED DATA&lt;/code&gt; and the &lt;code&gt;EXCEPTIONS&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;When you invoke COPY LOCAL or COPY LOCAL FROM STDIN, the files you specify do not need to exist beforehand. If they do exist, COPY LOCAL must be able to overwrite them.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;You can specify the path and file names with vsql variables:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;\set rejected ../except_reject/copyLocal.rejected
\set exceptions ../except_reject/copyLocal.exceptions
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

&lt;code&gt;COPY LOCAL&lt;/code&gt; does not support storing rejected data in a table, as the &lt;code&gt;COPY&lt;/code&gt; statement does.

&lt;/div&gt;
&lt;p&gt;When you use the COPY LOCAL or COPY LOCAL FROM STDIN statement, specify the variable names for the files with their corresponding parameters:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; COPY large_tbl FROM LOCAL rejected data :rejected exceptions :exceptions;
=&amp;gt; COPY large_tbl FROM LOCAL STDIN rejected data :rejected exceptions :exceptions;
&lt;/code&gt;&lt;/pre&gt;
      </description>
    </item>
    
  </channel>
</rss>
