FCSVPARSER
Parses CSV format (comma-separated values) data. Use this parser to load CSV data into columnar, flex, and hybrid tables. All data must be encoded in Unicode UTF-8 format. The fcsvparser
parser supports the RFC 4180 standard for CSV data, and other options, to accommodate variations in CSV file format definitions. Invalid records are rejected. For more information about data formats, see Handling Non-UTF-8 input.
This parser is for use in Flex tables only. All flex parsers store the data as a single VMap in the LONG VARBINAR_raw__
column. If a data row is too large to fit in the column, it is rejected. Vertica supports null values for loading data with NULL-specified columns.
Syntax
FCSVPARSER ( [parameter='value'[,...]] )
Parameters
type
- The default parameter values for the parser, one of the following strings:
-
rfc4180
-
traditional
You do not have to use the type parameter when loading data that conforms to the RFC 4180 standard (such as MS Excel files). See Loading CSV data for the
RFC4180
default parameters, and other options you can specify for traditional CSV files.Default:
RFC4180
-
delimiter
- A single-character value used to separate fields in the CSV data.
Default:
,
(forrfc4180
andtraditional
) escape
- A single-character value used as an escape character to interpret the next character in the data literally.
Default:
-
rfc4180
:"
-
traditional
:\
-
enclosed_by
- A single-character value. Use
enclosed_by
to include a value that is identical to the delimiter, but should be interpreted literally. For example, if the data delimiter is a comma (,
), and you want to use a comma within the data ("my name is jane, and his is jim"
).Default:
"
record_terminator
- A single-character value used to specify the end of a record.
Default:
-
rfc4180
:\n
-
traditional
:\r\n
-
header
- Boolean, specifies whether to use the first row of data as a header column. When
header=true
(default), and no header exists, fcsvparser uses a default column heading. The default header consists ofucol
n
, where n is the column offset number, starting with0
for the first column. You can specify custom column heading names using theheader_names
parameter, described next.If you specify
header=false
, thefcsvparser
parses the first row of input as data, rather than as column headers.Default: true
header_names
- A list of column header names, delimited by the character defined by the parser's delimiter parameter. Use this parameter to specify header names in a CSV file without a header row, or to override the column names present in the CSV source. To override one or more existing column names, specify the header names to use. This parameter overrides any header row in the data.
trim
- Boolean, specifies whether to trim white space from header names and key values.
Default: true
omit_empty_keys
- Boolean, specifies how the parser handles header keys without values. If true, keys with an empty value in the
header
row are not loaded.Default: false
reject_on_duplicate
- Boolean, specifies whether to ignore duplicate records (false), or to reject duplicates (true). In either case, the load continues.
Default: false
reject_on_empty_key
- Boolean, specifies whether to reject any row containing a key without a value.
Default: false
reject_on_materialized_type_error
- Boolean, specifies whether to reject any materialized column value that the parser cannot coerce into a compatible data type. See Loading CSV data.
Default: false
Examples
The following example loads data that complies with RFC 4180:
Note the value 20"5
, which came from the input "20""5"
. Per RFC 4180, to include a double-quote character in a string, you must double-quote the entire string and then escape the double-quote with another double-quote.
For more information and examples using other parameters of this parser, see Loading CSV data.