Loading sample data
Vertica offers several solutions for loading files with structured and unstructred data, and from several formats.
Creating a sample data file
Create a sample CSV file called cities.csv
with the following contents and save it to /home/dbadmin/cities.csv.
City,State,Zip,Population
Boston,MA,02108,694583
Chicago,IL,60601,2705994
Seattle,WA,98101,744955
Dallas,TX,75201,1345047
New York,NY,10001,8398748
Loading structured data from a file
-
Run vsql.
$ vsql
-
Create the cities table.
=> CREATE TABLE cities ( city varchar(20), state char(2), zip int, population int );
-
Use the COPY statement to load the data from the
cities.csv
file. The following command logs exceptions and rejections in the home directory.=> COPY cities FROM LOCAL '/home/dbadmin/cities.csv' DELIMITER ',' NULL '' EXCEPTIONS '/home/dbadmin/cities_exceptions.log' REJECTED DATA '/home/dbadmin/cities_rejections.log';
-
Review the rejections log for what data was excluded. Here, the header was excluded.
$ cat /home/dbadmin/cities_rejections.log City,State,Zip,Population
-
Review the exceptions for details on the error. In this case, the header failed Vertica's integer data type verification.
$ cat /home/dbadmin/cities_exceptions.log COPY: Input record 1 has been rejected (Invalid integer format 'Zip' for column 3 (zip)). Please see /home/dbadmin/cities_rejections.log, record 1 for the rejected record. This record was record 1 from cities.csv
-
To fix this, add SKIP 1 to the original COPY statement. This excludes the first row.
=> COPY cities FROM LOCAL '/home/dbadmin/cities.csv' DELIMITER, 'NULL' EXCEPTIONS '/home/dbadmin/cities_exceptions.log' REJECTED DATA '/home/dbadmin/cities_rejections.log' SKIP 1;
Loading unstructured data with flex tables
To load data from another source, Vertica uses Flex tables. Flex tables simplify data loading by allowing you to load unstructured or "semi-structured" data without having to create a schema or column definitions.
Supported formats include:
-
Avro Data
-
CEF
-
CSV
-
Delimited
-
JSON
-
Create a table called cities_flex. Notice how it does not include column names or data types.
=> CREATE FLEXIBLE TABLE cities_flex();
-
Load the CSV file into the table.
=> COPY cities_flex FROM '/source/cities.csv' PARSER FDELIMITEDPARSER (delimiter=',');
-
Query the cities_flex table, specifying the column names from the original CSV file.
=> SELECT city, state FROM cities_flex;