Loading sample data

OpenText Analytics Database offers several solutions for loading files with structured and unstructred data, and from several formats.

OpenText™ Analytics Database offers several solutions for loading files with structured and unstructred data, and from several formats.

Creating a sample data file

Create a sample CSV file called cities.csv with the following contents and save it to /home/dbadmin/cities.csv.

City,State,Zip,Population
Boston,MA,02108,694583
Chicago,IL,60601,2705994
Seattle,WA,98101,744955
Dallas,TX,75201,1345047
New York,NY,10001,8398748

Loading structured data from a file

  1. Run vsql.

    $ vsql
    
  2. Create the cities table.

    => CREATE TABLE cities (
    city       varchar(20),
    state      char(2),
    zip        int,
    population int
    );
    
  3. Use the COPY statement to load the data from the cities.csv file. The following command logs exceptions and rejections in the home directory.

    => COPY cities FROM LOCAL '/home/dbadmin/cities.csv' DELIMITER ',' NULL '' EXCEPTIONS '/home/dbadmin/cities_exceptions.log'
    REJECTED DATA '/home/dbadmin/cities_rejections.log';
    
  4. Review the rejections log for what data was excluded. Here, the header was excluded.

    $ cat /home/dbadmin/cities_rejections.log
    
    City,State,Zip,Population
    
  5. Review the exceptions for details on the error. In this case, the header failed the database's integer data type verification.

    $ cat /home/dbadmin/cities_exceptions.log
    
    COPY: Input record 1 has been rejected (Invalid integer format 'Zip' for column 3 (zip)).
    Please see /home/dbadmin/cities_rejections.log, record 1 for the rejected record. This record was record 1 from cities.csv
    
  6. To fix this, add SKIP 1 to the original COPY statement. This excludes the first row.

    => COPY cities FROM LOCAL '/home/dbadmin/cities.csv' DELIMITER, 'NULL'
    EXCEPTIONS '/home/dbadmin/cities_exceptions.log'
    REJECTED DATA '/home/dbadmin/cities_rejections.log' SKIP 1;
    

Loading unstructured data with flex tables

To load data from another source, the database uses Flex tables. Flex tables simplify data loading by allowing you to load unstructured or "semi-structured" data without having to create a schema or column definitions.

Supported formats include:

  • Avro Data

  • CEF

  • CSV

  • Delimited

  • JSON

  1. Create a table called cities_flex. Notice how it does not include column names or data types.

    => CREATE FLEXIBLE TABLE cities_flex();
    
  2. Load the CSV file into the table.

    => COPY cities_flex FROM '/source/cities.csv' PARSER FDELIMITEDPARSER (delimiter=',');
    
  3. Query the cities_flex table, specifying the column names from the original CSV file.

    => SELECT city, state FROM cities_flex;