Loading columnar tables with flex parsers

You can use any of the flex parsers to load data into columnar tables.

You can use any of the flex parsers to load data into columnar tables. Using the flex table parsers to load columnar tables gives you the capability to mix data loads in one table. For example, you can load JSON data in one session and delimited data in another.

The following basic examples illustrate how you can use flex parsers with columnar tables.

  1. Create a columnar table, super, with two columns, age and name:

    => CREATE TABLE super(age INT, name VARCHAR);
    CREATE TABLE
    
  2. Enter JSON values from STDIN, using the fjsonparser():

    => COPY super FROM stdin PARSER fjsonparser();
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> {"age": 5, "name": "Tim"}
    >>  {"age": 3}
    >>  {"name": "Fred"}
    >>  {"name": "Bob", "age": 10}
    >> \.
    
  3. Query the table to see the values you entered:

    
    => SELECT * FROM super;
     age | name
    -----+------
         | Fred
      10 | Bob
       5 | Tim
       3 |
    (4 rows)
    
  4. Enter some delimited values from STDIN, using the fdelimitedparser():

    => COPY super FROM stdin PARSER fdelimitedparser();
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> name |age
    >> Tim|50
    >> |30
    >> Fred|
    >> Bob|100
    >> \.
    
  5. Query the flex table. Both JSON and delimited data are saved in the same columnar table, super.

    => SELECT * FROM super;
     age | name
    -----+------
      50 | Tim
      30 |
       3 |
       5 | Tim
     100 | Bob
         | Fred
      10 | Bob
         | Fred
    (8 rows)
    

Use the reject_on_materialized_type_error parameter to avoid loading data with type mismatch. If reject_on_materialized_type_error is set to false, the flex parser will accept the data with type mismatch. Consider the following example:

Assume that the CSV file to be loaded has the following sample contents:

$ cat json.dat
{"created_by":"system","site_source":"flipkart_india_kol","updated_by":"system1","invoice_id":"INVDPKOL100",
"vendor_id":"VEN15731","total_quantity":12,"created_at":"2012-01-09 23:15:52.0"}
{"created_by":"system","site_source":"flipkart_india_kol","updated_by":"system2","invoice_id":"INVDPKOL101",
"vendor_id":"VEN15732","total_quantity":14,"created_at":"hello"}
  1. Create a columnar table.

    => CREATE TABLE hdfs_test (
    site_source VARCHAR(200),
    total_quantity int ,
    vendor_id varchar(200),
    invoice_id varchar(200),
    updated_by varchar(200),
    created_by varchar(200),
    created_at timestamp
    );
    
  2. Load JSON data.

    =>COPY hdfs_test FROM '/home/dbadmin/json.dat' PARSER fjsonparser() ABORT ON ERROR;
    Rows Loaded
    -------------
    2
    (1 row)
    
  3. View the contents.

    => SELECT * FROM hdfs_test;
    site_source | total_quantity | vendor_id | invoice_id | updated_by | created_by | created_at
    --------------------+----------------+-----------+-------------+------------+------------+---------------------
    flipkart_india_kol | 12 | VEN15731 | INVDPKOL100 | system1 | system | 2012-01-09 23:15:52
    flipkart_india_kol | 14 | VEN15732 | INVDPKOL101 | system2 | system |
    (2 rows)
    
  4. If reject_on_materialized_type_error parameter is set to true, you will receive errors when loading the sample JSON data.

    => COPY hdfs_test FROM '/home/dbadmin/data/flex/json.dat' PARSER fjsonparser(reject_on_materialized_type_error=true) ABORT ON ERROR;
    ERROR 2035:  COPY: Input record 2 has been rejected (Rejected by user-defined parser)