Creating flex tables

You can create a flex table or an external flex table without column definitions or other parameters.

You can create a flex table or an external flex table without column definitions or other parameters. You can use any CREATE TABLE statement parameters you prefer, as usual.

Creating basic flex tables

Here's how to create the table:

=> CREATE FLEX TABLE darkdata();
CREATE TABLE

Selecting from the table before loading any data into it reveals its two real columns, __identity__ and __raw__:

=> SELECT * FROM darkdata;
__identity__    | __raw__
--------------+---------
(0 rows)

Below is an example of creating a flex table with a column definition:

=> CREATE FLEX TABLE darkdata1(name VARCHAR);
CREATE TABLE

When flex tables exist, you can add new columns (including those with default derived expressions), as described in Materializing Flex Tables.

Materializing flex table virtual columns

After you create your flex table and load data, you compute keys from virtual columns. After completing those tasks, you can materialize some keys by promoting virtual columns to real table columns. By promoting virtual columns, you query real columns rather than the raw data.

You can promote one or more virtual columns — materializing those keys from within the __raw__ data to real columns. Vertica recommends this approach to get the best query performance for all important keys. You don't need to create new columnar tables from your flex table.

Materializing flex table columns results in a hybrid table. Hybrid tables:

  • Maintain the convenience of a flex table for loading unstructured data

  • Improve query performance for any real columns

If you have only a few columns to materialize, try altering your flex table progressively, adding columns whenever necessary. You can use the ALTER TABLE...ADD COLUMN statement to do so, just as you would with a columnar table. See Materializing flex tables for ideas about adding columns.

If you want to materialize columns automatically, use the helper function MATERIALIZE_FLEXTABLE_COLUMNS.

Creating columnar tables from flex tables

You can create a regular Vertica table from a flex table, but you cannot use one flex table to create another.

Typically, you create a columnar table from a flex table after loading data. Then, you specify the virtual column data you want in a regular table, casting virtual columns to regular data types.

To create a columnar table from a flex table, darkdata, select two virtual columns, (user.lang and user.name), for the new table. Use a command such as the following, which casts both columns to varchars for the new table:

=> CREATE TABLE darkdata_full AS SELECT "user.lang"::VARCHAR, "user.name"::VARCHAR FROM darkdata;
CREATE TABLE
=> SELECT * FROM darkdata_full;
 user.lang |      user.name
-----------+---------------------
 en        | Frederick Danjou
 en        | The End
 en        | Uptown gentleman.
 en        | ~G A B R I E L A â¿
 es        | Flu Beach
 es        | I'm Toasterâ¥
 it        | laughing at clouds.
 tr        | seydo shi
           |
           |
           |
           |
(12 rows)

Creating temporary flex tables

Before you create temporary global and local flex tables, be aware of the following considerations:

  • GLOBAL TEMP flex tables are supported. Creating a temporary global flex table results in the flextable_keys table and the flextable_view having temporary table restrictions for their content.

  • LOCAL TEMP flex tables must include at least one column definition. The reason for this requirement is that local temp tables do not support automatically-incrementing data (such as the flex table default __identity__ column). Creating a temporary local flex table results in the flextable_keys table and the flextable_view existing in the local temporary object scope.

  • LOCAL TEMP views are supported for flex and columnar temporary tables.

For global or local temp flex tables to function correctly, you must also specify the ON COMMIT PRESERVE ROWS clause. You must use the ON COMMIT clause for the flex table helper functions, which rely on commits. Create a local temp table as follows:

=> CREATE FLEX LOCAL TEMP TABLE good(x int) ON COMMIT PRESERVE ROWS;
CREATE TABLE

After creating a local temporary flex table using this approach, you can then load data into the table, create table keys, and a flex table view:

=> COPY good FROM '/home/release/KData/bake.json' PARSER fjsonparser();
 Rows Loaded
-------------
           1
(1 row)

=> select compute_flextable_keys_and_build_view('good');
                            compute_flextable_keys_and_build_view
----------------------------------------------------------------------------------------------
 Please see v_temp_schema.good_keys for updated keys
The view good_view is ready for querying
(1 row)

Similarly, you can create global temp tables as follows:

=> CREATE FLEX GLOBAL TEMP TABLE good_global(x int) ON COMMIT PRESERVE ROWS;

After creating a global temporary flex table using this approach, you can then load data into the table, create table keys, and a flex table view:

=> COPY good_global FROM '/home/dbadmin/data/flex/bake_single.json' PARSER fjsonparser();
Rows Loaded
-------------
5
(1 row)

=> SELECT compute_flextable_keys_and_build_view('good_global');
                            compute_flextable_keys_and_build_view
----------------------------------------------------------------------------------------------
 Please see v_temp_schema.good_keys for updated keys
The view good_view is ready for querying
(1 row)

Creating external flex tables

To create an external flex table:

=> CREATE flex external table mountains() AS COPY FROM 'home/release/KData/kmm_ountains.json' PARSER fjsonparser();
CREATE TABLE

As with other flex tables, creating an external flex table produces two regular tables: the named table and its associated _keys table. The keys table is not an external table:

=> \dt mountains
                 List of tables
 Schema |   Name    | Kind  |  Owner  | Comment
--------+-----------+-------+---------+---------
 public | mountains | table | release |
(1 row)

You can use the helper function, COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW, to compute keys and create a view for the external table:

=> SELECT compute_flextable_keys_and_build_view ('appLog');

                     compute_flextable_keys_and_build_view
--------------------------------------------------------------------------------------------------
Please see public.appLog_keys for updated keys
The view public.appLog_view is ready for querying
(1 row)

Check the keys from the _keys table for the results of running the helper application:

=> SELECT * FROM appLog_keys;
                          key_name                       | frequency |   data_type_guess
----------------------------------------------------------+-----------+------------------
contributors                                             |         8 | varchar(20)
coordinates                                              |         8 | varchar(20)
created_at                                               |         8 | varchar(60)
entities.hashtags                                        |         8 | long varbinary(186)
.
.
.
retweeted_status.user.time_zone                          |         1 | varchar(20)
retweeted_status.user.url                                |         1 | varchar(68)
retweeted_status.user.utc_offset                         |         1 | varchar(20)
retweeted_status.user.verified                           |         1 | varchar(20)
(125 rows)

You can query the view:

=> SELECT "user.lang" FROM appLog_view;
 user.lang
-----------
it
en
es
en
en
es
tr
en
(12 rows)

Creating a flex table from query results

You can use the CREATE FLEX TABLE AS statement to create a flex table from the results of a query.

You can use this statement to create three types of flex tables:

  • Flex table with no materialized columns

  • Flex table with some materialized columns

  • Flex table with all materialized columns


When a flex __raw__ column is present in the CTAS query, the entire source VMap is carried to the flex table. If the query has matching column names, the key values are overridden.

Examples

Creating a flex table with no materialized columns from a regular table causes the results of the query to be stored in the __raw__ column as a VMap.

  1. Create a regular table named pets with two columns:

    => CREATE TABLE pets(age INT, name VARCHAR);
    CREATE TABLE
    
  2. Create a flex table named family_pets by using the CTAS statement to copy the columns age and name from the pets:

    => CREATE FLEX TABLE family_pets() AS SELECT age, name FROM pets;
    CREATE TABLE
    
  3. View the new flex table to confirm the operation has been successful and that the columns age and name have not been materialized.

    => \d family_pets;
        List of Fields by Tables
    Schema | Table       |    Column    |          Type          |  Size  | Default | Not Null | Primary Key | Foreign Key
    --------+------------+--------------+------------------------+--------+---------+----------+-------------+-------------
    public | family_pets | __identity__ | int                    |      8 |         | t        | f           |
    public | family_pets | __raw__      | long varbinary(130000) | 130000 |         | t        | f           |
    (2 rows)
    

    You can create a flex table with no materialized columns from the results of a query of another flex table. This inserts all the VMaps from the source flex table into the target. This creates a flex table segmented and ordered by the __identity__ column.

  4. Create a flex table named city_pets by using the CTAS statement to copy the age and __raw__ columns from family_pets:

    => CREATE FLEX TABLE city_pets() AS SELECT age, __raw__ FROM family_pets;
    CREATE TABLE
    
  5. View the new flex table to confirm that the operation has been successful and the columnsageand __raw__ have not been materialized.

    => SELECT * FROM city_pets;
        List of Fields by Tables
    Schema | Table     |    Column    |          Type          |  Size  | Default | Not Null | Primary Key | Foreign Key
    --------+----------+--------------+------------------------+--------+---------+----------+-------------+-------------
    public | city_pets | __identity__ | int                    |      8 |         | t        | f           |
    public | city_pets | __raw__      | long varbinary(130000) | 130000 |         | t        | f           |
    (2 rows)
    

    You can create a flex table with some materialized columns. This uses a syntax similar to the syntax for creating columnar tables with some materialized columns. Unlike columnar tables, however, you need to match the number of columns with the columns that are returned by the query. In the following example, our query returns three columns (amount, type, and available), but Vertica only materializes the first two.

  6. Create a table named animals with three columns, amount, type, and available:

    => CREATE TABLE animals(amount INT, type VARCHAR, available DATE);
    
  7. Create a flex table named inventory with columns animal_amount and animal_type using the CTAS statement to copy columns amount, type, and available from animals.

    => CREATE FLEX TABLE inventory(animal_amount, animal_type) AS SELECT amount, type, available FROM animals;
    CREATE TABLE
    
  8. View the table data to confirm that columns amount and type have been materialized under the column names animal_amount and animal_type. Column available from animals has also been copied over but was not materialized:

    => \d inventory
        List of Fields by Tables
    Schema | Table | Column           |          Type          |  Size  | Default | Not Null | Primary Key | Foreign Key
    --------+-------+-----------------+------------------------+--------+---------+---------+-------------+-------------
    public | flex3 | __raw__          | long varbinary(130000) | 130000 |         | t       | f           |
    public | flex3 | animal_amount    | int                    |      8 |         | f       | f           |
    public | flex3 | animal_type      | varchar(80)            |     80 |         | f       | f           |
    (3 rows)
    

    Notice that including empty parentheses in the statement results in a flex table with no materialized columns:

  9. Create a flex table named animals_for_sale using the CTAS statement with empty parentheses to copy columns amount, type, and available from animals into a pure flex table:

    => CREATE FLEX TABLE animals__for_sale() AS SELECT amount, type, available FROM animals;
    CREATE TABLE
    
  10. View the table data to confirm that no columns were materialized:

    =>\d animals_for_sale;
        List of Fields by Tables
    Schema | Table            |    Column    |          Type          |  Size  | Default | Not Null | Primary Key | Foreign Key
    --------+-----------------+--------------+------------------------+--------+---------+----------+------------+-------------
    public | animals_for_sale | __identity__ | int                    |      8 |         | t        | f          |
    public | animals_for_sale | __raw__      | long varbinary(130000) | 130000 |         | t        | f          |
    (2 rows)
    

    Omitting any parentheses in the statement causes all columns to be materialized:

  11. Create a flex table named animals_sold using the CTAS statement without parentheses. This copies columns amount, type, and available from animals and materialize all columns:

    => CREATE FLEX TABLE animals_sold AS SELECT amount, type, available FROM animals;
    CREATE TABLE
    
  12. View the table data to confirm that all columns were materialized:

    => \d animals_sold;
        List of Fields by Tables
    Schema | Table        | Column    |          Type          |  Size  | Default | Not Null | Primary Key | Foreign Key
    --------+-------------+-----------+------------------------+--------+---------+----------+-------------+-------------
    public | animals_sold | __raw__   | long varbinary(130000) | 130000  |         | t        | f           |
    public | animals_sold | amount    | int                    |      8  |         | f        | f           |
    public | animals_sold | type      | varchar(80)            |     80  |         | f        | f           |
    public | animals_sold | available | date                   |      8  |         | f        | f           |
    (4 rows)