Querying flex tables

After you create your flex table (with or without additional columns) and load data, you can use SELECT, COPY, TRUNCATE, and DELETE as with other tables.

After you create your flex table (with or without additional columns) and load data, you can use SELECT, COPY, TRUNCATE, and DELETE as with other tables. You can use SELECT queries for virtual columns that exist in the __raw__ column and other real columns in your flex tables. Field names in the __raw__ column are case-insensitive, as are the names of real columns.

Unsupported DDL and DML statements

You cannot use the following DDL and DML statements with flex tables:

  • CREATE TABLE...AS...

  • CREATE TABLE...LIKE...

  • SELECT INTO

  • UPDATE

  • MERGE

Determining flex table data contents

If you do not know what your flex table contains, two helper functions explore the VMap data to determine its contents. Use these functions to compute the keys in the flex table __raw__ column and, optionally, build a view based on those keys:

For more information about these and other helper functions, see Flex data functions.

To determine what key value pairs exist as virtual columns:

  1. Call the function as follows:

    => SELECT compute_flextable_keys('darkdata');
                compute_flextable_keys
    --------------------------------------------------
     Please see public.darkdata_keys for updated keys(1 row)
    
  2. View the key names by querying the darkdata_keys table:

    => SELECT * FROM darkdata_keys;
    
                             key_name                    | frequency |   data_type_guess
    -----------------------------------------------------+-----------+-------------
     contributors                                        |         8 | varchar(20)
     coordinates                                         |         8 | varchar(20)
     created_at                                          |         8 | varchar(60)
     entities.hashtags                                   |         8 | long varbinary(186)
     .
     .
     retweeted_status.user.time_zone                     |         1 | varchar(20)
     retweeted_status.user.url                           |         1 | varchar(68)
     retweeted_status.user.utc_offset                    |         1 | varchar(20)
     retweeted_status.user.verified                      |         1 | varchar(20)
    (125 rows)
    

Querying virtual columns

Continuing with the JSON data example, use a SELECT statement query to explore content from the virtual columns. Then, analyze what is most important to you in case you want to materialize any virtual columns. This example shows how to query some common virtual columns in the VMap data:

=> SELECT "user.name", "user.lang", "user.geo_enabled" FROM darkdata1;
      user.name      | user.lang | user.geo_enabled
---------------------+-----------+------------------
 laughing at clouds. | it        | T
 Avita Desai         | en        | F
 I'm Toaster⥠      | es        | T
                     |           |
                     |           |
                     |           |
 Uptown gentleman.   | en        | F
 ~G A B R I E L A â¿ | en        | F
 Flu Beach           | es        | F
                     |           |
 seydo shi           | tr        | T
 The End             | en        | F
(12 rows)

Querying flex table keys

If you reference an undefined column ('which_column') in a flex table query, Vertica converts the query to a call to the MAPLOOKUP function as follows:

MAPLOOKUP(__raw__, 'which_column')

The MAPLOOKUP function searches the VMap data for the requested key and returns the following information:

  • String values associated with the key for a row.

  • NULL if the key is not found.

For more information about handling NULL values, see MAPCONTAINSKEY().

Using functions and casting in flex table queries

You can cast the virtual columns as required and use functions in your SELECT statement queries. The next example uses a SELECT statement to query the created_at and retweet_count virtual columns, and to cast their values in the process:

=> SELECT "created_at"::TIMESTAMP, "retweet_count"::INT FROM darkdata1 ORDER BY 1 DESC;
     created_at      | retweet_count
---------------------+---------------
 2012-10-15 14:41:05 |             0
 2012-10-15 14:41:05 |             0
 2012-10-15 14:41:05 |             0
 2012-10-15 14:41:05 |             0
 2012-10-15 14:41:05 |             0
 2012-10-15 14:41:05 |             0
 2012-10-15 14:41:05 |             0
 2012-10-15 14:41:04 |             1
                     |
                     |
                     |
                     |
(12 rows)

The following query uses the COUNT and AVG functions to determine the average length of text in different languages:

=> SELECT "user.lang", count (*), avg(length("text"))::int
   FROM darkdata1 GROUP BY 1 ORDER BY 2 DESC;
 user.lang | count | avg
-----------+-------+-----
 en        |     4 |  42
           |     4 |
 es        |     2 |  96
 it        |     1 |  50
 tr        |     1 |  16
(5 rows)

Casting data types in a query

The following query requests the values of the created_at virtual column, without casting to a specific data type:

=> SELECT "created_at" FROM darkdata1;
           created_at
--------------------------------
 Mon Oct 15 18:41:04 +0000 2012
 Mon Oct 15 18:41:05 +0000 2012
 Mon Oct 15 18:41:05 +0000 2012
 Mon Oct 15 18:41:05 +0000 2012
 Mon Oct 15 18:41:05 +0000 2012
 Mon Oct 15 18:41:05 +0000 2012
 Mon Oct 15 18:41:05 +0000 2012
 Mon Oct 15 18:41:05 +0000 2012
(12 rows)

The next example queries the same virtual column, casting created_at to a TIMESTAMP. Casting results in different output and the regional time:

=> SELECT "created_at"::TIMESTAMP FROM darkdata1 ORDER BY 1 DESC;
     created_at
---------------------
 2012-10-15 14:41:05
 2012-10-15 14:41:05
 2012-10-15 14:41:05
 2012-10-15 14:41:05
 2012-10-15 14:41:05
 2012-10-15 14:41:05
 2012-10-15 14:41:05
 2012-10-15 14:41:04

Accessing an epoch key

The term EPOCH (all uppercase letters) is reserved in Vertica for internal use.

If your JSON data includes a virtual column called epoch, you can query it within your flex table. However, use the MAPLOOKUP function to do so.