This section contains helper functions for use in working with flex tables and flexible columns for complex types. You can use these functions with flex tables, their associated flex_table_keys tables and flex_table_view views, and flexible columns in external tables. These functions do not apply to other tables.
For more information about flex tables, see Flex tables. For more information about flexible columns for complex types, see Flexible complex types.
Flex functions allow you to manage and query flex tables. You can also use the map functions to query flexible complex-type columns in non-flex tables.
1 - Flex data functions
The flex table data helper functions supply information you need to directly query data in flex tables.
The flex table data helper functions supply information you need to directly query data in flex tables. After you compute keys and create views from the raw data, you can use field names directly in queries instead of using map functions to extract data. The fata functions are:
COMPUTE_FLEXTABLE_KEYS: Computes map keys from the map data in a flex table and populates a keys table with the results. Use this function before building a view.
BUILD_FLEXTABLE_VIEW: Uses the keys in a table to create a view definition for the source table. Use this function after computing flex table keys.
RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW: Replaces the flextable_data_keys table and the flextable_data_view view, linking both the keys table and the view to the parent flex table.
Flex table dependencies
Each flex table has two dependent objects, a keys table and a view. While both objects are dependent on their parent table, you can drop either object independently. Dropping the parent table removes both dependents, without a CASCADE option.
Associating flex tables and views
The helper functions automatically use the dependent table and view if they are internally linked with the parent table. You create both when you create the flex table. You can drop either the keys table or the view and re-create objects of the same name. However, if you do so, the new objects are not internally linked with the parent flex table.
In this case, you can restore the internal links of these objects to the parent table. To do so, drop the keys table and the view before calling the RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW function. Calling this function re-creates the keys table and view.
The remaining helper functions perform the tasks described in this section.
1.1 - BUILD_FLEXTABLE_VIEW
Creates, or re-creates, a view for a default or user-defined keys table, ignoring any empty keys.
Creates, or re-creates, a view for a default or user-defined keys table, ignoring any empty keys.
Note
If the length of a key exceeds 65,000, Vertica truncates the key.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Database and schema. The default schema is public. If you specify a database, it must be the current database.
flex-table
The flex table name. By default, this function builds or rebuilds a view for the input table with the current contents of the associated flex_table_keys table.
view-name
A custom view name. Use this option to build a new view for flex-table with the name you specify.
user-keys-table
Name of a keys table from which to create the view. Use this option if you created a custom keys table from the flex table map data, rather than from the default flex_table_keys table. The function builds a view from the keys in user_keys, rather than from flex_table_keys.
Examples
The following examples show how to call BUILD_FLEXTABLE_VIEW with 1, 2, or 3 arguments.
To create, or re-create, a default view:
Call the function with an input flex table:
=> SELECT BUILD_FLEXTABLE_VIEW('darkdata');
build_flextable_view
-----------------------------------------------------
The view public.darkdata_view is ready for querying
(1 row)
The function creates a view with the default name (darkdata_view) from the darkdata_keys table.
To create, or re-create, a view with a custom name:
Call the function with two arguments, an input flex table, darkdata, and the name of the view to create, dd_view:
=> SELECT BUILD_FLEXTABLE_VIEW('darkdata', 'dd_view');
build_flextable_view
-----------------------------------------------
The view public.dd_view is ready for querying
(1 row)
Query a key name (user.lang) from the new or updated view (dd_view):
=> SELECT "user.lang" FROM dd_view;
user.lang
-----------
tr
en
es
en
en
it
es
en
(12 rows)
To create a view from a custom keys table with BUILD_FLEXTABLE_VIEW, the custom table must have the same schema and table definition as the default table (darkdata_keys). Create a custom keys table, using any of these three approaches:
Create a columnar table with all keys from the default keys table for a flex table (darkdata_keys):
=> CREATE TABLE new_darkdata_keys AS SELECT * FROMdarkdata_keys;
CREATE TABLE
Create a columnar table without content (LIMIT 0) from the default keys table for a flex table (darkdata_keys):
=> CREATE TABLE new_darkdata_keys AS SELECT * FROM darkdata_keys LIMIT 0;
CREATE TABLE
kdb=> SELECT * FROM new_darkdata_keys;
key_name | frequency | data_type_guess
----------+-----------+-----------------
(0 rows)
Create a columnar table without content (LIMIT 0) from the default keys table, and insert two values ('user.lang', 'user.name') into the key_name column:
=> CREATE TABLE dd_keys AS SELECT * FROM darkdata_keys limit 0;
CREATE TABLE
=> INSERT INTO dd_keys (key_name) values ('user.lang');
OUTPUT
--------
1
(1 row)
=> INSERT INTO dd_keys (key_name) values ('user.name');
OUTPUT
--------
1
(1 row)
=> SELECT * FROM dd_keys;
key_name | frequency | data_type_guess
-----------+-----------+-----------------
user.lang | |
user.name | |
(2 rows)
After creating a custom keys table, call BUILD_FLEXTABLE_VIEW with all arguments (an input flex table, the new view name, the custom keys table):
=> SELECT BUILD_FLEXTABLE_VIEW('darkdata', 'dd_view', 'dd_keys');
build_flextable_view
-----------------------------------------------
The view public.dd_view is ready for querying
(1 row)
Computes the virtual columns (keys and values) from flex table VMap data.
Computes the virtual columns (keys and values) from flex table VMap data. Use this function to compute keys without creating an associated table view. To also build a view, use COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Database and schema. The default schema is public. If you specify a database, it must be the current database.
flex-table
Name of the flex table.
Output
The function stores its results in a table named flex-table_keys. The table has the following columns:
Column
Description
KEY_NAME
The name of the virtual column (key). Keys larger than 65,000 bytes are truncated.
FREQUENCY
The number of times the key occurs in the VMap.
DATA_TYPE_GUESS
Estimate of the data type for the key based on the non-null values found in the VMap. The function determines the type of each non-string value, depending on the length of the key, and whether the key includes nested maps. If the EnableBetterFlexTypeGuessing configuration parameter is 0 (OFF), this function instead treats all flex table keys as string types ([LONG] VARCHAR or [LONG] VARBINARY).
COMPUTE_FLEXTABLE_KEYS sets the column width for keys to the length of the largest value for each key multiplied by the FlexTableDataTypeGuessMultiplier factor.
Examples
In the following example, JSON data with consistent fields has been loaded into a flex table. Had the data been more varied, you would see different numbers of occurrences in the keys table:
Combines the functionality of BUILD_FLEXTABLE_VIEW and COMPUTE_FLEXTABLE_KEYS to compute virtual columns (keys) from the VMap data of a flex table and construct a view.
Combines the functionality of BUILD_FLEXTABLE_VIEW and COMPUTE_FLEXTABLE_KEYS to compute virtual columns (keys) from the VMap data of a flex table and construct a view. Creating a view with this function ignores empty keys. If you do not need to perform both operations together, use one of the single-operation functions instead.
Note
If the length of a key exceeds 65,000, Vertica truncates the key.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
This example shows how to call the function for the darkdata flex table.
=> SELECT COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW('darkdata');
compute_flextable_keys_and_build_view
-----------------------------------------------------------------------
Please see public.darkdata_keys for updated keys
The view public.darkdata_view is ready for querying
(1 row)
Materializes virtual columns listed as key_names in the flextable_keys table you compute using either COMPUTE_FLEXTABLE_KEYS or COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW.
Each column that you materialize with this function counts against the data storage limit of your license. To check your Vertica license compliance, call the AUDIT() or AUDIT_FLEX() functions.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Database and schema. The default schema is public. If you specify a database, it must be the current database.
flex-table
The name of the flex table with columns to materialize. The function:
Skips any columns already materialized
Ignores any empty keys
n-columns
The number of columns to materialize, up to 9800. The function attempts to materialize the number of columns from the keys table, skipping any columns already materialized. It orders the materialized results by frequency, descending. If not specified, the default is a maximum of 50 columns.
keys-table-name
The name of a keys from which to materialize columns. The function:
Materializes n-columns columns from the keys table
Skips any columns already materialized
Orders the materialized results by frequency, descending
Examples
The following example shows how to call MATERIALIZE_FLEXTABLE_COLUMNS to materialize columns. First, load a sample file of tweets (tweets_10000.json) into the flex table twitter_r. After loading data and computing keys for the sample flex table, call MATERIALIZE_FLEXTABLE_COLUMNS to materialize the first four columns:
=> COPY twitter_r FROM '/home/release/KData/tweets_10000.json' parser fjsonparser();
Rows Loaded
-------------
10000
(1 row)
=> SELECT compute_flextable_keys ('twitter_r');
compute_flextable_keys
---------------------------------------------------
Please see public.twitter_r_keys for updated keys
(1 row)
=> SELECT MATERIALIZE_FLEXTABLE_COLUMNS('twitter_r', 4);
MATERIALIZE_FLEXTABLE_COLUMNS
-------------------------------------------------------------------------------
The following columns were added to the table public.twitter_r:
contributors
entities.hashtags
entities.urls
For more details, run the following query:
SELECT * FROM v_catalog.materialize_flextable_columns_results WHERE table_schema = 'public' and table_name = 'twitter_r';
(1 row)
The last message in the example recommends querying the MATERIALIZE_FLEXTABLE_COLUMNS_RESULTS system table for the results of materializing the columns, as shown:
=> SELECT * FROM v_catalog.materialize_flextable_columns_results WHERE table_schema = 'public' and table_name = 'twitter_r';
table_id | table_schema | table_name | creation_time | key_name | status | message
-------------------+--------------+------------+------------------------------+-------------------+--------+---------------------
45035996273733172 | public | twitter_r | 2013-11-20 17:00:27.945484-05| contributors | ADDED | Added successfully
45035996273733172 | public | twitter_r | 2013-11-20 17:00:27.94551-05 | entities.hashtags | ADDED | Added successfully
45035996273733172 | public | twitter_r | 2013-11-20 17:00:27.945519-05| entities.urls | ADDED | Added successfully
45035996273733172 | public | twitter_r | 2013-11-20 17:00:27.945532-05| created_at | EXISTS | Column of same name already
(4 rows)
Restores the keys table and the view. The function also links the keys table with its associated flex table, in cases where either table is dropped. The function also indicates whether it restored one or both objects.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
This example shows how to invoke this function with an existing flex table, restoring both the keys table and view:
=> SELECT RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW('darkdata');
RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW
----------------------------------------------------------------------------------
The keys table public.darkdata_keys was restored successfully.
The view public.darkdata_view was restored successfully.
(1 row)
This example illustrates that the function restored darkdata_view, but that darkdata_keys did not need restoring:
=> SELECT RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW('darkdata');
RESTORE_FLEXTABLE_DEFAULT_KEYS_TABLE_AND_VIEW
------------------------------------------------------------------------------------
The keys table public.darkdata_keys already exists and is linked to darkdata.
The view public.darkdata_view was restored successfully.
(1 row)
After restoring the keys table, there is no content. To populate the flex keys, call the COMPUTE_FLEXTABLE_KEYS function.
=> SELECT * FROM darkdata_keys;
key_name | frequency | data_type_guess
----------+-----------+-----------------
(0 rows)
The flex extractor scalar functions process polystructured data.
The flex extractor scalar functions process polystructured data. Each function accepts input data that is any of:
Existing database content
A table
Returned from an expression
Entered directly
These functions do not parse data from an external file source. All functions return a single VMap value. The extractor functions can return data with NULL-specified columns.
2.1 - MAPDELIMITEDEXTRACTOR
Extracts data with a delimiter character and other optional arguments, returning a single VMap value.
Extracts data with a delimiter character and other optional arguments, returning a single VMap value.
Syntax
MAPDELIMITEDEXTRACTOR (record-value [ USING PARAMETERS param=value[,...] ])
Arguments
record-value
String containing a JSON or delimited format record on which to apply the expression.
Parameters
delimiter
Single delimiter character.
Default:|
header_names
Delimiter-separated list of column header names.
Default:ucoln, where n is the column offset number, starting with 0 for the first column.
trim
Boolean, trim white space from header names and field values.
Default:true
treat_empty_val_as_null
Boolean, set empty fields to NULL rather than an empty string ('').
Create a columnar table, dtab, with an identity id column, a delim column, and a vmap column to hold a VMap:
=> CREATE TABLE dtab (id IDENTITY(1,1), delim varchar(128), vmap long varbinary(512));
CREATE TABLE
Use COPY to load the delim.dat file into the dtab table. MAPDELIMITEDEXTRACTOR uses the header_names parameter to specify a header row for the sample data, along with delimiter '!' :
=> COPY dtab(delim, vmap AS MAPDELIMITEDEXTRACTOR (delim
USING PARAMETERS header_names='Name|CITY|New City|State|Zip')) FROM '/home/dbadmin/data/delim.dat'
DELIMITER '!';
Rows Loaded
-------------
4
(1 row)
Use MAPTOSTRING for the flex table dflex to view the __raw__ column contents. Notice the default header names in use (ucol0 – ucol4), since you specified header='false' when you loaded the flex table:
Use MAPTOSTRING again, this time with the dtab table's vmap column. Compare the results of this output to those for the flex table. Note that MAPTOSTRING returns the header_name parameter values you specified when you loaded the data:
Extracts content of repeated JSON data objects,, including nested maps, or data with an outer list of JSON elements.
Extracts content of repeated JSON data objects,, including nested maps, or data with an outer list of JSON elements. You can set one or more optional parameters to control the extraction process.
Note
Empty input does not generate warnings or errors.
Syntax
MAPJSONEXTRACTOR (record-value [ USING PARAMETERS param=value[,...] ])
Arguments
record-value
String containing a JSON or delimited format record on which to apply the expression.
Parameters
flatten_maps
Boolean, flatten sub-maps within the JSON data, separating map levels with a period (.).
Default:true
flatten_arrays
Boolean, convert lists to sub-maps with integer keys. Lists are not flattened by default.
Default value:false
reject_on_duplicate
Boolean, ignore duplicate records (false), or reject duplicates (true). In either case, loading is unaffected.
Default:false
reject_on_empty_key
Boolean, reject any row that contains a key without a value.
Default:false
omit_empty_keys
Boolean, omit any key from the load data without a value.
Default:false
start_point
Name of a key in the JSON load data at which to begin parsing. The parser ignores all data before the start_point value. The parser processes data after the first instance, and up to the second, ignoring any remaining data.
Default: none
Examples
These examples use the following sample JSON data:
Extracts data with a regular expression and returns results as a VMap.
Extracts data with a regular expression and returns results as a VMap.
Syntax
MAPREGEXEXTRACTOR (record-value [ USING PARAMETERS param=value[,...] ])
Arguments
record-value
String containing a JSON or delimited format record on which to apply the regular expression.
Parameters
pattern
Regular expression used to extract the desired data.
Default: Empty string ('')
use_jit
Boolean, use just-in-time compiling when parsing the regular expression.
Default:false
record_terminator
Character used to separate input records.
Default:\n
logline_column
Destination column containing the full string that the regular expression matched.
Default: Empty string ('')
Examples
These examples use the following regular expression, which searches for information that includes the timestamp, date, thread_name, and thread_id strings.
Caution
For display purposes, this sample regular expression adds new line characters to split long lines of text. To use this expression in a query, first copy and edit the example to remove any new line characters.
This example expression loads any thread_id hex value, regardless of whether it has a 0x prefix, (<thread_id>(?:0x)?[0-9a-f]+).
The following examples may include newline characters for display purposes.
Create a flex table, flogs:
=> CREATE FLEX TABLE flogs();
CREATE TABLE
Use COPY to load a sample log file (vertica.log), using the flex table fregexparser. Note that this example includes added line characters for displaying long text lines.
The flex map functions let you extract and manipulate nested map data.
The flex map functions let you extract and manipulate nested map data.
The first argument of all flex map functions (except EMPTYMAP and MAPAGGREGATE) takes a VMap. The VMap can originate from the __raw__ column in a flex table or be returned from a map or extraction function.
All map functions (except EMPTYMAP and MAPAGGREGATE) accept either a LONG VARBINARY or a LONG VARCHAR map argument.
In the following example, the outer MAPLOOKUP function operates on the VMap data returned from the inner MAPLOOKUP function:
Constructs a new VMap with one row but without keys or data.
Constructs a new VMap with one row but without keys or data. Use this transform function to populate a map without using a flex parser. Instead, you use either from SQL queries or from map data present elsewhere in the database.
Table column with the keys for the key/value pairs of the returned VMap data. Keys with a NULL value are excluded. If there are duplicate keys, the duplicate key and value that appear first in the query result are used, while the other duplicates are omitted.
values-column
Table column with the values for the key/value pairs of the returned VMap data.
Parameters
max_vmap_length
Maximum length in bytes for the VMap result, an integer between 1-32000000 inclusive.
Default: 130000
on_overflow
Overflow behavior for cases when the VMap result is larger than the max_vmap_length. The value must be one of the following strings:
'ERROR': Returns an error when overflow occurs.
'TRUNCATE': Stops aggregating key/value pairs if the result exceeds max_vmap_length. The query executes, but the resulting VMap does not have all key/value pairs. When the provided max_vmap_length is not large enough to store an empty VMap, the result returned is NULL. Note that you need to specify order criteria in the OVER clause to get consistent results.
Call MAPAGGREGATE as follows to return the raw_map data of the resulting VMap:
=> SELECT raw_map FROM (SELECT MAPAGGREGATE(product, stock) OVER(ORDER BY product) FROM inventory) inventory;
raw_map
------------------------------------------------------------------------------------------------------------
\001\000\000\000\030\000\000\000\003\000\000\000\020\000\000\000\023\000\000\000\026\000\000\00020010050\003
\000\000\000\020\000\000\000\033\000\000\000!\000\000\000AutomobilesPlanesTrains
(1 row)
To transform the returned raw_map data into string representation, use MAPAGGREGATE with MAPTOSTRING:
=> SELECT MAPTOSTRING(raw_map) FROM (SELECT MAPAGGREGATE(product, stock) OVER(ORDER BY product) FROM
inventory) inventory;
MAPTOSTRING
--------------------------------------------------------------
{
"Automobiles": "200",
"Planes": "100",
"Trains": "50"
}
(1 row)
If you run the above query with on_overflow left as default and a max_vmap_length less than the returned VMap size, the function returns with an error message indicating the need to increase VMap length:
=> SELECT MAPTOSTRING(raw_map) FROM (SELECT MAPAGGREGATE(product, stock USING PARAMETERS max_vmap_length=60)
OVER(ORDER BY product) FROM inventory) inventory;
----------------------------------------------------------------------------------------------------------
ERROR 5861: Error calling processPartition() in User Function MapAggregate at [/data/jenkins/workspace
/RE-PrimaryBuilds/RE-Build-Master_2/server/udx/supported/flextable/Dict.cpp:1324], error code: 0, message:
Exception while finalizing map aggregation: Output VMap length is too small [60]. HINT: Set the parameter
max_vmap_length=71 and retry your query
Switching the value of on_overflow allows you to alter how MAPAGGREGATE behaves in the case of overflow. For example, changing on_overflow to 'RETURN_NULL' causes the above query to execute and return NULL:
SELECT raw_map IS NULL FROM (SELECT MAPAGGREGATE(product, stock USING PARAMETERS max_vmap_length=60,
on_overflow='RETURN_NULL') OVER(ORDER BY product) FROM inventory) inventory;
?column?
----------
t
(1 row)
If on_overflow is set to 'TRUNCATE', the resulting VMap has enough space for two of the key/value pairs, but must cut the third:
SELECT raw_map IS NULL FROM (SELECT MAPAGGREGATE(product, stock USING PARAMETERS max_vmap_length=60,
on_overflow='TRUNCATE') OVER(ORDER BY product) FROM inventory) inventory;
MAPTOSTRING
---------------------------------------------
{
"Automobiles": "200",
"Planes": "100"
}
(1 row)
Determines whether a VMap contains a virtual column (key).
Determines whether a VMap contains a virtual column (key). This scalar function returns true (t), if the virtual column exists, or false (f) if it does not. Determining that a key exists before calling maplookup() lets you distinguish between NULL returns. The maplookup() function uses for both a non-existent key and an existing key with a NULL value.
Syntax
MAPCONTAINSKEY (VMap-data, 'virtual-column-name')
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
virtual-column-name
Name of the key to check.
Examples
This example shows how to use the mapcontainskey() functions with maplookup(). View the results returned from both functions. Check whether the empty fields that maplookup() returns indicate a NULL value for the row (t) or no value (f):
You can use mapcontainskey( ) to determine that a key exists before calling maplookup(). The maplookup() function uses both NULL returns and existing keys with NULL values to indicate a non-existent key.
=> SELECT MAPLOOKUP(__raw__, 'user.location'), MAPCONTAINSKEY(__raw__, 'user.location')
FROM darkdata ORDER BY 1;
maplookup | mapcontainskey
-----------+----------------
| t
| t
| t
| t
Chile | t
Narnia | t
Uptown.. | t
chicago | t
| f
| f
| f
| f
(12 rows)
Data returned from a map function such as MAPLOOKUP
Other database content
virtual-column-value
Value to confirm.
Examples
This example shows how to use mapcontainsvalue() to determine whether or not a virtual column contains a particular value. Create a flex table (ftest), and populate it with some virtual columns and values. Name both virtual columns one:
=> CREATE FLEX TABLE ftest();
CREATE TABLE
=> copy ftest from stdin parser fjsonparser();
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {"one":1, "two":2}
>> {"one":"one","2":"2"}
>> \.
Call mapcontainsvalue() on the ftest map data. The query returns false (f) for the first virtual column, and true (t) for the second , which contains the value one:
=> SELECT MAPCONTAINSVALUE(__raw__, 'one') FROM ftest;
mapcontainsvalue
------------------
f
t
(2 rows)
Returns information about items in a VMap. Use this transform function with one or more optional arguments to access polystructured values within the VMap data. This function requires an over()` clause.
Syntax
MAPITEMS (VMap-data [, passthrough-arg[,...] ])
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
max_key_length
In a __raw__ column, determines the maximum length of keys that the function can return. Keys that are longer than max_key_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
max_value_length
In a __raw__ column, determines the maximum length of values the function can return. Values that are larger than max_value_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
passthrough-arg
One or more arguments indicating keys within the map data in VMap-data.
Examples
The following examples illustrate using MAPITEMS()with the over(PARTITION BEST) clause.
This example determines the number of virtual columns in the map data using a flex table, labeled darkmountain. Query using the count() function to return the number of virtual columns in the map data:
=> SELECT COUNT(keys) FROM (SELECT MAPITEMS(darkmountain.__raw__) OVER(PARTITION BEST) FROM
darkmountain) AS a;
count
-------
19
(1 row)
The next example determines what items exist in the map data:
=> SELECT * FROM (SELECT MAPITEMS(darkmountain.__raw__) OVER(PARTITION BEST) FROM darkmountain) AS a;
keys | values
-------------+---------------
hike_safety | 50.6
name | Mt Washington
type | mountain
height | 17000
hike_safety | 12.2
name | Denali
type | mountain
height | 29029
hike_safety | 34.1
name | Everest
type | mountain
height | 14000
hike_safety | 22.8
name | Kilimanjaro
type | mountain
height | 29029
hike_safety | 15.4
name | Mt St Helens
type | volcano
(19 rows)
The following example shows how to restrict the length of returned values to 100000:
=> SELECT LENGTH(keys), LENGTH(values) FROM (SELECT MAPITEMS(__raw__ USING PARAMETERS max_value_length=100000) OVER() FROM t1) x;
LENGTH | LENGTH
--------+--------
9 | 98899
(1 row)
Directly Query a Key Value in a VMap
Review the following JSON input file, simple.json. In particular, notice the array called three_Array, and its four values:
Call MAPKEYS on the flex table's __raw__ column to see the flex table's keys, but not the key submaps. The return values indicate three_Array as one of the virtual columns:
=> SELECT MAPKEYS(__raw__) OVER() FROM mapper;
keys
-------------
five_Map
four
one
six
three_Array
two
(6 rows)
Call mapitems on flex table mapper with three_Array as a pass-through argument to the function. The call returns these array values:
Returns the virtual columns (and values) present in any VMap data.
Returns the virtual columns (and values) present in any VMap data. This transform function requires an OVER(PARTITION BEST) clause.
Syntax
MAPKEYS (VMap-data)
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
max_key_length
In a __raw__ column, specifies the maximum length of keys that the function can return. Keys that are longer than max_key_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
Examples
Determine Number of Virtual Columns in Map Data
This example shows how to create a query, using an over(PARTITION BEST) clause with a flex table, darkdata to find the number of virtual column in the map data. The table is populated with JSON tweet data.
=> SELECT COUNT(keys) FROM (SELECT MAPKEYS(darkdata.__raw__) OVER(PARTITION BEST) FROM darkdata) AS a;
count
-------
550
(1 row)
Query Ordered List of All Virtual Columns in the Map
This example shows a snippet of the return data when you query an ordered list of all virtual columns in the map data:
=> SELECT * FROM (SELECT MAPKEYS(darkdata.__raw__) OVER(PARTITION BEST) FROM darkdata) AS a;
keys
-------------------------------------
contributors
coordinates
created_ at
delete.status.id
delete.status.id_str
delete.status.user_id
delete.status.user_id_str
entities.hashtags
entities.media
entities.urls
entities.user_mentions
favorited
geo
id
.
.
.
user.statuses_count
user.time_zone
user.url
user.utc_offset
user.verified
(125 rows)
Specify the Maximum Length of Keys that MAPKEYS Can Return
=> SELECT MAPKEYS(__raw__ USING PARAMETERS max_key_length=100000) OVER() FROM mapper;
keys
-------------
five_Map
four
one
six
three_Array
two
(6 rows)
Returns virtual column information from a given map.
Returns virtual column information from a given map. This transform function requires an OVER(PARTITION BEST) clause.
Syntax
MAPKEYSINFO (VMap-data)
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
max_key_length
In a __raw__ column, determines the maximum length of keys that the function can return. Keys that are longer than max_key_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
Returns
This function is a superset of the MAPKEYS() function. It returns the following information about each virtual column:
Column
Description
keys
The virtual column names in the raw data.
length
The data length of the key name, which can differ from the actual string length.
type_oid
The OID type into which the value should be converted. Currently, the type is always 116 for a LONG VARCHAR, or 199 for a nested map that is stored as a LONG VARBINARY.
row_num
The number of rows in which the key was found.
field_num
The field number in which the key exists.
Examples
This example shows a snippet of the return data you receive if you query an ordered list of all virtual columns in the map data:
Specify the Maximum Length of Keys that MAPKEYSINFO Can Return
=> SELECT MAPKEYSINFO(__raw__ USING PARAMETERS max_key_length=100000) OVER() FROM mapper;
keys
-------------
five_Map
four
one
six
three_Array
two
(6 rows)
Returns single-key values from VMAP data. This scalar function returns a LONG VARCHAR, with values, or NULL if the virtual column does not have a value.
Using maplookup is case insensitive to virtual column names. To avoid loading same-name values, set the fjsonparser parser reject_on_duplicate parameter to true when data loading.
You can control the behavior for non-scalar values in a VMAP (like arrays), when loading data with the fjsonparser or favroparser parsers and its flatten-arrays argument. See JSON data and the FJSONPARSER reference.
For information about using maplookup() to access nested JSON data, see Querying nested data.
Data returned from a map function such as MAPLOOKUP
Other database content
virtual-column-name
The name of the virtual column whose values this function returns.
buffer_size
[Optional parameter] Specifies the maximum length (in bytes) of each value returned for virtual-column-name. To return all values for virtual-column-name, specify a buffer_size equal to or greater than (=>) the number of bytes for any returned value. Any returned values greater in length than buffer_size are rejected.
Default:0 (No limit on buffer_size)
case_sensitive
[Optional parameter]
Specifies whether to return values for virtual-column-name if keys with different cases exist.
Example:
(... USING PARAMETERS case_sensitive=true)
Default:false
Examples
This example returns the values of one virtual column, user.location:
=> SELECT MAPLOOKUP(__raw__, 'user.location') FROM darkdata ORDER BY 1;
maplookup
-----------
Chile
Nesnia
Uptown
.
.
chicago
(12 rows)
Using maplookup buffer_size
Use the buffer_size= parameter to indicate the maximum length of any value that maplookup returns for the virtual column you specify. If none of the returned key values can be greater than n bytes, use this parameter to allocate n bytes as the buffer_size.
For the next example, save this JSON data to a file, simple_name.json:
Load the simple_name.json data into logs, using the fjsonparser. Specify the flatten_arrays option as True:
=> COPY logs FROM '/home/dbadmin/data/simple_name.json'
PARSER fjsonparser(flatten_arrays=True);
Use maplookup with buffer_size=0 for the logs table name key. This query returns all of the values:
=> SELECT MAPLOOKUP(__raw__, 'name' USING PARAMETERS buffer_size=0) FROM logs;
MapLookup
-----------
sierra
ben
janis
jen
(4 rows)
Next, call maplookup() three times, specifying the buffer_size parameter as 3, 5, and 6, respectively. Now, maplookup() returns values with a byte length less than or equal to (<=) buffer_size:
=> SELECT MAPLOOKUP(__raw__, 'name' USING PARAMETERS buffer_size=3) FROM logs;
MapLookup
-----------
ben
jen
(4 rows)
=> SELECT MAPLOOKUP(__raw__, 'name' USING PARAMETERS buffer_size=5) FROM logs;
MapLookup
-----------
janis
jen
ben
(4 rows)
=> SELECT MAPLOOKUP(__raw__, 'name' USING PARAMETERS buffer_size=6) FROM logs;
MapLookup
-----------
sierra
janis
jen
ben
(4 rows)
Disambiguate Empty Output Rows
This example shows how to interpret empty rows. Using maplookup without first checking whether a key exists can be ambiguous. When you review the following output, 12 empty rows, you cannot determine whether a user.location key has:
A non-NULL value
A NULL value
No value
=> SELECT MAPLOOKUP(__raw__, 'user.location') FROM darkdata;
maplookup
-----------
(12 rows)
To disambiguate empty output rows, use the mapcontainskey() function in conjunction with maplookup(). When maplookup returns an empty field, the corresponding value from mapcontainskey indicates t for a NULL or other value, or ffor no value.
The following example output using both functions lists rows with NULL or a name value as t, and rows with no value as f:
=> SELECT MAPLOOKUP(__raw__, 'user.location'), MAPCONTAINSKEY(__raw__, 'user.location')
FROM darkdata ORDER BY 1;
maplookup | mapcontainskey
-----------+----------------
| t
| t
| t
| t
Chile | t
Nesnia | t
Uptown | t
chicago | t
| f >>>>>>>>>>No value
| f >>>>>>>>>>No value
| f >>>>>>>>>>No value
| f >>>>>>>>>>No value
(12 rows)
Check for Case-Sensitive Virtual Columns
You can use maplookup() with the case_sensitive parameter to return results when key names with different cases exist.
Save the following sample content as a JSON file. This example saves the file as repeated_key_name.json:
Accepts a VMap and one or more key/value pairs and returns a new VMap with the key/value pairs added.
Accepts a VMap and one or more key/value pairs and returns a new VMap with the key/value pairs added. Keys must be set using the auxiliary function SetMapKeys(), and can only be constant strings. If the VMap has any of the new input keys, then the original values are replaced by the new ones.
Syntax
MAPPUT (VMap-data, value[,...] USING PARAMETERS keys=SetMapKeys('key'[,...])
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP.
Other database content
value[,...]
One or more values to add to the VMap specified in VMap-data.
Parameters
keys
The result of SetMapKeys(). SetMapKeys() takes one or more constant string arguments.
The following example shows how to create a flex table and use COPY to enter some basic JSON data. After creating a second flex table, insert the new VMap results from mapput(), with additional key/value pairs.
Create sample table:
=> CREATE FLEX TABLE vmapdata1();
CREATE TABLE
Load sample JSON data from STDIN:
=> COPY vmapdata1 FROM stdin parser fjsonparser();
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {"aaa": 1, "bbb": 2, "ccc": 3}
>> \.
Create another flex table and use the function to insert data into it: => CREATE FLEX TABLE vmapdata2(); => INSERT INTO vmapdata2 SELECT MAPPUT(__raw__, '7','8','9' using parameters keys=SetMapKeys('xxx','yyy','zzz')) from vmapdata1;
View the difference between the original and the new flex tables:
Recursively builds a string representation of VMap data, including nested JSON maps.
Recursively builds a string representation of VMap data, including nested JSON maps. Use this transform function to display the VMap contents in a LONG VARCHAR format. You can use MAPTOSTRING to see how map data is nested before querying virtual columns with MAPVALUES.
Syntax
MAPTOSTRING ( VMap-data [ USING PARAMETERS param=value ] )
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
Parameters
canonical_json
Boolean, whether to produce canonical JSON format, using the first instance of any duplicate keys in the map data. If false, the function returns duplicate keys and their values.
Default: true
Examples
The following example uses this table definition and sample data:
=> CREATE FLEX TABLE darkdata();
CREATE TABLE
=> COPY darkdata FROM stdin parser fjsonparser();
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {"aaa": 1, "aaa": 2, "AAA": 3, "bbb": "aaa\"bbb"}
>> \.
Calling MAPTOSTRING with the default value of canonical_json returns only the first instance of the duplicate key:
Returns a string representation of the top-level values from a VMap.
Returns a string representation of the top-level values from a VMap. This transform function requires an OVER() clause.
Syntax
MAPVALUES (VMap-data)
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
max_value_length
In a __raw__ column, specifies the maximum length of values the function can return. Values that are larger than max_value_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
Examples
The following example shows how to query a darkmountain flex table, using an over() clause (in this case, the over(PARTITION BEST) clause) with mapvalues().
=> SELECT * FROM (SELECT MAPVALUES(darkmountain.__raw__) OVER(PARTITION BEST) FROM darkmountain) AS a;
values
---------------
29029
34.1
Everest
mountain
29029
15.4
Mt St Helens
volcano
17000
12.2
Denali
mountain
14000
22.8
Kilimanjaro
mountain
50.6
Mt Washington
mountain
(19 rows)
Specify the Maximum Length of Values that MAPVALUES Can Return
=> SELECT MAPVALUES(__raw__ USING PARAMETERS max_value_length=100000) OVER() FROM mapper;
keys
-------------
five_Map
four
one
six
three_Array
two
(6 rows)