The flex map functions let you extract and manipulate nested map data.
The first argument of all flex map functions (except EMPTYMAP and MAPAGGREGATE) takes a VMap. The VMap can originate from the __raw__ column in a flex table or be returned from a map or extraction function.
All map functions (except EMPTYMAP and MAPAGGREGATE) accept either a LONG VARBINARY or a LONG VARCHAR map argument.
In the following example, the outer MAPLOOKUP function operates on the VMap data returned from the inner MAPLOOKUP function:
Constructs a new VMap with one row but without keys or data.
Constructs a new VMap with one row but without keys or data. Use this transform function to populate a map without using a flex parser. Instead, you use either from SQL queries or from map data present elsewhere in the database.
Table column with the keys for the key/value pairs of the returned VMap data. Keys with a NULL value are excluded. If there are duplicate keys, the duplicate key and value that appear first in the query result are used, while the other duplicates are omitted.
values-column
Table column with the values for the key/value pairs of the returned VMap data.
Parameters
max_vmap_length
Maximum length in bytes for the VMap result, an integer between 1-32000000 inclusive.
Default: 130000
on_overflow
Overflow behavior for cases when the VMap result is larger than the max_vmap_length. The value must be one of the following strings:
'ERROR': Returns an error when overflow occurs.
'TRUNCATE': Stops aggregating key/value pairs if the result exceeds max_vmap_length. The query executes, but the resulting VMap does not have all key/value pairs. When the provided max_vmap_length is not large enough to store an empty VMap, the result returned is NULL. Note that you need to specify order criteria in the OVER clause to get consistent results.
Call MAPAGGREGATE as follows to return the raw_map data of the resulting VMap:
=> SELECT raw_map FROM (SELECT MAPAGGREGATE(product, stock) OVER(ORDER BY product) FROM inventory) inventory;
raw_map
------------------------------------------------------------------------------------------------------------
\001\000\000\000\030\000\000\000\003\000\000\000\020\000\000\000\023\000\000\000\026\000\000\00020010050\003
\000\000\000\020\000\000\000\033\000\000\000!\000\000\000AutomobilesPlanesTrains
(1 row)
To transform the returned raw_map data into string representation, use MAPAGGREGATE with MAPTOSTRING:
=> SELECT MAPTOSTRING(raw_map) FROM (SELECT MAPAGGREGATE(product, stock) OVER(ORDER BY product) FROM
inventory) inventory;
MAPTOSTRING
--------------------------------------------------------------
{
"Automobiles": "200",
"Planes": "100",
"Trains": "50"
}
(1 row)
If you run the above query with on_overflow left as default and a max_vmap_length less than the returned VMap size, the function returns with an error message indicating the need to increase VMap length:
=> SELECT MAPTOSTRING(raw_map) FROM (SELECT MAPAGGREGATE(product, stock USING PARAMETERS max_vmap_length=60)
OVER(ORDER BY product) FROM inventory) inventory;
----------------------------------------------------------------------------------------------------------
ERROR 5861: Error calling processPartition() in User Function MapAggregate at [/data/jenkins/workspace
/RE-PrimaryBuilds/RE-Build-Master_2/server/udx/supported/flextable/Dict.cpp:1324], error code: 0, message:
Exception while finalizing map aggregation: Output VMap length is too small [60]. HINT: Set the parameter
max_vmap_length=71 and retry your query
Switching the value of on_overflow allows you to alter how MAPAGGREGATE behaves in the case of overflow. For example, changing on_overflow to 'RETURN_NULL' causes the above query to execute and return NULL:
SELECT raw_map IS NULL FROM (SELECT MAPAGGREGATE(product, stock USING PARAMETERS max_vmap_length=60,
on_overflow='RETURN_NULL') OVER(ORDER BY product) FROM inventory) inventory;
?column?
----------
t
(1 row)
If on_overflow is set to 'TRUNCATE', the resulting VMap has enough space for two of the key/value pairs, but must cut the third:
SELECT raw_map IS NULL FROM (SELECT MAPAGGREGATE(product, stock USING PARAMETERS max_vmap_length=60,
on_overflow='TRUNCATE') OVER(ORDER BY product) FROM inventory) inventory;
MAPTOSTRING
---------------------------------------------
{
"Automobiles": "200",
"Planes": "100"
}
(1 row)
Determines whether a VMap contains a virtual column (key).
Determines whether a VMap contains a virtual column (key). This scalar function returns true (t), if the virtual column exists, or false (f) if it does not. Determining that a key exists before calling maplookup() lets you distinguish between NULL returns. The maplookup() function uses for both a non-existent key and an existing key with a NULL value.
Syntax
MAPCONTAINSKEY (VMap-data, 'virtual-column-name')
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
virtual-column-name
Name of the key to check.
Examples
This example shows how to use the mapcontainskey() functions with maplookup(). View the results returned from both functions. Check whether the empty fields that maplookup() returns indicate a NULL value for the row (t) or no value (f):
You can use mapcontainskey( ) to determine that a key exists before calling maplookup(). The maplookup() function uses both NULL returns and existing keys with NULL values to indicate a non-existent key.
=> SELECT MAPLOOKUP(__raw__, 'user.location'), MAPCONTAINSKEY(__raw__, 'user.location')
FROM darkdata ORDER BY 1;
maplookup | mapcontainskey
-----------+----------------
| t
| t
| t
| t
Chile | t
Narnia | t
Uptown.. | t
chicago | t
| f
| f
| f
| f
(12 rows)
Data returned from a map function such as MAPLOOKUP
Other database content
virtual-column-value
Value to confirm.
Examples
This example shows how to use mapcontainsvalue() to determine whether or not a virtual column contains a particular value. Create a flex table (ftest), and populate it with some virtual columns and values. Name both virtual columns one:
=> CREATE FLEX TABLE ftest();
CREATE TABLE
=> copy ftest from stdin parser fjsonparser();
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {"one":1, "two":2}
>> {"one":"one","2":"2"}
>> \.
Call mapcontainsvalue() on the ftest map data. The query returns false (f) for the first virtual column, and true (t) for the second , which contains the value one:
=> SELECT MAPCONTAINSVALUE(__raw__, 'one') FROM ftest;
mapcontainsvalue
------------------
f
t
(2 rows)
Returns information about items in a VMap. Use this transform function with one or more optional arguments to access polystructured values within the VMap data. This function requires an over()` clause.
Syntax
MAPITEMS (VMap-data [, passthrough-arg[,...] ])
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
max_key_length
In a __raw__ column, determines the maximum length of keys that the function can return. Keys that are longer than max_key_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
max_value_length
In a __raw__ column, determines the maximum length of values the function can return. Values that are larger than max_value_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
passthrough-arg
One or more arguments indicating keys within the map data in VMap-data.
Examples
The following examples illustrate using MAPITEMS()with the over(PARTITION BEST) clause.
This example determines the number of virtual columns in the map data using a flex table, labeled darkmountain. Query using the count() function to return the number of virtual columns in the map data:
=> SELECT COUNT(keys) FROM (SELECT MAPITEMS(darkmountain.__raw__) OVER(PARTITION BEST) FROM
darkmountain) AS a;
count
-------
19
(1 row)
The next example determines what items exist in the map data:
=> SELECT * FROM (SELECT MAPITEMS(darkmountain.__raw__) OVER(PARTITION BEST) FROM darkmountain) AS a;
keys | values
-------------+---------------
hike_safety | 50.6
name | Mt Washington
type | mountain
height | 17000
hike_safety | 12.2
name | Denali
type | mountain
height | 29029
hike_safety | 34.1
name | Everest
type | mountain
height | 14000
hike_safety | 22.8
name | Kilimanjaro
type | mountain
height | 29029
hike_safety | 15.4
name | Mt St Helens
type | volcano
(19 rows)
The following example shows how to restrict the length of returned values to 100000:
=> SELECT LENGTH(keys), LENGTH(values) FROM (SELECT MAPITEMS(__raw__ USING PARAMETERS max_value_length=100000) OVER() FROM t1) x;
LENGTH | LENGTH
--------+--------
9 | 98899
(1 row)
Directly Query a Key Value in a VMap
Review the following JSON input file, simple.json. In particular, notice the array called three_Array, and its four values:
Call MAPKEYS on the flex table's __raw__ column to see the flex table's keys, but not the key submaps. The return values indicate three_Array as one of the virtual columns:
=> SELECT MAPKEYS(__raw__) OVER() FROM mapper;
keys
-------------
five_Map
four
one
six
three_Array
two
(6 rows)
Call mapitems on flex table mapper with three_Array as a pass-through argument to the function. The call returns these array values:
Returns the virtual columns (and values) present in any VMap data.
Returns the virtual columns (and values) present in any VMap data. This transform function requires an OVER(PARTITION BEST) clause.
Syntax
MAPKEYS (VMap-data)
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
max_key_length
In a __raw__ column, specifies the maximum length of keys that the function can return. Keys that are longer than max_key_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
Examples
Determine Number of Virtual Columns in Map Data
This example shows how to create a query, using an over(PARTITION BEST) clause with a flex table, darkdata to find the number of virtual column in the map data. The table is populated with JSON tweet data.
=> SELECT COUNT(keys) FROM (SELECT MAPKEYS(darkdata.__raw__) OVER(PARTITION BEST) FROM darkdata) AS a;
count
-------
550
(1 row)
Query Ordered List of All Virtual Columns in the Map
This example shows a snippet of the return data when you query an ordered list of all virtual columns in the map data:
=> SELECT * FROM (SELECT MAPKEYS(darkdata.__raw__) OVER(PARTITION BEST) FROM darkdata) AS a;
keys
-------------------------------------
contributors
coordinates
created_ at
delete.status.id
delete.status.id_str
delete.status.user_id
delete.status.user_id_str
entities.hashtags
entities.media
entities.urls
entities.user_mentions
favorited
geo
id
.
.
.
user.statuses_count
user.time_zone
user.url
user.utc_offset
user.verified
(125 rows)
Specify the Maximum Length of Keys that MAPKEYS Can Return
=> SELECT MAPKEYS(__raw__ USING PARAMETERS max_key_length=100000) OVER() FROM mapper;
keys
-------------
five_Map
four
one
six
three_Array
two
(6 rows)
Returns virtual column information from a given map.
Returns virtual column information from a given map. This transform function requires an OVER(PARTITION BEST) clause.
Syntax
MAPKEYSINFO (VMap-data)
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
max_key_length
In a __raw__ column, determines the maximum length of keys that the function can return. Keys that are longer than max_key_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
Returns
This function is a superset of the MAPKEYS() function. It returns the following information about each virtual column:
Column
Description
keys
The virtual column names in the raw data.
length
The data length of the key name, which can differ from the actual string length.
type_oid
The OID type into which the value should be converted. Currently, the type is always 116 for a LONG VARCHAR, or 199 for a nested map that is stored as a LONG VARBINARY.
row_num
The number of rows in which the key was found.
field_num
The field number in which the key exists.
Examples
This example shows a snippet of the return data you receive if you query an ordered list of all virtual columns in the map data:
Specify the Maximum Length of Keys that MAPKEYSINFO Can Return
=> SELECT MAPKEYSINFO(__raw__ USING PARAMETERS max_key_length=100000) OVER() FROM mapper;
keys
-------------
five_Map
four
one
six
three_Array
two
(6 rows)
Returns single-key values from VMAP data. This scalar function returns a LONG VARCHAR, with values, or NULL if the virtual column does not have a value.
Using maplookup is case insensitive to virtual column names. To avoid loading same-name values, set the fjsonparser parser reject_on_duplicate parameter to true when data loading.
You can control the behavior for non-scalar values in a VMAP (like arrays), when loading data with the fjsonparser or favroparser parsers and its flatten-arrays argument. See JSON data and the FJSONPARSER reference.
For information about using maplookup() to access nested JSON data, see Querying nested data.
Data returned from a map function such as MAPLOOKUP
Other database content
virtual-column-name
The name of the virtual column whose values this function returns.
buffer_size
[Optional parameter] Specifies the maximum length (in bytes) of each value returned for virtual-column-name. To return all values for virtual-column-name, specify a buffer_size equal to or greater than (=>) the number of bytes for any returned value. Any returned values greater in length than buffer_size are rejected.
Default:0 (No limit on buffer_size)
case_sensitive
[Optional parameter]
Specifies whether to return values for virtual-column-name if keys with different cases exist.
Example:
(... USING PARAMETERS case_sensitive=true)
Default:false
Examples
This example returns the values of one virtual column, user.location:
=> SELECT MAPLOOKUP(__raw__, 'user.location') FROM darkdata ORDER BY 1;
maplookup
-----------
Chile
Nesnia
Uptown
.
.
chicago
(12 rows)
Using maplookup buffer_size
Use the buffer_size= parameter to indicate the maximum length of any value that maplookup returns for the virtual column you specify. If none of the returned key values can be greater than n bytes, use this parameter to allocate n bytes as the buffer_size.
For the next example, save this JSON data to a file, simple_name.json:
Load the simple_name.json data into logs, using the fjsonparser. Specify the flatten_arrays option as True:
=> COPY logs FROM '/home/dbadmin/data/simple_name.json'
PARSER fjsonparser(flatten_arrays=True);
Use maplookup with buffer_size=0 for the logs table name key. This query returns all of the values:
=> SELECT MAPLOOKUP(__raw__, 'name' USING PARAMETERS buffer_size=0) FROM logs;
MapLookup
-----------
sierra
ben
janis
jen
(4 rows)
Next, call maplookup() three times, specifying the buffer_size parameter as 3, 5, and 6, respectively. Now, maplookup() returns values with a byte length less than or equal to (<=) buffer_size:
=> SELECT MAPLOOKUP(__raw__, 'name' USING PARAMETERS buffer_size=3) FROM logs;
MapLookup
-----------
ben
jen
(4 rows)
=> SELECT MAPLOOKUP(__raw__, 'name' USING PARAMETERS buffer_size=5) FROM logs;
MapLookup
-----------
janis
jen
ben
(4 rows)
=> SELECT MAPLOOKUP(__raw__, 'name' USING PARAMETERS buffer_size=6) FROM logs;
MapLookup
-----------
sierra
janis
jen
ben
(4 rows)
Disambiguate Empty Output Rows
This example shows how to interpret empty rows. Using maplookup without first checking whether a key exists can be ambiguous. When you review the following output, 12 empty rows, you cannot determine whether a user.location key has:
A non-NULL value
A NULL value
No value
=> SELECT MAPLOOKUP(__raw__, 'user.location') FROM darkdata;
maplookup
-----------
(12 rows)
To disambiguate empty output rows, use the mapcontainskey() function in conjunction with maplookup(). When maplookup returns an empty field, the corresponding value from mapcontainskey indicates t for a NULL or other value, or ffor no value.
The following example output using both functions lists rows with NULL or a name value as t, and rows with no value as f:
=> SELECT MAPLOOKUP(__raw__, 'user.location'), MAPCONTAINSKEY(__raw__, 'user.location')
FROM darkdata ORDER BY 1;
maplookup | mapcontainskey
-----------+----------------
| t
| t
| t
| t
Chile | t
Nesnia | t
Uptown | t
chicago | t
| f >>>>>>>>>>No value
| f >>>>>>>>>>No value
| f >>>>>>>>>>No value
| f >>>>>>>>>>No value
(12 rows)
Check for Case-Sensitive Virtual Columns
You can use maplookup() with the case_sensitive parameter to return results when key names with different cases exist.
Save the following sample content as a JSON file. This example saves the file as repeated_key_name.json:
Accepts a VMap and one or more key/value pairs and returns a new VMap with the key/value pairs added.
Accepts a VMap and one or more key/value pairs and returns a new VMap with the key/value pairs added. Keys must be set using the auxiliary function SetMapKeys(), and can only be constant strings. If the VMap has any of the new input keys, then the original values are replaced by the new ones.
Syntax
MAPPUT (VMap-data, value[,...] USING PARAMETERS keys=SetMapKeys('key'[,...])
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP.
Other database content
value[,...]
One or more values to add to the VMap specified in VMap-data.
Parameters
keys
The result of SetMapKeys(). SetMapKeys() takes one or more constant string arguments.
The following example shows how to create a flex table and use COPY to enter some basic JSON data. After creating a second flex table, insert the new VMap results from mapput(), with additional key/value pairs.
Create sample table:
=> CREATE FLEX TABLE vmapdata1();
CREATE TABLE
Load sample JSON data from STDIN:
=> COPY vmapdata1 FROM stdin parser fjsonparser();
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {"aaa": 1, "bbb": 2, "ccc": 3}
>> \.
Create another flex table and use the function to insert data into it: => CREATE FLEX TABLE vmapdata2(); => INSERT INTO vmapdata2 SELECT MAPPUT(__raw__, '7','8','9' using parameters keys=SetMapKeys('xxx','yyy','zzz')) from vmapdata1;
View the difference between the original and the new flex tables:
Recursively builds a string representation of VMap data, including nested JSON maps.
Recursively builds a string representation of VMap data, including nested JSON maps. Use this transform function to display the VMap contents in a LONG VARCHAR format. You can use MAPTOSTRING to see how map data is nested before querying virtual columns with MAPVALUES.
Syntax
MAPTOSTRING ( VMap-data [ USING PARAMETERS param=value ] )
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
Parameters
canonical_json
Boolean, whether to produce canonical JSON format, using the first instance of any duplicate keys in the map data. If false, the function returns duplicate keys and their values.
Default: true
Examples
The following example uses this table definition and sample data:
=> CREATE FLEX TABLE darkdata();
CREATE TABLE
=> COPY darkdata FROM stdin parser fjsonparser();
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {"aaa": 1, "aaa": 2, "AAA": 3, "bbb": "aaa\"bbb"}
>> \.
Calling MAPTOSTRING with the default value of canonical_json returns only the first instance of the duplicate key:
Returns a string representation of the top-level values from a VMap.
Returns a string representation of the top-level values from a VMap. This transform function requires an OVER() clause.
Syntax
MAPVALUES (VMap-data)
Arguments
VMap-data
Any VMap data. The VMap can exist as:
The __raw__ column of a flex table
Data returned from a map function such as MAPLOOKUP
Other database content
max_value_length
In a __raw__ column, specifies the maximum length of values the function can return. Values that are larger than max_value_length cause the query to fail. Defaults to the smaller of VMap column length and 65K.
Examples
The following example shows how to query a darkmountain flex table, using an over() clause (in this case, the over(PARTITION BEST) clause) with mapvalues().
=> SELECT * FROM (SELECT MAPVALUES(darkmountain.__raw__) OVER(PARTITION BEST) FROM darkmountain) AS a;
values
---------------
29029
34.1
Everest
mountain
29029
15.4
Mt St Helens
volcano
17000
12.2
Denali
mountain
14000
22.8
Kilimanjaro
mountain
50.6
Mt Washington
mountain
(19 rows)
Specify the Maximum Length of Values that MAPVALUES Can Return
=> SELECT MAPVALUES(__raw__ USING PARAMETERS max_value_length=100000) OVER() FROM mapper;
keys
-------------
five_Map
four
one
six
three_Array
two
(6 rows)