This is the multi-page printable view of this section.
Click here to print.
Return to the regular view of this page.
Collection functions
The functions in this section apply to collection types (arrays and sets).
The functions in this section apply to collection types (arrays and sets).
Some functions apply aggregation operations (such as sum) to collections. These function names all begin with APPLY.
Other functions in this section operate specifically on arrays or sets, as indicated on the individual reference pages. Array functions operate on both native array values and array values in external tables.
Notes
-
Arrays are 0-indexed. The first element's ordinal position in 0, second is 1, and so on. Indexes are not meaningful for sets.
-
Unless otherwise stated, functions operate on one-dimensional (1D) collections only. To use multidimensional arrays, you must first dereference to a 1D array type. Sets can only be one-dimensional.
1 - APPLY_AVG
Returns the average of all elements in a with numeric values.
Returns the average of all elements in a collection (array or set) with numeric values.
Behavior type
Immutable
Syntax
APPLY_AVG(collection)
Arguments
collection
- Target collection
Null-handling
The following cases return NULL:
-
if the input collection is NULL
-
if the input collection contains only null values
-
if the input collection is empty
If the input collection contains a mix of null and non-null elements, only the non-null values are considered in the calculation of the average.
Examples
=> SELECT apply_avg(ARRAY[1,2.4,5,6]);
apply_avg
-----------
3.6
(1 row)
See also
2 - APPLY_COUNT (ARRAY_COUNT)
Returns the total number of non-null elements in a.
Returns the total number of non-null elements in a collection (array or set). To count all elements including nulls, use APPLY_COUNT_ELEMENTS (ARRAY_LENGTH).
Behavior type
Immutable
Syntax
APPLY_COUNT(collection)
ARRAY_COUNT is a synonym of APPLY_COUNT.
Arguments
collection
- Target collection
Null-handling
Null values are not included in the count.
Examples
The array in this example contains six elements, one of which is null:
=> SELECT apply_count(ARRAY[1,NULL,3,7,8,5]);
apply_count
-------------
5
(1 row)
3 - APPLY_COUNT_ELEMENTS (ARRAY_LENGTH)
Returns the total number of elements in a , including NULLs.
Returns the total number of elements in a collection (array or set), including NULLs. To count only non-null values, use APPLY_COUNT (ARRAY_COUNT).
Behavior type
Immutable
Syntax
APPLY_COUNT_ELEMENTS(collection)
ARRAY_LENGTH is a synonym of APPLY_COUNT_ELEMENTS.
Arguments
collection
- Target collection
Null-handling
This function counts all members, including nulls.
An empty collection (ARRAY[]
or SET[]
) has a length of 0. A collection containing a single null (ARRAY[null]
or SET[null]
) has a length of 1.
Examples
The following array has six elements including one null:
=> SELECT apply_count_elements(ARRAY[1,NULL,3,7,8,5]);
apply_count_elements
---------------------
6
(1 row)
As the previous example shows, a null element is an element. Thus, an array containing only a null element has one element:
=> SELECT apply_count_elements(ARRAY[null]);
apply_count_elements
---------------------
1
(1 row)
A set does not contain duplicates. If you construct a set and pass it directly to this function, the result could differ from the number of inputs:
=> SELECT apply_count_elements(SET[1,1,3]);
apply_count_elements
---------------------
2
(1 row)
4 - APPLY_MAX
Returns the largest non-null element in a.
Returns the largest non-null element in a collection (array or set). This function is similar to the MAX [aggregate] function; APPLY_MAX operates on elements of a collection and MAX operates on an expression such as a column selection.
Behavior type
Immutable
Syntax
APPLY_MAX(collection)
Arguments
collection
- Target collection
Null-handling
This function ignores null elements. If all elements are null or the collection is empty, this function returns null.
Examples
=> SELECT apply_max(ARRAY[1,3.4,15]);
apply_max
-----------
15.0
(1 row)
5 - APPLY_MIN
Returns the smallest non-null element in a.
Returns the smallest non-null element in a collection (array or set). This function is similar to the MIN [aggregate] function; APPLY_MIN operates on elements of a collection and MIN operates on an expression such as a column selection.
Behavior type
Immutable
Syntax
APPLY_MIN(collection)
Arguments
collection
- Target collection
Null-handling
This function ignores null elements. If all elements are null or the collection is empty, this function returns null.
Examples
=> SELECT apply_min(ARRAY[1,3.4,15]);
apply_min
-----------
1.0
(1 row)
6 - APPLY_SUM
Computes the sum of all elements in a of numeric values (INTEGER, FLOAT, NUMERIC, or INTERVAL).
Computes the sum of all elements in a collection (array or set) of numeric values (INTEGER, FLOAT, NUMERIC, or INTERVAL).
Behavior type
Immutable
Syntax
APPLY_SUM(collection)
Arguments
collection
- Target collection
Null-handling
The following cases return NULL:
-
if the input collection is NULL
-
if the input collection contains only null values
-
if the input collection is empty
Examples
=> SELECT apply_sum(ARRAY[12.5,3,4,1]);
apply_sum
-----------
20.5
(1 row)
See also
7 - ARRAY_CAT
Concatenates two arrays of the same element type and dimensionality.
Concatenates two arrays of the same element type and dimensionality. For example, ROW elements must have the same fields.
If the inputs are both bounded, the bound for the result is the sum of the bounds of the inputs.
If any input is unbounded, the result is unbounded with a binary size that is the sum of the sizes of the inputs.
Behavior type
Immutable
Syntax
ARRAY_CAT(array1,array2)
Arguments
array1
, array2
- Arrays of matching dimensionality and element type
Null-handling
If either input is NULL, the function returns NULL.
Examples
Types are coerced if necessary, as shown in the second example.
=> SELECT array_cat(ARRAY[1,2], ARRAY[3,4,5]);
array_cat
-----------------------
[1,2,3,4,5]
(1 row)
=> SELECT array_cat(ARRAY[1,2], ARRAY[3,4,5.0]);
array_cat
-----------------------
["1.0","2.0","3.0","4.0","5.0"]
(1 row)
8 - ARRAY_CONTAINS
Returns true if the specified element is found in the array and false if not.
Returns true if the specified element is found in the array and false if not. Both arguments must be non-null, but the array may be empty.
Deprecated
This function has been renamed to
CONTAINS.
9 - ARRAY_DIMS
Returns the dimensionality of the input array.
Returns the dimensionality of the input array.
Behavior type
Immutable
Syntax
ARRAY_DIMS(array)
Arguments
array
- Target array
Examples
=> SELECT array_dims(ARRAY[[1,2],[2,3]]);
array_dims
------------
2
(1 row)
10 - ARRAY_FIND
Returns the ordinal position of a specified element in an array, or -1 if not found.
Returns the ordinal position of a specified element in an array, or -1 if not found. This function uses null-safe equality checks when testing elements.
Behavior type
Immutable
Syntax
ARRAY_FIND(array, { value | lambda-expression })
Arguments
array
- Target array.
value
- Value to search for; type must match or be coercible to the element type of the array.
lambda-expression
Lambda function to apply to each element. The function must return a Boolean value. The first argument to the function is the element, and the optional second element is the index of the element.
Examples
The function returns the first occurrence of the specified element. However, nothing ensures that value is unique in the array:
=> SELECT array_find(ARRAY[1,2,7,5,7],7);
array_find
------------
2
(1 row)
The function returns -1 if the specified element is not found:
=> SELECT array_find(ARRAY[1,3,5,7],4);
array_find
------------
-1
(1 row)
You can search for complex element types:
=> SELECT ARRAY_FIND(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
ARRAY[1,2,3]);
ARRAY_FIND
------------
0
(1 row)
=> SELECT ARRAY_FIND(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
ARRAY[1,null,4]);
ARRAY_FIND
------------
1
(1 row)
The second example, comparing arrays with null elements, finds a match because ARRAY_FIND uses a null-safe equality check when evaluating elements.
Lambdas
Consider a table of departments where each department has an array of ROW elements representing employees. The following example searches for a specific employee name in those records. The results show that Alice works (or has worked) for two departments:
=> SELECT deptID, ARRAY_FIND(employees, e -> e.name = 'Alice Adams') AS 'has_alice'
FROM departments;
deptID | has_alice
--------+-----------
1 | 0
2 | -1
3 | 0
(3 rows)
In the following example, each person in the table has an array of email addresses, and the function locates fake addresses. The function takes one argument, the array element to test, and calls a regular-expression function that returns a Boolean:
=> SELECT name, ARRAY_FIND(email, e -> REGEXP_LIKE(e,'example.com','i'))
AS 'example.com'
FROM people;
name | example.com
----------------+-------------
Elaine Jackson | -1
Frank Adams | 0
Lee Jones | -1
M Smith | 0
(4 rows)
See also
11 - CONTAINS
Returns true if the specified element is found in the collection and false if not.
Returns true if the specified element is found in the collection and false if not. This function uses null-safe equality checks when testing elements.
Behavior type
Immutable
Syntax
CONTAINS(collection, { value | lambda-expression })
Arguments
collection
- Target collection (ARRAY or SET).
value
- Value to search for; type must match or be coercible to the element type of the collection.
lambda-expression
Lambda function to apply to each element. The function must return a Boolean value. The first argument to the function is the element, and the optional second element is the index of the element.
Examples
=> SELECT CONTAINS(SET[1,2,3,4],2);
contains
----------
t
(1 row)
You can search for NULL as an element value:
=> SELECT CONTAINS(ARRAY[1,null,2],null);
contains
----------
t
(1 row)
You can search for complex element types:
=> SELECT CONTAINS(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
ARRAY[1,2,3]);
CONTAINS
----------
t
(1 row)
=> SELECT CONTAINS(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
ARRAY[1,null,4]);
CONTAINS
----------
t
(1 row)
The second example, comparing arrays with null elements, returns true because CONTAINS uses a null-safe equality check when evaluating elements.
In the following example, the orders table has the following definition:
=> CREATE EXTERNAL TABLE orders(
orderid int,
accountid int,
shipments Array[
ROW(
shipid int,
address ROW(
street varchar,
city varchar,
zip int
),
shipdate date
)
]
) AS COPY FROM '...' PARQUET;
The following query tests for a specific order. When passing a ROW literal as the second argument, cast any ambiguous fields to ensure type matches:
=> SELECT CONTAINS(shipments,
ROW(1,ROW('911 San Marcos St'::VARCHAR,
'Austin'::VARCHAR, 73344),
'2020-11-05'::DATE))
FROM orders;
CONTAINS
----------
t
f
f
(3 rows)
Lambdas
Consider a table of departments where each department has an array of ROW elements representing employees. The following query finds departments with early hires (low employee IDs):
=> SELECT deptID FROM departments
WHERE CONTAINS(employees, e -> e.id < 20);
deptID
--------
1
3
(2 rows)
In the following example, a schedules table includes an array of events, where each event is a ROW with several fields:
=> CREATE TABLE schedules
(guest VARCHAR,
events ARRAY[ROW(e_date DATE, e_name VARCHAR, price NUMERIC(8,2))]);
You can use the CONTAINS function with a lambda expression to find people who have more than one event on the same day. The second argument, idx
, is the index of the current element:
=> SELECT guest FROM schedules
WHERE CONTAINS(events, (e, idx) ->
(idx < ARRAY_LENGTH(events) - 1)
AND (e.e_date = events[idx + 1].e_date));
guest
-------------
Alice Adams
(1 row)
See also
12 - EXPLODE
Expands the elements of one or more collection columns (ARRAY or SET) into individual table rows, one row per element.
Expands the elements of one or more collection columns (ARRAY or SET) into individual table rows, one row per element. For each exploded collection, the results include two columns, one for the element index, and one for the value at that position. If the function explodes a single collection, these columns are named position
and value
by default. If the function explodes two or more collections, the columns for each collection are named pos_
column-name
and val_
column-name
. You can use an AS clause in the SELECT to change these column names.
EXPLODE is similar to UNNEST, which returns values but not positions.
By default, EXPLODE requires an OVER clause. If you set the skip_partitioning
parameter to true, an OVER clause is not required and is ignored if present.
Behavior type
Immutable
Syntax
EXPLODE (column[,...] [USING PARAMETERS param=value])
[ OVER ( [window-partition-clause] ) ]
Arguments
column
- Column in the table being queried. You must specify at least as many collection columns as the value of the
explode_count
parameter. Columns that are not collections are passed through without modification.
Passthrough columns are not needed if skip_partitioning
is true.
OVER(...)
- How to partition and sort input data. The input data is the result set that the query returns after it evaluates FROM, WHERE, GROUP BY, and HAVING clauses. For EXPLODE, use OVER() or OVER(PARTITION BEST).
This clause is ignored if skip_partitioning
is true.
Parameters
explode_count
- The number of collection columns to explode. The function checks each column, up to this value, and either explodes it if is a collection or passes it through if it is not a collection or if this limit has been reached. If the value of
explode_count
is greater than the number of collection columns specified, the function returns an error.
Default: 1
skip_partitioning
- Whether to skip partitioning and ignore the OVER clause if present. EXPLODE translates a single row of input into multiple rows of output, one per collection element. There is, therefore, usually no benefit to partitioning the input first. Skipping partitioning can help a query avoid an expensive sort or merge operation. Even so, setting this parameter can negatively affect performance in rare cases.
Default: false
Null-handling
This function expands each element in a collection into a row, including null elements. If the input column is NULL or an empty collection, the function produces no rows for that column:
=> SELECT EXPLODE(ARRAY[1,2,null,4]) OVER();
position | value
----------+-------
0 | 1
1 | 2
2 |
3 | 4
(4 rows)
=> SELECT EXPLODE(ARRAY[]::ARRAY[INT]) OVER();
position | value
----------+-------
(0 rows)
=> SELECT EXPLODE(NULL::ARRAY[INT]) OVER();
position | value
----------+-------
(0 rows)
Joining on results
To use JOIN with this function you must set the skip_partitioning
parameter, either in the function call or as a session parameter.
You can use the output of this function as if it were a relation by using CROSS JOIN or LEFT JOIN LATERAL in a query. Other JOIN types are not supported.
Consider the following table of students and exam scores:
=> SELECT * FROM tests;
student | scores | questions
---------+---------------+-----------------
Bob | [92,78,79] | [20,20,100]
Lee | |
Pat | [] | []
Sam | [97,98,85] | [20,20,100]
Tom | [68,75,82,91] | [20,20,100,100]
(5 rows)
The following query finds the best test scores across all students who have scores:
=> ALTER SESSION SET UDPARAMETER FOR ComplexTypesLib skip_partitioning = true;
=> SELECT student, score FROM tests
CROSS JOIN EXPLODE(scores) AS t (pos, score)
ORDER BY score DESC;
student | score
---------+-------
Sam | 98
Sam | 97
Bob | 92
Tom | 91
Sam | 85
Tom | 82
Bob | 79
Bob | 78
Tom | 75
Tom | 68
(10 rows)
The following query returns maximum and average per-question scores, considering both the exam score and the number of questions:
=> SELECT student, MAX(score/qcount), AVG(score/qcount) FROM tests
CROSS JOIN EXPLODE(scores, questions USING PARAMETERS explode_count=2)
AS t(pos_s, score, pos_q, qcount)
GROUP BY student;
student | MAX | AVG
---------+----------------------+------------------
Bob | 4.600000000000000000 | 3.04333333333333
Sam | 4.900000000000000000 | 3.42222222222222
Tom | 4.550000000000000000 | 2.37
(3 rows)
These queries produce results for three of the five students. One student has a null value for scores and another has an empty array. These rows are not included in the function's output.
To include null and empty arrays in output, use LEFT JOIN LATERAL instead of CROSS JOIN:
=> SELECT student, MIN(score), AVG(score) FROM tests
LEFT JOIN LATERAL EXPLODE(scores) AS t (pos, score)
GROUP BY student;
student | MIN | AVG
---------+-----+------------------
Bob | 78 | 83
Lee | |
Pat | |
Sam | 85 | 93.3333333333333
Tom | 68 | 79
(5 rows)
The LATERAL keyword is required with LEFT JOIN. It is optional for CROSS JOIN.
Examples
Consider an orders table with the following contents:
=> SELECT orderkey, custkey, prodkey, orderprices, email_addrs
FROM orders LIMIT 5;
orderkey | custkey | prodkey | orderprices | email_addrs
------------+---------+-----------------------------------------------+-----------------------------------+----------------------------------------------------------------------------------------------------------------
113-341987 | 342799 | ["MG-7190 ","VA-4028 ","EH-1247 ","MS-7018 "] | ["60.00","67.00","22.00","14.99"] | ["bob@example,com","robert.jones@example.com"]
111-952000 | 342845 | ["ID-2586 ","IC-9010 ","MH-2401 ","JC-1905 "] | ["22.00","35.00",null,"12.00"] | ["br92@cs.example.edu"]
111-345634 | 342536 | ["RS-0731 ","SJ-2021 "] | ["50.00",null] | [null]
113-965086 | 342176 | ["GW-1808 "] | ["108.00"] | ["joe.smith@example.com"]
111-335121 | 342321 | ["TF-3556 "] | ["50.00"] | ["789123@example-isp.com","alexjohnson@example.com","monica@eng.example.com","sara@johnson.example.name",null]
(5 rows)
The following query explodes the order prices for a single customer. The other two columns are passed through and are repeated for each returned row:
=> SELECT EXPLODE(orderprices, custkey, email_addrs
USING PARAMETERS skip_partitioning=true)
AS (position, orderprices, custkey, email_addrs)
FROM orders WHERE custkey='342845' ORDER BY orderprices;
position | orderprices | custkey | email_addrs
----------+-------------+---------+------------------------------
2 | | 342845 | ["br92@cs.example.edu",null]
3 | 12.00 | 342845 | ["br92@cs.example.edu",null]
0 | 22.00 | 342845 | ["br92@cs.example.edu",null]
1 | 35.00 | 342845 | ["br92@cs.example.edu",null]
(4 rows)
The previous example uses the skip_partitioning
parameter. Instead of setting it for each call to EXPLODE, you can set it as a session parameter. EXPLODE is part of the ComplexTypesLib UDx library. The following example returns the same results:
=> ALTER SESSION SET UDPARAMETER FOR ComplexTypesLib skip_partitioning=true;
=> SELECT EXPLODE(orderprices, custkey, email_addrs)
AS (position, orderprices, custkey, email_addrs)
FROM orders WHERE custkey='342845' ORDER BY orderprices;
You can explode more than one column by specifying the explode_count
parameter:
=> SELECT EXPLODE(orderkey, prodkey, orderprices
USING PARAMETERS explode_count=2, skip_partitioning=true)
AS (orderkey,pk_idx,pk_val,ord_idx,ord_val)
FROM orders
WHERE orderkey='113-341987';
orderkey | pk_idx | pk_val | ord_idx | ord_val
------------+--------+----------+---------+---------
113-341987 | 0 | MG-7190 | 0 | 60.00
113-341987 | 0 | MG-7190 | 1 | 67.00
113-341987 | 0 | MG-7190 | 2 | 22.00
113-341987 | 0 | MG-7190 | 3 | 14.99
113-341987 | 1 | VA-4028 | 0 | 60.00
113-341987 | 1 | VA-4028 | 1 | 67.00
113-341987 | 1 | VA-4028 | 2 | 22.00
113-341987 | 1 | VA-4028 | 3 | 14.99
113-341987 | 2 | EH-1247 | 0 | 60.00
113-341987 | 2 | EH-1247 | 1 | 67.00
113-341987 | 2 | EH-1247 | 2 | 22.00
113-341987 | 2 | EH-1247 | 3 | 14.99
113-341987 | 3 | MS-7018 | 0 | 60.00
113-341987 | 3 | MS-7018 | 1 | 67.00
113-341987 | 3 | MS-7018 | 2 | 22.00
113-341987 | 3 | MS-7018 | 3 | 14.99
(16 rows)
The following example uses a multi-dimensional array:
=> SELECT name, pingtimes FROM network_tests;
name | pingtimes
------+-------------------------------------------------------
eng1 | [[24.24,25.27,27.16,24.97],[23.97,25.01,28.12,29.5]]
eng2 | [[27.12,27.91,28.11,26.95],[29.01,28.99,30.11,31.56]]
qa1 | [[23.15,25.11,24.63,23.91],[22.85,22.86,23.91,31.52]]
(3 rows)
=> SELECT EXPLODE(name, pingtimes USING PARAMETERS explode_count=1) OVER()
FROM network_tests;
name | position | value
------+----------+---------------------------
eng1 | 0 | [24.24,25.27,27.16,24.97]
eng1 | 1 | [23.97,25.01,28.12,29.5]
eng2 | 0 | [27.12,27.91,28.11,26.95]
eng2 | 1 | [29.01,28.99,30.11,31.56]
qa1 | 0 | [23.15,25.11,24.63,23.91]
qa1 | 1 | [22.85,22.86,23.91,31.52]
(6 rows)
You can rewrite the previous query as follows to produce the same results:
=> SELECT name, EXPLODE(pingtimes USING PARAMETERS skip_partitioning=true)
FROM network_tests;
13 - FILTER
Takes an input array and returns an array containing only elements that meet a specified condition.
Takes an input array and returns an array containing only elements that meet a specified condition. This function uses null-safe equality checks when testing elements.
Behavior type
Immutable
Syntax
FILTER(array, lambda-expression )
Arguments
array
- Input array.
lambda-expression
Lambda function to apply to each element. The function must return a Boolean value. The first argument to the function is the element, and the optional second element is the index of the element.
Examples
Given a table that contains names and arrays of email addresses, the following query filters out fake email addresses and returns the rest:
=> SELECT name, FILTER(email, e -> NOT REGEXP_LIKE(e,'example.com','i')) AS 'real_email'
FROM people;
name | real_email
----------------+-------------------------------------------------
Elaine Jackson | ["ejackson@somewhere.org","elaine@jackson.com"]
Frank Adams | []
Lee Jones | ["lee.jones@somewhere.org"]
M Smith | ["ms@msmith.com"]
(4 rows)
You can use the results in a WHERE clause to exclude rows that no longer contain any email addresses:
=> SELECT name, FILTER(email, e -> NOT REGEXP_LIKE(e,'example.com','i')) AS 'real_email'
FROM people
WHERE ARRAY_LENGTH(real_email) > 0;
name | real_email
----------------+-------------------------------------------------
Elaine Jackson | ["ejackson@somewhere.org","elaine@jackson.com"]
Lee Jones | ["lee.jones@somewhere.org"]
M Smith | ["ms@msmith.com"]
(3 rows)
See also
14 - IMPLODE
Takes a column of any scalar type and returns an unbounded array.
Takes a column of any scalar type and returns an unbounded array. Combined with GROUP BY, this function can be used to reverse an EXPLODE operation.
Behavior type
-
Immutable if the WITHIN GROUP ORDER BY clause specifies a column or set of columns that resolves to unique element values within each output array group.
-
Volatile otherwise because results are non-commutative.
Syntax
IMPLODE (input-column [ USING PARAMETERS param=value[,...] ] )
[ within-group-order-by-clause ]
Arguments
input-column
- Column of any scalar type from which to create the array.
- [within-group-order-by-clause](/en/sql-reference/functions/aggregate-functions/within-group-order-by-clause/)
- Sorts elements within each output array group:
WITHIN GROUP (ORDER BY { column-expression[ sort-qualifiers ] }[,...])
sort-qualifiers
: { ASC | DESC [ NULLS { FIRST | LAST | AUTO } ] }
Tip
WITHIN GROUP ORDER BY can consume a large amount of memory per group. To minimize memory consumption, create projections that support
GROUPBY PIPELINED.
Parameters
allow_truncate
- Boolean, if true truncates results when output length exceeds column size. If false (the default), the function returns an error if the output array is too large.
Even if this parameter is set to true, IMPLODE returns an error if any single array element is too large. Truncation removes elements from the output array but does not alter individual elements.
max_binary_size
- The maximum binary size in bytes for the returned array. If you omit this parameter, IMPLODE uses the value of the configuration parameter DefaultArrayBinarySize.
Examples
Consider a table with the following contents:
=> SELECT * FROM filtered;
position | itemprice | itemkey
----------+-----------+---------
0 | 14.99 | 345
0 | 27.99 | 567
1 | 18.99 | 567
1 | 35.99 | 345
2 | 14.99 | 123
(5 rows)
The following query calls IMPLODE to assemble prices into arrays (grouped by keys):
=> SELECT itemkey AS key, IMPLODE(itemprice) AS prices
FROM filtered GROUP BY itemkey ORDER BY itemkey;
key | prices
-----+-------------------
123 | ["14.99"]
345 | ["35.99","14.99"]
567 | ["27.99","18.99"]
(3 rows)
You can modify this query by including a WITHIN GROUP ORDER BY clause, which specifies how to sort array elements within each group:
=> SELECT itemkey AS key, IMPLODE(itemprice) WITHIN GROUP (ORDER BY itemprice) AS prices
FROM filtered GROUP BY itemkey ORDER BY itemkey;
key | prices
-----+-------------------
123 | ["14.99"]
345 | ["14.99","35.99"]
567 | ["18.99","27.99"]
(3 rows)
See Arrays and sets (collections) for a fuller example.
15 - SET_UNION
Returns a SET containing all elements of two input sets.
Returns a SET containing all elements of two input sets.
If the inputs are both bounded, the bound for the result is the sum of the bounds of the inputs.
If any input is unbounded, the result is unbounded with a binary size that is the sum of the sizes of the inputs.
Behavior type
Immutable
Syntax
SET_UNION(set1,set2)
Arguments
set1
, set2
- Sets of matching element type
Null-handling
-
Null arguments are ignored. If one of the inputs is null, the function returns the non-null input. In other words, an argument of NULL is equivalent to SET[].
-
If both inputs are null, the function returns null.
Examples
=> SELECT SET_UNION(SET[1,2,4], SET[2,3,4,5.9]);
set_union
-----------------------
["1.0","2.0","3.0","4.0","5.9"]
(1 row)
16 - STRING_TO_ARRAY
Splits a string containing array values and returns a native one-dimensional array.
Splits a string containing array values and returns a native one-dimensional array. The output does not include the "ARRAY" keyword. This function does not support nested (multi-dimensional) arrays.
This function returns array elements as strings by default. You can cast to other types, as in the following example:
=> SELECT STRING_TO_ARRAY('[1,2,3]')::ARRAY[INT];
Behavior
Immutable
Syntax
STRING_TO_ARRAY(string [USING PARAMETERS param=value[,...]])
The following syntax is deprecated:
STRING_TO_ARRAY(string, delimiter)
Arguments
string
- String representation of a one-dimensional array; can be a VARCHAR or LONG VARCHAR column, a literal string, or the string output of an expression.
Spaces in the string are removed unless elements are individually quoted. For example, ' a,b,c'
is equivalent to 'a,b,c'
. To preserve the space, use '" a","b","c"'
.
Parameters
These parameters behave the same way as the corresponding options when loading delimited data (see DELIMITED).
No parameter may have the same value as any other parameter.
collection_delimiter
- The character or character sequence used to separate array elements (VARCHAR(8)). You can use any ASCII values in the range E'\000' to E'\177', inclusive.
Default: Comma (',').
collection_open
, collection_close
- The characters that mark the beginning and end of the array (VARCHAR(8)). It is an error to use these characters elsewhere within the list of elements without escaping them. These characters can be omitted from the input string.
Default: Square brackets ('[' and ']').
collection_null_element
- The string representing a null element value (VARCHAR(65000)). You can specify a null value using any ASCII values in the range E'\001' to E'\177' inclusive (any ASCII value except NULL: E'\000').
Default: 'null'
collection_enclose
- An optional quote character within which to enclose individual elements, allowing delimiter characters to be embedded in string values. You can choose any ASCII value in the range E'\001' to E'\177' inclusive (any ASCII character except NULL: E'\000'). Elements do not need to be enclosed by this value.
Default: double quote ('"')
Examples
The function uses comma as the default delimiter. You can specify a different value:
=> SELECT STRING_TO_ARRAY('[1,3,5]');
STRING_TO_ARRAY
-----------------
["1","3","5"]
(1 row)
=> SELECT STRING_TO_ARRAY('[t|t|f|t]' USING PARAMETERS collection_delimiter = '|');
STRING_TO_ARRAY
-------------------
["t","t","f","t"]
(1 row)
The bounding brackets are optional:
=> SELECT STRING_TO_ARRAY('t|t|f|t' USING PARAMETERS collection_delimiter = '|');
STRING_TO_ARRAY
-------------------
["t","t","f","t"]
(1 row)
The input can use other characters for open and close:
=> SELECT STRING_TO_ARRAY('{NASA-1683,NASA-7867,SPX-76}' USING PARAMETERS collection_open = '{', collection_close = '}');
STRING_TO_ARRAY
------------------------------------
["NASA-1683","NASA-7867","SPX-76"]
(1 row)
By default the string 'null' in input is treated as a null value:
=> SELECT STRING_TO_ARRAY('{"us-1672",null,"darpa-1963"}' USING PARAMETERS collection_open = '{', collection_close = '}');
STRING_TO_ARRAY
-------------------------------
["us-1672",null,"darpa-1963"]
(1 row)
In the following example, the input comes from a column:
=> SELECT STRING_TO_ARRAY(name USING PARAMETERS collection_delimiter=' ') FROM employees;
STRING_TO_ARRAY
-----------------------
["Howard","Wolowitz"]
["Sheldon","Cooper"]
(2 rows)
17 - TO_JSON
Returns the JSON representation of a complex-type argument, including mixed and nested complex types.
Returns the JSON representation of a complex-type argument, including mixed and nested complex types. This is the same format that queries of complex-type columns return.
Behavior
Immutable
Syntax
TO_JSON(value)
Arguments
value
- Column or literal of a complex type
Examples
These examples query the following table:
=> SELECT name, contact FROM customers;
name | contact
--------------------+-----------------------------------------------------------------------------------------------------------------------
Missy Cooper | {"street":"911 San Marcos St","city":"Austin","zipcode":73344,"email":["missy@mit.edu","mcooper@cern.gov"]}
Sheldon Cooper | {"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":91001,"email":["shelly@meemaw.name","cooper@caltech.edu"]}
Leonard Hofstadter | {"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":91001,"email":["hofstadter@caltech.edu"]}
Leslie Winkle | {"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":91001,"email":[]}
Raj Koothrappali | {"street":null,"city":"Pasadena","zipcode":91001,"email":["raj@available.com"]}
Stuart Bloom |
(6 rows)
You can call TO_JSON on a column or on specific fields or array elements:
=> SELECT TO_JSON(contact) FROM customers;
to_json
-----------------------------------------------------------------------------------------------------------------------
{"street":"911 San Marcos St","city":"Austin","zipcode":73344,"email":["missy@mit.edu","mcooper@cern.gov"]}
{"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":91001,"email":["shelly@meemaw.name","cooper@caltech.edu"]}
{"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":91001,"email":["hofstadter@caltech.edu"]}
{"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":91001,"email":[]}
{"street":null,"city":"Pasadena","zipcode":91001,"email":["raj@available.com"]}
(6 rows)
=> SELECT TO_JSON(contact.email) FROM customers;
to_json
---------------------------------------------
["missy@mit.edu","mcooper@cern.gov"]
["shelly@meemaw.name","cooper@caltech.edu"]
["hofstadter@caltech.edu"]
[]
["raj@available.com"]
(6 rows)
When calling TO_JSON with a SET, note that duplicates are removed and elements can be reordered:
=> SELECT TO_JSON(SET[1683,7867,76,76]);
TO_JSON
----------------
[76,1683,7867]
(1 row)
18 - UNNEST
Expands the elements of one or more collection columns (ARRAY or SET) into individual rows.
Expands the elements of one or more collection columns (ARRAY or SET) into individual rows. UNNEST is similar to EXPLODE, but UNNEST returns only the elements, while EXPLODE returns elements and their positions.
If called with a single array, UNNEST returns the elements in a column named value
. If called with two or more arrays, it returns columns named val_
column-name
. You can use an AS clause in the SELECT to change these names.
By default, UNNEST does not partition its input and ignores an OVER() clause if present.
Behavior type
Immutable
Syntax
UNNEST (column[,...])
[USING PARAMETERS param=value])
[ OVER ( [window-partition-clause
Arguments
column
- Collection column in the table being queried.
OVER(...)
- How to partition and sort input data. The input data is the result set that the query returns after it evaluates FROM, WHERE, GROUP BY, and HAVING clauses.
This clause only applies if skip_partitioning
is false.
Parameters
skip_partitioning
- Whether to skip partitioning and ignore the OVER clause if present. UNNEST translates a single row of input into multiple rows of output, one per collection element. There is, therefore, usually no benefit to partitioning the input first. Skipping partitioning can help a query avoid an expensive sort or merge operation.
Default: true
Null-handling
This function expands each element in a collection into a row, including null elements. If the input column is NULL or an empty collection, the function produces no rows for that column:
=> SELECT UNNEST(ARRAY[1,2,null,4]) OVER();
value
-------
1
2
4
(4 rows)
=> SELECT UNNEST(ARRAY[]::ARRAY[INT]) OVER();
value
-------
(0 rows)
=> SELECT UNNEST(NULL::ARRAY[INT]) OVER();
value
-------
(0 rows)
Joining on results
You can use the output of this function as if it were a relation by using CROSS JOIN or LEFT JOIN LATERAL in a query. Other JOIN types are not supported.
Consider the following table of students and exam scores:
=> SELECT * FROM tests;
student | scores | questions
---------+---------------+-----------------
Bob | [92,78,79] | [20,20,100]
Lee | |
Pat | [] | []
Sam | [97,98,85] | [20,20,100]
Tom | [68,75,82,91] | [20,20,100,100]
(5 rows)
The following query finds the best test scores across all students who have scores:
=> SELECT student, score FROM tests
CROSS JOIN UNNEST(scores) AS t (score)
ORDER BY score DESC;
student | score
---------+-------
Sam | 98
Sam | 97
Bob | 92
Tom | 91
Sam | 85
Tom | 82
Bob | 79
Bob | 78
Tom | 75
Tom | 68
(10 rows)
The following query returns maximum and average per-question scores, considering both the exam score and the number of questions:
=> SELECT student, MAX(score/qcount), AVG(score/qcount) FROM tests
CROSS JOIN UNNEST(scores, questions) AS t(score, qcount)
GROUP BY student;
student | MAX | AVG
---------+----------------------+------------------
Bob | 4.600000000000000000 | 3.04333333333333
Sam | 4.900000000000000000 | 3.42222222222222
Tom | 4.550000000000000000 | 2.37
(3 rows)
These queries produce results for three of the five students. One student has a null value for scores and another has an empty array. These rows are not included in the function's output.
To include null and empty arrays in output, use LEFT JOIN LATERAL instead of CROSS JOIN:
=> SELECT student, MIN(score), AVG(score) FROM tests
LEFT JOIN LATERAL UNNEST(scores) AS t (score)
GROUP BY student;
student | MIN | AVG
---------+-----+------------------
Bob | 78 | 83
Lee | |
Pat | |
Sam | 85 | 93.3333333333333
Tom | 68 | 79
(5 rows)
The LATERAL keyword is required with LEFT JOIN. It is optional for CROSS JOIN.
Examples
Consider a table with the following definition:
=> CREATE TABLE orders (
orderkey VARCHAR, custkey INT,
prodkey ARRAY[VARCHAR], orderprices ARRAY[DECIMAL(12,2)],
email_addrs ARRAY[VARCHAR]);
The following query expands one of the array columns. One of the elements is null:
=> SELECT UNNEST(orderprices) AS price, custkey, email_addrs
FROM orders WHERE custkey='342845' ORDER BY price;
price | custkey | email_addrs
-------+---------+-------------------------
| 342845 | ["br92@cs.example.edu"]
12.00 | 342845 | ["br92@cs.example.edu"]
22.00 | 342845 | ["br92@cs.example.edu"]
35.00 | 342845 | ["br92@cs.example.edu"]
(4 rows)
UNNEST can expand more than one column:
=> SELECT orderkey, UNNEST(prodkey, orderprices)
FROM orders WHERE orderkey='113-341987';
orderkey | val_prodkey | val_orderprices
------------+-------------+-----------------
113-341987 | MG-7190 | 60.00
113-341987 | MG-7190 | 67.00
113-341987 | MG-7190 | 22.00
113-341987 | MG-7190 | 14.99
113-341987 | VA-4028 | 60.00
113-341987 | VA-4028 | 67.00
113-341987 | VA-4028 | 22.00
113-341987 | VA-4028 | 14.99
113-341987 | EH-1247 | 60.00
113-341987 | EH-1247 | 67.00
113-341987 | EH-1247 | 22.00
113-341987 | EH-1247 | 14.99
113-341987 | MS-7018 | 60.00
113-341987 | MS-7018 | 67.00
113-341987 | MS-7018 | 22.00
113-341987 | MS-7018 | 14.99
(16 rows)