This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Collection functions

The functions in this section apply to collection types (arrays and sets).

1: APPLY_AVG
2: APPLY_COUNT (ARRAY_COUNT)
3: APPLY_COUNT_ELEMENTS (ARRAY_LENGTH)
4: APPLY_MAX
5: APPLY_MIN
6: APPLY_SUM
7: ARRAY_CAT
8: ARRAY_CONTAINS
9: ARRAY_DIMS
10: ARRAY_FIND
11: CONTAINS
12: EXPLODE
13: FILTER
14: IMPLODE
15: SET_UNION
16: STRING_TO_ARRAY
17: TO_JSON
18: UNNEST

The functions in this section apply to collection types (arrays and sets).

Some functions apply aggregation operations (such as sum) to collections. These function names all begin with APPLY.

Other functions in this section operate specifically on arrays or sets, as indicated on the individual reference pages. Array functions operate on both native array values and array values in external tables.

Notes

Arrays are 0-indexed. The first element's ordinal position in 0, second is 1, and so on. Indexes are not meaningful for sets.
Unless otherwise stated, functions operate on one-dimensional (1D) collections only. To use multidimensional arrays, you must first dereference to a 1D array type. Sets can only be one-dimensional.

1 - APPLY_AVG

Returns the average of all elements in a with numeric values.

Returns the average of all elements in a collection (array or set) with numeric values.

Behavior type

Immutable

Syntax

APPLY_AVG(collection)

Arguments

collection: Target collection

Null-handling

The following cases return NULL:

if the input collection is NULL
if the input collection contains only null values
if the input collection is empty

If the input collection contains a mix of null and non-null elements, only the non-null values are considered in the calculation of the average.

Examples

=> SELECT apply_avg(ARRAY[1,2.4,5,6]);
apply_avg
-----------
3.6
(1 row)

2 - APPLY_COUNT (ARRAY_COUNT)

Returns the total number of non-null elements in a.

Returns the total number of non-null elements in a collection (array or set). To count all elements including nulls, use APPLY_COUNT_ELEMENTS (ARRAY_LENGTH).

Behavior type

Immutable

Syntax

APPLY_COUNT(collection)

ARRAY_COUNT is a synonym of APPLY_COUNT.

Arguments

collection: Target collection

Null-handling

Null values are not included in the count.

Examples

The array in this example contains six elements, one of which is null:

=> SELECT apply_count(ARRAY[1,NULL,3,7,8,5]);
apply_count
-------------
5
(1 row)

3 - APPLY_COUNT_ELEMENTS (ARRAY_LENGTH)

Returns the total number of elements in a , including NULLs.

Returns the total number of elements in a collection (array or set), including NULLs. To count only non-null values, use APPLY_COUNT (ARRAY_COUNT).

Behavior type

Immutable

Syntax

APPLY_COUNT_ELEMENTS(collection)

ARRAY_LENGTH is a synonym of APPLY_COUNT_ELEMENTS.

Arguments

collection: Target collection

Null-handling

This function counts all members, including nulls.

An empty collection (ARRAY[] or SET[]) has a length of 0. A collection containing a single null (ARRAY[null] or SET[null]) has a length of 1.

Examples

The following array has six elements including one null:

=> SELECT apply_count_elements(ARRAY[1,NULL,3,7,8,5]);
apply_count_elements
---------------------
                   6
(1 row)

As the previous example shows, a null element is an element. Thus, an array containing only a null element has one element:

=> SELECT apply_count_elements(ARRAY[null]);
apply_count_elements
---------------------
                   1
(1 row)

A set does not contain duplicates. If you construct a set and pass it directly to this function, the result could differ from the number of inputs:

=> SELECT apply_count_elements(SET[1,1,3]);
apply_count_elements
---------------------
                   2
(1 row)

4 - APPLY_MAX

Returns the largest non-null element in a.

Returns the largest non-null element in a collection (array or set). This function is similar to the MAX [aggregate] function; APPLY_MAX operates on elements of a collection and MAX operates on an expression such as a column selection.

Behavior type

Immutable

Syntax

APPLY_MAX(collection)

Arguments

collection: Target collection

Null-handling

This function ignores null elements. If all elements are null or the collection is empty, this function returns null.

Examples

=> SELECT apply_max(ARRAY[1,3.4,15]);
apply_max
-----------
     15.0
(1 row)

5 - APPLY_MIN

Returns the smallest non-null element in a.

Returns the smallest non-null element in a collection (array or set). This function is similar to the MIN [aggregate] function; APPLY_MIN operates on elements of a collection and MIN operates on an expression such as a column selection.

Behavior type

Immutable

Syntax

APPLY_MIN(collection)

Arguments

collection: Target collection

Null-handling

This function ignores null elements. If all elements are null or the collection is empty, this function returns null.

Examples

=> SELECT apply_min(ARRAY[1,3.4,15]);
apply_min
-----------
       1.0
(1 row)

6 - APPLY_SUM

Computes the sum of all elements in a of numeric values (INTEGER, FLOAT, NUMERIC, or INTERVAL).

Computes the sum of all elements in a collection (array or set) of numeric values (INTEGER, FLOAT, NUMERIC, or INTERVAL).

Behavior type

Immutable

Syntax

APPLY_SUM(collection)

Arguments

collection: Target collection

Null-handling

The following cases return NULL:

if the input collection is NULL
if the input collection contains only null values
if the input collection is empty

Examples

=> SELECT apply_sum(ARRAY[12.5,3,4,1]);
apply_sum
-----------
      20.5
(1 row)

7 - ARRAY_CAT

Concatenates two arrays of the same element type and dimensionality.

Concatenates two arrays of the same element type and dimensionality. For example, ROW elements must have the same fields.

If the inputs are both bounded, the bound for the result is the sum of the bounds of the inputs.

If any input is unbounded, the result is unbounded with a binary size that is the sum of the sizes of the inputs.

Behavior type

Immutable

Syntax

ARRAY_CAT(array1,array2)

Arguments

array1, array2: Arrays of matching dimensionality and element type

Null-handling

If either input is NULL, the function returns NULL.

Examples

Types are coerced if necessary, as shown in the second example.

=> SELECT array_cat(ARRAY[1,2], ARRAY[3,4,5]);
array_cat
-----------------------
[1,2,3,4,5]
(1 row)

=> SELECT array_cat(ARRAY[1,2], ARRAY[3,4,5.0]);
array_cat
-----------------------
["1.0","2.0","3.0","4.0","5.0"]
(1 row)

8 - ARRAY_CONTAINS

Returns true if the specified element is found in the array and false if not.

Returns true if the specified element is found in the array and false if not. Both arguments must be non-null, but the array may be empty.

Deprecated

This function has been renamed to CONTAINS.

9 - ARRAY_DIMS

Returns the dimensionality of the input array.

Behavior type

Immutable

Syntax

ARRAY_DIMS(array)

Arguments

array: Target array

Examples

=> SELECT array_dims(ARRAY[[1,2],[2,3]]);
array_dims
------------
        2
(1 row)

10 - ARRAY_FIND

Returns the ordinal position of a specified element in an array, or -1 if not found.

Returns the ordinal position of a specified element in an array, or -1 if not found. This function uses null-safe equality checks when testing elements.

Behavior type

Immutable

Syntax

ARRAY_FIND(array, { value | lambda-expression })

Arguments

array: Target array.
value: Value to search for; type must match or be coercible to the element type of the array.
lambda-expression: Lambda function to apply to each element. The function must return a Boolean value. The first argument to the function is the element, and the optional second element is the index of the element.

Examples

The function returns the first occurrence of the specified element. However, nothing ensures that value is unique in the array:

=> SELECT array_find(ARRAY[1,2,7,5,7],7);
 array_find
------------
          2
(1 row)

The function returns -1 if the specified element is not found:

=> SELECT array_find(ARRAY[1,3,5,7],4);
array_find
------------
        -1
(1 row)

You can search for complex element types:

=> SELECT ARRAY_FIND(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
                     ARRAY[1,2,3]);
 ARRAY_FIND
------------
          0
(1 row)

=> SELECT ARRAY_FIND(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
                     ARRAY[1,null,4]);
 ARRAY_FIND
------------
          1
(1 row)

The second example, comparing arrays with null elements, finds a match because ARRAY_FIND uses a null-safe equality check when evaluating elements.

Lambdas

Consider a table of departments where each department has an array of ROW elements representing employees. The following example searches for a specific employee name in those records. The results show that Alice works (or has worked) for two departments:

=> SELECT deptID, ARRAY_FIND(employees, e -> e.name = 'Alice Adams') AS 'has_alice'
   FROM departments;
 deptID | has_alice
--------+-----------
      1 |         0
      2 |        -1
      3 |         0
(3 rows)

In the following example, each person in the table has an array of email addresses, and the function locates fake addresses. The function takes one argument, the array element to test, and calls a regular-expression function that returns a Boolean:

=> SELECT name, ARRAY_FIND(email, e -> REGEXP_LIKE(e,'example.com','i'))
                AS 'example.com'
   FROM people;
      name      | example.com
----------------+-------------
 Elaine Jackson |          -1
 Frank Adams    |           0
 Lee Jones      |          -1
 M Smith        |           0
(4 rows)

11 - CONTAINS

Returns true if the specified element is found in the collection and false if not.

Returns true if the specified element is found in the collection and false if not. This function uses null-safe equality checks when testing elements.

Behavior type

Immutable

Syntax

CONTAINS(collection, { value | lambda-expression })

Arguments

collection: Target collection (ARRAY or SET).
value: Value to search for; type must match or be coercible to the element type of the collection.
lambda-expression: Lambda function to apply to each element. The function must return a Boolean value. The first argument to the function is the element, and the optional second element is the index of the element.

Examples

=> SELECT CONTAINS(SET[1,2,3,4],2);
 contains
----------
t
(1 row)

You can search for NULL as an element value:

=> SELECT CONTAINS(ARRAY[1,null,2],null);
 contains
----------
 t
(1 row)

You can search for complex element types:

=> SELECT CONTAINS(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
                   ARRAY[1,2,3]);
 CONTAINS
----------
 t
(1 row)

=> SELECT CONTAINS(ARRAY[ARRAY[1,2,3],ARRAY[1,null,4]],
                   ARRAY[1,null,4]);
 CONTAINS
----------
 t
(1 row)

The second example, comparing arrays with null elements, returns true because CONTAINS uses a null-safe equality check when evaluating elements.

In the following example, the orders table has the following definition:

=> CREATE EXTERNAL TABLE orders(
  orderid int,
  accountid int,
  shipments Array[
    ROW(
      shipid int,
      address ROW(
        street varchar,
        city varchar,
        zip int
      ),
      shipdate date
    )
  ]
 ) AS COPY FROM '...' PARQUET;

The following query tests for a specific order. When passing a ROW literal as the second argument, cast any ambiguous fields to ensure type matches:

=> SELECT CONTAINS(shipments,
            ROW(1,ROW('911 San Marcos St'::VARCHAR,
                  'Austin'::VARCHAR, 73344),
            '2020-11-05'::DATE))
   FROM orders;
 CONTAINS
----------
 t
 f
 f
(3 rows)

Lambdas

Consider a table of departments where each department has an array of ROW elements representing employees. The following query finds departments with early hires (low employee IDs):

=> SELECT deptID FROM departments
   WHERE CONTAINS(employees, e -> e.id < 20);
 deptID
--------
      1
      3
(2 rows)

In the following example, a schedules table includes an array of events, where each event is a ROW with several fields:

=> CREATE TABLE schedules
       (guest VARCHAR,
       events ARRAY[ROW(e_date DATE, e_name VARCHAR, price NUMERIC(8,2))]);

You can use the CONTAINS function with a lambda expression to find people who have more than one event on the same day. The second argument, idx, is the index of the current element:

=> SELECT guest FROM schedules
WHERE CONTAINS(events, (e, idx) ->
                       (idx < ARRAY_LENGTH(events) - 1)
                       AND (e.e_date = events[idx + 1].e_date));
    guest
-------------
 Alice Adams
(1 row)

12 - EXPLODE

Expands the elements of one or more collection columns (ARRAY or SET) into individual table rows, one row per element.

Expands the elements of one or more collection columns (ARRAY or SET) into individual table rows, one row per element. For each exploded collection, the results include two columns, one for the element index, and one for the value at that position. If the function explodes a single collection, these columns are named position and value by default. If the function explodes two or more collections, the columns for each collection are named pos_column-name and val_column-name. You can use an AS clause in the SELECT to change these column names.

EXPLODE and UNNEST both expand collections. They have the following differences:

By default, EXPLODE expands only the first collection it is passed and UNNEST expands all of them. See the explode_count and explode_all parameters.
By default, EXPLODE returns element positions in an index column and UNNEST does not. See the with_offsets parameter.
By default, EXPLODE requires an OVER clause and UNNEST ignores an OVER clause if present. See the skip_partitioning parameter.

Behavior type

Immutable

Syntax

EXPLODE (column[,...] [USING PARAMETERS param=value])
[ OVER ( [window-partition-clause] ) ]

Arguments

column: Column in the table being queried. Unless explode_all is true, you must specify at least as many collection columns as the value of the explode_count parameter. Columns that are not collections are passed through without modification.
Passthrough columns are not needed if skip_partitioning is true.
OVER(...): How to partition and sort input data. The input data is the result set that the query returns after it evaluates FROM, WHERE, GROUP BY, and HAVING clauses. For EXPLODE, use OVER() or OVER(PARTITION BEST).
This clause is ignored if skip_partitioning is true.

Parameters

explode_all (BOOLEAN)

If true, explode all collection columns. When explode_all is true, passthrough columns are not permitted.

Default: false

explode_count (INT)

The number of collection columns to explode. The function checks each column, up to this value, and either explodes it if is a collection or passes it through if it is not a collection or if this limit has been reached. If the value of explode_count is greater than the number of collection columns specified, the function returns an error.

If explode_all is true, you cannot specify explode_count.

Default: 1

skip_partitioning (BOOLEAN)

Whether to skip partitioning and ignore the OVER clause if present. EXPLODE translates a single row of input into multiple rows of output, one per collection element. There is, therefore, usually no benefit to partitioning the input first. Skipping partitioning can help a query avoid an expensive sort or merge operation. Even so, setting this parameter can negatively affect performance in rare cases.

Default: false

with_offset (BOOLEAN)

Whether to return the index of each element.

Default: true

Null-handling

This function expands each element in a collection into a row, including null elements. If the input column is NULL or an empty collection, the function produces no rows for that column:

=> SELECT EXPLODE(ARRAY[1,2,null,4]) OVER();
 position | value
----------+-------
        0 |     1
        1 |     2
        2 |
        3 |     4
(4 rows)

=> SELECT EXPLODE(ARRAY[]::ARRAY[INT]) OVER();
 position | value
----------+-------
(0 rows)

=> SELECT EXPLODE(NULL::ARRAY[INT]) OVER();
 position | value
----------+-------
(0 rows)

Joining on results

To use JOIN with this function you must set the skip_partitioning parameter, either in the function call or as a session parameter.

You can use the output of this function as if it were a relation by using CROSS JOIN or LEFT JOIN LATERAL in a query. Other JOIN types are not supported.

Consider the following table of students and exam scores:

=> SELECT * FROM tests;
 student |    scores     |    questions
---------+---------------+-----------------
 Bob     | [92,78,79]    | [20,20,100]
 Lee     |               |
 Pat     | []            | []
 Sam     | [97,98,85]    | [20,20,100]
 Tom     | [68,75,82,91] | [20,20,100,100]
(5 rows)

The following query finds the best test scores across all students who have scores:

=> ALTER SESSION SET UDPARAMETER FOR ComplexTypesLib skip_partitioning = true;

=> SELECT student, score FROM tests
   CROSS JOIN EXPLODE(scores) AS t (pos, score)
   ORDER BY score DESC;
 student | score
---------+-------
 Sam     |    98
 Sam     |    97
 Bob     |    92
 Tom     |    91
 Sam     |    85
 Tom     |    82
 Bob     |    79
 Bob     |    78
 Tom     |    75
 Tom     |    68
(10 rows)

The following query returns maximum and average per-question scores, considering both the exam score and the number of questions:

=> SELECT student, MAX(score/qcount), AVG(score/qcount) FROM tests
   CROSS JOIN EXPLODE(scores, questions USING PARAMETERS explode_count=2)
      AS t(pos_s, score, pos_q, qcount)
   GROUP BY student;
 student |         MAX          |       AVG
---------+----------------------+------------------
 Bob     | 4.600000000000000000 | 3.04333333333333
 Sam     | 4.900000000000000000 | 3.42222222222222
 Tom     | 4.550000000000000000 |             2.37
(3 rows)

These queries produce results for three of the five students. One student has a null value for scores and another has an empty array. These rows are not included in the function's output.

To include null and empty arrays in output, use LEFT JOIN LATERAL instead of CROSS JOIN:

=> SELECT student, MIN(score), AVG(score) FROM tests
LEFT JOIN LATERAL EXPLODE(scores) AS t (pos, score)
GROUP BY student;
 student | MIN |       AVG
---------+-----+------------------
 Bob     |  78 |               83
 Lee     |     |
 Pat     |     |
 Sam     |  85 | 93.3333333333333
 Tom     |  68 |               79
(5 rows)

The LATERAL keyword is required with LEFT JOIN. It is optional for CROSS JOIN.

Examples

Consider an orders table with the following contents:

=> SELECT orderkey, custkey, prodkey, orderprices, email_addrs
   FROM orders LIMIT 5;
  orderkey  | custkey |                    prodkey                    |            orderprices            |                                                  email_addrs
------------+---------+-----------------------------------------------+-----------------------------------+----------------------------------------------------------------------------------------------------------------
 113-341987 |  342799 | ["MG-7190 ","VA-4028 ","EH-1247 ","MS-7018 "] | ["60.00","67.00","22.00","14.99"] | ["bob@example,com","robert.jones@example.com"]
 111-952000 |  342845 | ["ID-2586 ","IC-9010 ","MH-2401 ","JC-1905 "] | ["22.00","35.00",null,"12.00"]    | ["br92@cs.example.edu"]
 111-345634 |  342536 | ["RS-0731 ","SJ-2021 "]                       | ["50.00",null]                    | [null]
 113-965086 |  342176 | ["GW-1808 "]                                  | ["108.00"]                        | ["joe.smith@example.com"]
 111-335121 |  342321 | ["TF-3556 "]                                  | ["50.00"]                         | ["789123@example-isp.com","alexjohnson@example.com","monica@eng.example.com","sara@johnson.example.name",null]
(5 rows)

The following query explodes the order prices for a single customer. The other two columns are passed through and are repeated for each returned row:

=> SELECT EXPLODE(orderprices, custkey, email_addrs
                  USING PARAMETERS skip_partitioning=true)
            AS (position, orderprices, custkey, email_addrs)
   FROM orders WHERE custkey='342845' ORDER BY orderprices;
 position | orderprices | custkey |         email_addrs
----------+-------------+---------+------------------------------
        2 |             |  342845 | ["br92@cs.example.edu",null]
        3 |       12.00 |  342845 | ["br92@cs.example.edu",null]
        0 |       22.00 |  342845 | ["br92@cs.example.edu",null]
        1 |       35.00 |  342845 | ["br92@cs.example.edu",null]
(4 rows)

The previous example uses the skip_partitioning parameter. Instead of setting it for each call to EXPLODE, you can set it as a session parameter. EXPLODE is part of the ComplexTypesLib UDx library. The following example returns the same results:

=> ALTER SESSION SET UDPARAMETER FOR ComplexTypesLib skip_partitioning=true;

=> SELECT EXPLODE(orderprices, custkey, email_addrs)
            AS (position, orderprices, custkey, email_addrs)
   FROM orders WHERE custkey='342845' ORDER BY orderprices;

You can explode more than one column by specifying the explode_count parameter:

=> SELECT EXPLODE(orderkey, prodkey, orderprices
                  USING PARAMETERS explode_count=2, skip_partitioning=true)
          AS (orderkey,pk_idx,pk_val,ord_idx,ord_val)
   FROM orders
   WHERE orderkey='113-341987';
  orderkey  | pk_idx |  pk_val  | ord_idx | ord_val
------------+--------+----------+---------+---------
 113-341987 |      0 | MG-7190  |       0 |   60.00
 113-341987 |      0 | MG-7190  |       1 |   67.00
 113-341987 |      0 | MG-7190  |       2 |   22.00
 113-341987 |      0 | MG-7190  |       3 |   14.99
 113-341987 |      1 | VA-4028  |       0 |   60.00
 113-341987 |      1 | VA-4028  |       1 |   67.00
 113-341987 |      1 | VA-4028  |       2 |   22.00
 113-341987 |      1 | VA-4028  |       3 |   14.99
 113-341987 |      2 | EH-1247  |       0 |   60.00
 113-341987 |      2 | EH-1247  |       1 |   67.00
 113-341987 |      2 | EH-1247  |       2 |   22.00
 113-341987 |      2 | EH-1247  |       3 |   14.99
 113-341987 |      3 | MS-7018  |       0 |   60.00
 113-341987 |      3 | MS-7018  |       1 |   67.00
 113-341987 |      3 | MS-7018  |       2 |   22.00
 113-341987 |      3 | MS-7018  |       3 |   14.99
(16 rows)

The following example uses a multi-dimensional array:

=> SELECT name, pingtimes FROM network_tests;
 name |                       pingtimes
------+-------------------------------------------------------
 eng1 | [[24.24,25.27,27.16,24.97],[23.97,25.01,28.12,29.5]]
 eng2 | [[27.12,27.91,28.11,26.95],[29.01,28.99,30.11,31.56]]
 qa1  | [[23.15,25.11,24.63,23.91],[22.85,22.86,23.91,31.52]]
(3 rows)

=> SELECT EXPLODE(name, pingtimes USING PARAMETERS explode_count=1) OVER()
   FROM network_tests;
 name | position |           value
------+----------+---------------------------
 eng1 |        0 | [24.24,25.27,27.16,24.97]
 eng1 |        1 | [23.97,25.01,28.12,29.5]
 eng2 |        0 | [27.12,27.91,28.11,26.95]
 eng2 |        1 | [29.01,28.99,30.11,31.56]
 qa1  |        0 | [23.15,25.11,24.63,23.91]
 qa1  |        1 | [22.85,22.86,23.91,31.52]
(6 rows)

You can rewrite the previous query as follows to produce the same results:

=> SELECT name, EXPLODE(pingtimes USING PARAMETERS skip_partitioning=true)
   FROM network_tests;

13 - FILTER

Takes an input array and returns an array containing only elements that meet a specified condition.

Takes an input array and returns an array containing only elements that meet a specified condition. This function uses null-safe equality checks when testing elements.

Behavior type

Immutable

Syntax

FILTER(array, lambda-expression )

Arguments

array: Input array.
lambda-expression: Lambda function to apply to each element. The function must return a Boolean value. The first argument to the function is the element, and the optional second element is the index of the element.

Examples

Given a table that contains names and arrays of email addresses, the following query filters out fake email addresses and returns the rest:

=> SELECT name, FILTER(email, e -> NOT REGEXP_LIKE(e,'example.com','i')) AS 'real_email'
   FROM people;
      name      |                   real_email
----------------+-------------------------------------------------
 Elaine Jackson | ["ejackson@somewhere.org","elaine@jackson.com"]
 Frank Adams    | []
 Lee Jones      | ["lee.jones@somewhere.org"]
 M Smith        | ["ms@msmith.com"]
(4 rows)

You can use the results in a WHERE clause to exclude rows that no longer contain any email addresses:

=> SELECT name, FILTER(email, e -> NOT REGEXP_LIKE(e,'example.com','i')) AS 'real_email'
   FROM people
   WHERE ARRAY_LENGTH(real_email) > 0;
      name      |                   real_email
----------------+-------------------------------------------------
 Elaine Jackson | ["ejackson@somewhere.org","elaine@jackson.com"]
 Lee Jones      | ["lee.jones@somewhere.org"]
 M Smith        | ["ms@msmith.com"]
(3 rows)

14 - IMPLODE

Takes a column of any scalar type and returns an unbounded array.

Takes a column of any scalar type and returns an unbounded array. Combined with GROUP BY, this function can be used to reverse an EXPLODE operation.

Behavior type

Immutable if the WITHIN GROUP ORDER BY clause specifies a column or set of columns that resolves to unique element values within each output array group.
Volatile otherwise because results are non-commutative.

Syntax

IMPLODE (input-column [ USING PARAMETERS param=value[,...] ] )
    [ within-group-order-by-clause ]

Arguments

input-column

Column of any scalar type from which to create the array.

[within-group-order-by-clause](/en/sql-reference/functions/aggregate-functions/within-group-order-by-clause/)

Sorts elements within each output array group:

WITHIN GROUP (ORDER BY { column-expression[ sort-qualifiers ] }[,...])

sort-qualifiers: { ASC | DESC [ NULLS { FIRST | LAST | AUTO } ] }

Tip

WITHIN GROUP ORDER BY can consume a large amount of memory per group. To minimize memory consumption, create projections that support GROUPBY PIPELINED.

Parameters

allow_truncate: Boolean, if true truncates results when output length exceeds column size. If false (the default), the function returns an error if the output array is too large.
Even if this parameter is set to true, IMPLODE returns an error if any single array element is too large. Truncation removes elements from the output array but does not alter individual elements.
max_binary_size: The maximum binary size in bytes for the returned array. If you omit this parameter, IMPLODE uses the value of the configuration parameter DefaultArrayBinarySize.

Examples

Consider a table with the following contents:

=> SELECT * FROM filtered;

 position | itemprice | itemkey
----------+-----------+---------
        0 |     14.99 |     345
        0 |     27.99 |     567
        1 |     18.99 |     567
        1 |     35.99 |     345
        2 |     14.99 |     123
(5 rows)

The following query calls IMPLODE to assemble prices into arrays (grouped by keys):

=> SELECT itemkey AS key, IMPLODE(itemprice) AS prices
    FROM filtered GROUP BY itemkey ORDER BY itemkey;
 key |      prices
-----+-------------------
 123 | ["14.99"]
 345 | ["35.99","14.99"]
 567 | ["27.99","18.99"]
(3 rows)

You can modify this query by including a WITHIN GROUP ORDER BY clause, which specifies how to sort array elements within each group:

=> SELECT itemkey AS key, IMPLODE(itemprice) WITHIN GROUP (ORDER BY itemprice) AS prices
    FROM filtered GROUP BY itemkey ORDER BY itemkey;
 key |      prices
-----+-------------------
 123 | ["14.99"]
 345 | ["14.99","35.99"]
 567 | ["18.99","27.99"]
(3 rows)

See Arrays and sets (collections) for a fuller example.

15 - SET_UNION

Returns a SET containing all elements of two input sets.

If the inputs are both bounded, the bound for the result is the sum of the bounds of the inputs.

If any input is unbounded, the result is unbounded with a binary size that is the sum of the sizes of the inputs.

Behavior type

Immutable

Syntax

SET_UNION(set1,set2)

Arguments

set1, set2: Sets of matching element type

Null-handling

Null arguments are ignored. If one of the inputs is null, the function returns the non-null input. In other words, an argument of NULL is equivalent to SET[].
If both inputs are null, the function returns null.

Examples

=> SELECT SET_UNION(SET[1,2,4], SET[2,3,4,5.9]);
set_union
-----------------------
["1.0","2.0","3.0","4.0","5.9"]
(1 row)

16 - STRING_TO_ARRAY

Splits a string containing array values and returns a native one-dimensional array.

Splits a string containing array values and returns a native one-dimensional array. The output does not include the "ARRAY" keyword. This function does not support nested (multi-dimensional) arrays.

This function returns array elements as strings by default. You can cast to other types, as in the following example:

=> SELECT STRING_TO_ARRAY('[1,2,3]')::ARRAY[INT];

Behavior

Immutable

Syntax

STRING_TO_ARRAY(string [USING PARAMETERS param=value[,...]])

The following syntax is deprecated:

STRING_TO_ARRAY(string, delimiter)

Arguments

string: String representation of a one-dimensional array; can be a VARCHAR or LONG VARCHAR column, a literal string, or the string output of an expression.
Spaces in the string are removed unless elements are individually quoted. For example, ' a,b,c' is equivalent to 'a,b,c'. To preserve the space, use '" a","b","c"'.

Parameters

These parameters behave the same way as the corresponding options when loading delimited data (see DELIMITED).

No parameter may have the same value as any other parameter.

collection_delimiter: The character or character sequence used to separate array elements (VARCHAR(8)). You can use any ASCII values in the range E'\000' to E'\177', inclusive.
Default: Comma (',').
collection_open, collection_close: The characters that mark the beginning and end of the array (VARCHAR(8)). It is an error to use these characters elsewhere within the list of elements without escaping them. These characters can be omitted from the input string.
Default: Square brackets ('[' and ']').
collection_null_element: The string representing a null element value (VARCHAR(65000)). You can specify a null value using any ASCII values in the range E'\001' to E'\177' inclusive (any ASCII value except NULL: E'\000').
Default: 'null'
collection_enclose: An optional quote character within which to enclose individual elements, allowing delimiter characters to be embedded in string values. You can choose any ASCII value in the range E'\001' to E'\177' inclusive (any ASCII character except NULL: E'\000'). Elements do not need to be enclosed by this value.
Default: double quote ('"')

Examples

The function uses comma as the default delimiter. You can specify a different value:

=> SELECT STRING_TO_ARRAY('[1,3,5]');
 STRING_TO_ARRAY
-----------------
 ["1","3","5"]
(1 row)

=> SELECT STRING_TO_ARRAY('[t|t|f|t]' USING PARAMETERS collection_delimiter = '|');
  STRING_TO_ARRAY
-------------------
 ["t","t","f","t"]
(1 row)

The bounding brackets are optional:

=> SELECT STRING_TO_ARRAY('t|t|f|t' USING PARAMETERS collection_delimiter = '|');
  STRING_TO_ARRAY
-------------------
 ["t","t","f","t"]
(1 row)

The input can use other characters for open and close:

=> SELECT STRING_TO_ARRAY('{NASA-1683,NASA-7867,SPX-76}' USING PARAMETERS collection_open = '{', collection_close = '}');
          STRING_TO_ARRAY
------------------------------------
 ["NASA-1683","NASA-7867","SPX-76"]
(1 row)

By default the string 'null' in input is treated as a null value:

=> SELECT STRING_TO_ARRAY('{"us-1672",null,"darpa-1963"}' USING PARAMETERS collection_open = '{', collection_close = '}');
        STRING_TO_ARRAY
-------------------------------
 ["us-1672",null,"darpa-1963"]
(1 row)

In the following example, the input comes from a column:

=> SELECT STRING_TO_ARRAY(name USING PARAMETERS collection_delimiter=' ') FROM employees;
    STRING_TO_ARRAY
-----------------------
 ["Howard","Wolowitz"]
 ["Sheldon","Cooper"]
(2 rows)

17 - TO_JSON

Returns the JSON representation of a complex-type argument, including mixed and nested complex types.

Returns the JSON representation of a complex-type argument, including mixed and nested complex types. This is the same format that queries of complex-type columns return.

Behavior

Immutable

Syntax

TO_JSON(value)

Arguments

value: Column or literal of a complex type

Examples

These examples query the following table:

=> SELECT name, contact FROM customers;
        name        |                                                        contact
--------------------+-----------------------------------------------------------------------------------------------------------------------
Missy Cooper       | {"street":"911 San Marcos St","city":"Austin","zipcode":73344,"email":["missy@mit.edu","mcooper@cern.gov"]}
Sheldon Cooper     | {"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":91001,"email":["shelly@meemaw.name","cooper@caltech.edu"]}
Leonard Hofstadter | {"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":91001,"email":["hofstadter@caltech.edu"]}
Leslie Winkle      | {"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":91001,"email":[]}
Raj Koothrappali   | {"street":null,"city":"Pasadena","zipcode":91001,"email":["raj@available.com"]}
Stuart Bloom       |
(6 rows)

You can call TO_JSON on a column or on specific fields or array elements:

=> SELECT TO_JSON(contact) FROM customers;
    to_json
-----------------------------------------------------------------------------------------------------------------------
{"street":"911 San Marcos St","city":"Austin","zipcode":73344,"email":["missy@mit.edu","mcooper@cern.gov"]}
{"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":91001,"email":["shelly@meemaw.name","cooper@caltech.edu"]}
{"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":91001,"email":["hofstadter@caltech.edu"]}
{"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":91001,"email":[]}
{"street":null,"city":"Pasadena","zipcode":91001,"email":["raj@available.com"]}

(6 rows)

=> SELECT TO_JSON(contact.email) FROM customers;
    to_json
---------------------------------------------
["missy@mit.edu","mcooper@cern.gov"]
["shelly@meemaw.name","cooper@caltech.edu"]
["hofstadter@caltech.edu"]
[]
["raj@available.com"]

(6 rows)

When calling TO_JSON with a SET, note that duplicates are removed and elements can be reordered:

=> SELECT TO_JSON(SET[1683,7867,76,76]);
    TO_JSON
----------------
[76,1683,7867]
(1 row)

18 - UNNEST

Expands the elements of one or more collection columns (ARRAY or SET) into individual rows.

Expands the elements of one or more collection columns (ARRAY or SET) into individual rows. If called with a single array, UNNEST returns the elements in a column named value. If called with two or more arrays, it returns columns named val_column-name. You can use an AS clause in the SELECT to change these names.

UNNEST and EXPLODE both expand collections. They have the following differences:

By default, UNNEST expands all passed collections and EXPLODE expands only the first. See the explode_count and explode_all parameters.
By default, UNNEST returns only the elements and EXPLODE also returns their positions in an index column. See the with_offsets parameter.
By default, UNNEST does not partition its input and ignores an OVER() clause if present. See the skip_partitioning parameter.

Behavior type

Immutable

Syntax

UNNEST (column[,...])
 [USING PARAMETERS param=value])
[ OVER ( [window-partition-clause

Arguments

column: Column in the table being queried. If explode_all is false, you must specify at least as many collection columns as the value of the explode_count parameter. Columns that are not collections are passed through without modification.
Passthrough columns are not needed if skip_partitioning is true.
OVER(...): How to partition and sort input data. The input data is the result set that the query returns after it evaluates FROM, WHERE, GROUP BY, and HAVING clauses.
This clause only applies if skip_partitioning is false.

Parameters

explode_all (BOOLEAN)

If true, explode all collection columns. When explode_all is true, passthrough columns are not permitted.

Default: true

explode_count (INT)

If explode_all is true, you cannot specify explode_count.

Default: 1

skip_partitioning (BOOLEAN)

Whether to skip partitioning and ignore the OVER clause if present. UNNEST translates a single row of input into multiple rows of output, one per collection element. There is, therefore, usually no benefit to partitioning the input first. Skipping partitioning can help a query avoid an expensive sort or merge operation.

Default: true

with_offset (BOOLEAN)

Whether to return the index of each element as an additional column.

Default: false

Null-handling

This function expands each element in a collection into a row, including null elements. If the input column is NULL or an empty collection, the function produces no rows for that column:

=> SELECT UNNEST(ARRAY[1,2,null,4]) OVER();
 value
-------
     1
     2

     4
(4 rows)

=> SELECT UNNEST(ARRAY[]::ARRAY[INT]) OVER();
 value
-------
(0 rows)

=> SELECT UNNEST(NULL::ARRAY[INT]) OVER();
 value
-------
(0 rows)

Joining on results

You can use the output of this function as if it were a relation by using CROSS JOIN or LEFT JOIN LATERAL in a query. Other JOIN types are not supported.

Consider the following table of students and exam scores:

=> SELECT * FROM tests;
 student |    scores     |    questions
---------+---------------+-----------------
 Bob     | [92,78,79]    | [20,20,100]
 Lee     |               |
 Pat     | []            | []
 Sam     | [97,98,85]    | [20,20,100]
 Tom     | [68,75,82,91] | [20,20,100,100]
(5 rows)

The following query finds the best test scores across all students who have scores:

=> SELECT student, score FROM tests
CROSS JOIN UNNEST(scores) AS t (score)
ORDER BY score DESC;
 student | score
---------+-------
 Sam     |    98
 Sam     |    97
 Bob     |    92
 Tom     |    91
 Sam     |    85
 Tom     |    82
 Bob     |    79
 Bob     |    78
 Tom     |    75
 Tom     |    68
(10 rows)

The following query returns maximum and average per-question scores, considering both the exam score and the number of questions:

=> SELECT student, MAX(score/qcount), AVG(score/qcount) FROM tests
CROSS JOIN UNNEST(scores, questions) AS t(score, qcount)
GROUP BY student;
 student |         MAX          |       AVG
---------+----------------------+------------------
 Bob     | 4.600000000000000000 | 3.04333333333333
 Sam     | 4.900000000000000000 | 3.42222222222222
 Tom     | 4.550000000000000000 |             2.37
(3 rows)

These queries produce results for three of the five students. One student has a null value for scores and another has an empty array. These rows are not included in the function's output.

To include null and empty arrays in output, use LEFT JOIN LATERAL instead of CROSS JOIN:

=> SELECT student, MIN(score), AVG(score) FROM tests
LEFT JOIN LATERAL UNNEST(scores) AS t (score)
GROUP BY student;
 student | MIN |       AVG
---------+-----+------------------
 Bob     |  78 |               83
 Lee     |     |
 Pat     |     |
 Sam     |  85 | 93.3333333333333
 Tom     |  68 |               79
(5 rows)

The LATERAL keyword is required with LEFT JOIN. It is optional for CROSS JOIN.

Examples

Consider a table with the following definition:

=> CREATE TABLE orders (
        orderkey VARCHAR, custkey INT,
        prodkey ARRAY[VARCHAR], orderprices ARRAY[DECIMAL(12,2)],
        email_addrs ARRAY[VARCHAR]);

The following query expands one of the array columns. One of the elements is null:

=> SELECT UNNEST(orderprices) AS price, custkey, email_addrs
   FROM orders WHERE custkey='342845' ORDER BY price;
 price | custkey |       email_addrs
-------+---------+-------------------------
       |  342845 | ["br92@cs.example.edu"]
 12.00 |  342845 | ["br92@cs.example.edu"]
 22.00 |  342845 | ["br92@cs.example.edu"]
 35.00 |  342845 | ["br92@cs.example.edu"]
(4 rows)

UNNEST can expand more than one column:

=> SELECT orderkey, UNNEST(prodkey, orderprices)
   FROM orders WHERE orderkey='113-341987';
  orderkey  | val_prodkey | val_orderprices
------------+-------------+-----------------
 113-341987 | MG-7190     |           60.00
 113-341987 | MG-7190     |           67.00
 113-341987 | MG-7190     |           22.00
 113-341987 | MG-7190     |           14.99
 113-341987 | VA-4028     |           60.00
 113-341987 | VA-4028     |           67.00
 113-341987 | VA-4028     |           22.00
 113-341987 | VA-4028     |           14.99
 113-341987 | EH-1247     |           60.00
 113-341987 | EH-1247     |           67.00
 113-341987 | EH-1247     |           22.00
 113-341987 | EH-1247     |           14.99
 113-341987 | MS-7018     |           60.00
 113-341987 | MS-7018     |           67.00
 113-341987 | MS-7018     |           22.00
 113-341987 | MS-7018     |           14.99
(16 rows)

Collection functions

Notes

1 - APPLY_AVG

Behavior type

Syntax

Arguments

Null-handling

Examples

See also

2 - APPLY_COUNT (ARRAY_COUNT)

Behavior type

Syntax

Arguments

Null-handling

Examples

3 - APPLY_COUNT_ELEMENTS (ARRAY_LENGTH)

Behavior type

Syntax

Arguments

Null-handling

Examples

4 - APPLY_MAX

Behavior type

Syntax

Arguments

Null-handling

Examples

5 - APPLY_MIN

Behavior type

Syntax

Arguments

Null-handling

Examples

6 - APPLY_SUM

Behavior type

Syntax

Arguments

Null-handling

Examples

See also

7 - ARRAY_CAT

Behavior type

Syntax

Arguments

Null-handling

Examples

8 - ARRAY_CONTAINS

Deprecated

9 - ARRAY_DIMS

Behavior type

Syntax

Arguments

Examples

10 - ARRAY_FIND

Behavior type

Syntax

Arguments

Examples

Lambdas

See also

11 - CONTAINS

Behavior type

Syntax

Arguments

Examples

Lambdas

See also

12 - EXPLODE

Behavior type

Syntax

Arguments

Parameters

Null-handling

Joining on results

Examples

13 - FILTER

Behavior type

Syntax

Arguments

Examples