EXPLODE

Expands the elements of one or more collection columns (ARRAY or SET) into individual table rows, one row per element.

Expands the elements of one or more collection columns (ARRAY or SET) into individual table rows, one row per element. For each exploded collection, the results include two columns, one for the element index, and one for the value at that position. If the function explodes a single collection, these columns are named position and value by default. If the function explodes two or more collections, the columns for each collection are named pos_column-name and val_column-name. You can use an AS clause in the SELECT to change these column names.

EXPLODE is similar to UNNEST, which returns values but not positions.

By default, EXPLODE requires an OVER clause. If you set the skip_partitioning parameter to true, an OVER clause is not required and is ignored if present.

Behavior type

Immutable

Syntax

EXPLODE (column[,...] [USING PARAMETERS param=value])
[ OVER ( [window-partition-clause] ) ]

Arguments

column
Column in the table being queried. You must specify at least as many collection columns as the value of the explode_count parameter. Columns that are not collections are passed through without modification.

Passthrough columns are not needed if skip_partitioning is true.

OVER(...)
How to partition and sort input data. The input data is the result set that the query returns after it evaluates FROM, WHERE, GROUP BY, and HAVING clauses. For EXPLODE, use OVER() or OVER(PARTITION BEST).

This clause is ignored if skip_partitioning is true.

Parameters

explode_count
The number of collection columns to explode. The function checks each column, up to this value, and either explodes it if is a collection or passes it through if it is not a collection or if this limit has been reached. If the value of explode_count is greater than the number of collection columns specified, the function returns an error.

Default: 1

skip_partitioning
Whether to skip partitioning and ignore the OVER clause if present. EXPLODE translates a single row of input into multiple rows of output, one per collection element. There is, therefore, usually no benefit to partitioning the input first. Skipping partitioning can help a query avoid an expensive sort or merge operation. Even so, setting this parameter can negatively affect performance in rare cases.

Default: false

Null-handling

This function expands each element in a collection into a row, including null elements. If the input column is NULL or an empty collection, the function produces no rows for that column.

Joining on results

You can use the output of EXPLODE as if it were a relation by using CROSS JOIN EXPLODE in a query. This operation is not a full Cartesian product because it ignores null or empty arrays. CROSS JOIN EXPLODE requires that you set the skip_partitioning parameter, either in the EXPLODE call or as a session parameter.

Consider the following table of students and exam scores:

=> SELECT * FROM tests;
 student |    scores     |    questions
---------+---------------+-----------------
 Bob     | [92,78,79]    | [20,20,100]
 Sam     | [97,98,85]    | [20,20,100]
 Tom     | [68,75,82,91] | [20,20,100,100]
(3 rows)

The following query finds the best test scores across all students:

=> ALTER SESSION SET UDPARAMETER FOR ComplexTypesLib skip_partitioning = true;

=> SELECT student, score FROM tests
   CROSS JOIN EXPLODE(scores) AS t (pos, score)
   ORDER BY score DESC;
 student | score
---------+-------
 Sam     |    98
 Sam     |    97
 Bob     |    92
 Tom     |    91
 Sam     |    85
 Tom     |    82
 Bob     |    79
 Bob     |    78
 Tom     |    75
 Tom     |    68
(10 rows)

The following query returns maximum and average per-question scores, considering both the exam score and the number of questions:

=> SELECT student, MAX(score/qcount), AVG(score/qcount) FROM tests
   CROSS JOIN EXPLODE(scores, questions USING PARAMETERS explode_count=2)
      AS t(pos_s, score, pos_q, qcount)
   GROUP BY student;
 student |         MAX          |       AVG
---------+----------------------+------------------
 Bob     | 4.600000000000000000 | 3.04333333333333
 Sam     | 4.900000000000000000 | 3.42222222222222
 Tom     | 4.550000000000000000 |             2.37
(3 rows)

Examples

Consider an orders table with the following contents:

=> SELECT orderkey, custkey, prodkey, orderprices, email_addrs
   FROM orders LIMIT 5;
  orderkey  | custkey |                    prodkey                    |            orderprices            |                                                  email_addrs
------------+---------+-----------------------------------------------+-----------------------------------+----------------------------------------------------------------------------------------------------------------
 113-341987 |  342799 | ["MG-7190 ","VA-4028 ","EH-1247 ","MS-7018 "] | ["60.00","67.00","22.00","14.99"] | ["bob@example,com","robert.jones@example.com"]
 111-952000 |  342845 | ["ID-2586 ","IC-9010 ","MH-2401 ","JC-1905 "] | ["22.00","35.00",null,"12.00"]    | ["br92@cs.example.edu"]
 111-345634 |  342536 | ["RS-0731 ","SJ-2021 "]                       | ["50.00",null]                    | [null]
 113-965086 |  342176 | ["GW-1808 "]                                  | ["108.00"]                        | ["joe.smith@example.com"]
 111-335121 |  342321 | ["TF-3556 "]                                  | ["50.00"]                         | ["789123@example-isp.com","alexjohnson@example.com","monica@eng.example.com","sara@johnson.example.name",null]
(5 rows)

The following query explodes the order prices for a single customer. The other two columns are passed through and are repeated for each returned row:

=> SELECT EXPLODE(orderprices, custkey, email_addrs
                  USING PARAMETERS skip_partitioning=true)
            AS (position, orderprices, custkey, email_addrs)
   FROM orders WHERE custkey='342845' ORDER BY orderprices;
 position | orderprices | custkey |         email_addrs
----------+-------------+---------+------------------------------
        2 |             |  342845 | ["br92@cs.example.edu",null]
        3 |       12.00 |  342845 | ["br92@cs.example.edu",null]
        0 |       22.00 |  342845 | ["br92@cs.example.edu",null]
        1 |       35.00 |  342845 | ["br92@cs.example.edu",null]
(4 rows)

The previous example uses the skip_partitioning parameter. Instead of setting it for each call to EXPLODE, you can set it as a session parameter. EXPLODE is part of the ComplexTypesLib UDx library. The following example returns the same results:

=> ALTER SESSION SET UDPARAMETER FOR ComplexTypesLib skip_partitioning=true;

=> SELECT EXPLODE(orderprices, custkey, email_addrs)
            AS (position, orderprices, custkey, email_addrs)
   FROM orders WHERE custkey='342845' ORDER BY orderprices;

You can explode more than one column by specifying the explode_count parameter:

=> SELECT EXPLODE(orderkey, prodkey, orderprices
                  USING PARAMETERS explode_count=2, skip_partitioning=true)
          AS (orderkey,pk_idx,pk_val,ord_idx,ord_val)
   FROM orders
   WHERE orderkey='113-341987';
  orderkey  | pk_idx |  pk_val  | ord_idx | ord_val
------------+--------+----------+---------+---------
 113-341987 |      0 | MG-7190  |       0 |   60.00
 113-341987 |      0 | MG-7190  |       1 |   67.00
 113-341987 |      0 | MG-7190  |       2 |   22.00
 113-341987 |      0 | MG-7190  |       3 |   14.99
 113-341987 |      1 | VA-4028  |       0 |   60.00
 113-341987 |      1 | VA-4028  |       1 |   67.00
 113-341987 |      1 | VA-4028  |       2 |   22.00
 113-341987 |      1 | VA-4028  |       3 |   14.99
 113-341987 |      2 | EH-1247  |       0 |   60.00
 113-341987 |      2 | EH-1247  |       1 |   67.00
 113-341987 |      2 | EH-1247  |       2 |   22.00
 113-341987 |      2 | EH-1247  |       3 |   14.99
 113-341987 |      3 | MS-7018  |       0 |   60.00
 113-341987 |      3 | MS-7018  |       1 |   67.00
 113-341987 |      3 | MS-7018  |       2 |   22.00
 113-341987 |      3 | MS-7018  |       3 |   14.99
(16 rows)

The following example uses a multi-dimensional array:

=> SELECT name, pingtimes FROM network_tests;
 name |                       pingtimes
------+-------------------------------------------------------
 eng1 | [[24.24,25.27,27.16,24.97],[23.97,25.01,28.12,29.5]]
 eng2 | [[27.12,27.91,28.11,26.95],[29.01,28.99,30.11,31.56]]
 qa1  | [[23.15,25.11,24.63,23.91],[22.85,22.86,23.91,31.52]]
(3 rows)

=> SELECT EXPLODE(name, pingtimes USING PARAMETERS explode_count=1) OVER()
   FROM network_tests;
 name | position |           value
------+----------+---------------------------
 eng1 |        0 | [24.24,25.27,27.16,24.97]
 eng1 |        1 | [23.97,25.01,28.12,29.5]
 eng2 |        0 | [27.12,27.91,28.11,26.95]
 eng2 |        1 | [29.01,28.99,30.11,31.56]
 qa1  |        0 | [23.15,25.11,24.63,23.91]
 qa1  |        1 | [22.85,22.86,23.91,31.52]
(6 rows)

You can rewrite the previous query as follows to produce the same results:

=> SELECT name, EXPLODE(pingtimes USING PARAMETERS skip_partitioning=true)
   FROM network_tests;