Complex types

Tables can have columns of complex types, including nested complex types.

Tables can have columns of complex types, including nested complex types. You can use the ROW (struct), ARRAY, and SET types in native and external tables, including flex tables. Sets are limited to one-dimensional collections of scalar types. A limited MAP type is available for external tables, but you can use ARRAY and ROW to express a map instead. Selected parsers support loading data with complex types.

You can define a column for heterogeneous combinations of the ARRAY and ROW types: a struct containing array fields or an array of structs. These types can be nested up to the maximum nesting depth of 100.

Restrictions for native tables

Complex types used in native tables have some restrictions, in addition to the restrictions for individual types listed on their reference pages:

  • A native table must have at least one column that is a primitive type or a native array (one-dimensional array of a primitive type). If a flex table has real columns, it must also have at least one column satisfying this restriction.

  • Complex type columns cannot be used in ORDER BY or PARTITION BY clauses nor as FILLER columns.

  • Complex type columns cannot have constraints.

  • Expressions returning complex types cannot be used as projection columns, and projections cannot be segmented or ordered by columns of complex types.

  • Tables with columns of complex types cannot use DEFAULT and SET USING.

See CREATE TABLE and ALTER TABLE for additional restrictions.

Deriving a table definition from the data

You can use the INFER_TABLE_DDL function to inspect Parquet, ORC, or Avro data and produce a starting point for a table definition. This function returns a CREATE TABLE statement, which might require further editing. For columns where the function could not infer the data type, the function labels the type as unknown and emits a warning. For VARCHAR and VARBINARY columns, you might need to adjust the length. Always review the statement the function returns, but especially for tables with many columns, using this function can save time and effort.

In the following example, the data contains VARCHAR fields and columns and warns that you might need to adjust their lengths:

=> SELECT INFER_TABLE_DDL('/data/people/*.parquet'
   USING PARAMETERS format = 'parquet', table_name = 'employees', table_type = 'native');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                                 INFER_TABLE_DDL
------------------------------------------------------------------------------------------------------------------
 create table "employees"(
  "employeeID" int,
  "personal" Row(
    "name" varchar,
    "address" Row(
      "street" varchar,
      "city" varchar,
      "zipcode" int
    ),
    "taxID" int
  ),
  "department" varchar
 );
(1 row)

Alternative to strong typing

Though you can fully specify a column representing any combination of ROW and ARRAY types, there might be cases where you prefer a more flexible approach. If the data contains a struct with hundreds of fields, only a few of which you need, you might prefer to extract just those few at query time instead of defining all of the fields. Similarly, if the data structure is likely to change, you might prefer to defer fully specifying the complex types. You can use flexible columns as an alternative to fully specifying the structure of a complex column. This is the same approach used for flex tables, where all data is initially loaded into a single binary column and materialized from there as needed. See Flexible complex types for more information about using this approach.