Structs

Columns can contain structs, which store property-value pairs.

Columns can contain structs, which store property-value pairs. For example, a struct representing an address could have strings for the street address and city/state and an integer for the postal code:

{ "street":"150 Cambridgepark Dr.",
  "city":"Cambridge MA",
  "postalcode":02140}

Struct fields can be primitive types or other complex types.

Use the ROW expression to define a struct column. In the following example, the data has columns for customer name, address, and account number, and the address is a struct in the data. The types you declare in Vertica must be compatible with the types in the data you load into them.

=> CREATE TABLE customers (
    name VARCHAR,
    address ROW(street VARCHAR, city VARCHAR, zipcode INT),
    accountID INT);

Within the ROW, you specify the fields and their data types using the same syntax as for columns. Vertica treats the ROW as a single column for purposes of queries.

Structs can contain other structs. In the following example, employees have various personal information, including an address which is itself a struct.

=> CREATE TABLE employees(
    employeeID INT,
    personal ROW(
      name VARCHAR,
      address ROW(street VARCHAR, city VARCHAR, zipcode INT),
      taxID INT),
    department VARCHAR);

Structs can contain arrays of primitive types, arrays, or structs.

=> CREATE TABLE customers(
  name VARCHAR,
  contact ROW(
    street VARCHAR,
    city VARCHAR,
    zipcode INT,
    email ARRAY[VARCHAR]
  ),
  accountid INT );

When defining an external table with Parquet or ORC data, Vertica requires the definition of the table to match the schema of the data. For example, with the data used in the previous employees example, the following definition is an error:

=> CREATE EXTERNAL TABLE employees(
    employeeID INT,
    personal ROW(
      name VARCHAR,
      address ROW(street VARCHAR, city VARCHAR),
      zipcode INT,
      taxID INT),
    department VARCHAR)
  AS COPY FROM '...' PARQUET;
ERROR 9151: Datatype mismatch [...]

The data contains an address struct with three fields (street, city, zipcode), so the external table must also use a ROW with three fields. Changing the ROW to have two fields and promoting one of the fields to the parent ROW is a mismatch. Each ROW must match and, if structs are nested in the data, the complete structure must match.

For native tables, you can specify which columns to load from the data, so you do not need to account for all of them. For the columns you load, the definition of the table must match the schema in the data file. Some parsers report fields found in that data that are not part of the table definition.

Handling nulls

If a struct exists but a field value is null, Vertica assigns NULL as its value in the ROW. A struct where all fields are null is treated as a ROW with null fields. If the struct itself is null, Vertica reads the ROW as NULL.

Queries

See Rows (structs).

Restrictions

ROW columns have several restrictions:

  • Maximum nesting depth is 100.
  • Vertica tables support up to 9800 columns and fields. The ROW itself is not counted, only its fields.
  • ROW columns cannot use any constraints (such as NOT NULL) or defaults.
  • ROW fields cannot be auto_increment or setof.
  • ROW definition must include at least one field.
  • Row is a reserved keyword within a ROW definition, but is permitted as the name of a table or column.
  • Tables containing ROW columns cannot also contain IDENTITY, default, SET USING, or named sequence columns.