This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Table functions

This section contains functions for managing tables and constraints.

1: ANALYZE_CONSTRAINTS
2: ANALYZE_CORRELATIONS
3: COPY_TABLE
4: DISABLE_DUPLICATE_KEY_ERROR
5: INFER_EXTERNAL_TABLE_DDL
6: INFER_TABLE_DDL
7: LAST_INSERT_ID
8: PURGE_TABLE
9: REBALANCE_TABLE
10: REENABLE_DUPLICATE_KEY_ERROR

This section contains functions for managing tables and constraints.

See also the V_CATALOG.TABLE_CONSTRAINTS system table.

1 - ANALYZE_CONSTRAINTS

Analyzes and reports on constraint violations within the specified scope.

Analyzes and reports on constraint violations within the specified scope

You can enable automatic enforcement of primary key, unique key, and check constraints when INSERT, UPDATE, MERGE, or COPY statements execute. Alternatively, you can use ANALYZE_CONSTRAINTS to validate constraints after issuing these statements. Refer to Constraint enforcement for more information.

ANALYZE_CONSTRAINTS performs a lock in the same way that SELECT * FROM t1 holds a lock on table t1. See LOCKS for additional information.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

ANALYZE_CONSTRAINTS ('[[[database.]schema.]table ]' [, 'column[,...]'] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: Identifies the table to analyze. If you omit specifying a schema, Vertica uses the current schema search path. If set to an empty string, Vertica analyzes all tables in the current schema.
column: The column in table to analyze. You can specify multiple comma-delimited columns. Vertica narrows the scope of the analysis to the specified columns. If you omit specifying a column, Vertica analyzes all columns in table.

Privileges

Schema: USAGE
Table: SELECT

Detecting constraint violations during a load process

Vertica checks for constraint violations when queries are run, not when data is loaded. To detect constraint violations as part of the load process, use a COPY statement with the NO COMMIT option. By loading data without committing it, you can run a post-load check of your data using the ANALYZE_CONSTRAINTS function. If the function finds constraint violations, you can roll back the load because you have not committed it.

If ANALYZE_CONSTRAINTS finds violations, such as when you insert a duplicate value into a primary key, you can correct errors using the following functions. Effects last until the end of the session only:

Important

If a check constraint SQL expression evaluates to an unknown for a given row because a column within the expression contains a null, the row passes the constraint condition.

Return values

ANALYZE_CONSTRAINTS returns results in a structured set (see table below) that lists the schema name, table name, column name, constraint name, constraint type, and the column values that caused the violation.

If the result set is empty, then no constraint violations exist; for example:

> SELECT ANALYZE_CONSTRAINTS ('public.product_dimension', 'product_key');
Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
(0 rows)

The following result set shows a primary key violation, along with the value that caused the violation ('10'):

=> SELECT ANALYZE_CONSTRAINTS ('');
Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
store         t1           c1             pk_t1             PRIMARY           ('10')
(1 row)

The result set columns are described in further detail in the following table:

Column Name	Data Type	Description
`Schema Name`	VARCHAR	The name of the schema.
`Table Name`	VARCHAR	The name of the table, if specified.
`Column Names`	VARCHAR	A list of comma-delimited columns that contain constraints.
`Constraint Name`	VARCHAR	The given name of the primary key, foreign key, unique, check, or not null constraint, if specified.
`Constraint Type`	VARCHAR	Identified by one of the following strings: `PRIMARY KEY` `FOREIGN KEY` `UNIQUE` `CHECK` `NOT NULL`
`Column Values`	VARCHAR	Value of the constraint column, in the same order in which `Column Names` contains the value of that column in the violating row. When interpreted as SQL, the value of this column forms a list of values of the same type as the columns in `Column Names`; for example: `('1'),` `('1', 'z')`

Examples

See Detecting constraint violations.

2 - ANALYZE_CORRELATIONS

This function is deprecated and will be removed in a future release.

Deprecated

This function is deprecated and will be removed in a future release.

Analyzes the specified tables for pairs of columns that are strongly correlated. ANALYZE_CORRELATIONS stores the 20 pairs with the strongest correlation. ANALYZE_CORRELATIONS also analyzes statistics.

ANALYZE_CORRELATIONS analyzes only pairwise single-column correlations.

For example, state name and country name columns are strongly correlated because the city name usually, but perhaps not always, identifies the state name. The city of Conshohoken is uniquely associated with Pennsylvania, while the city of Boston exists in Georgia, Indiana, Kentucky, New York, Virginia, and Massachusetts. In this case, city name is strongly correlated with state name.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Stable

Syntax

ANALYZE_CORRELATIONS ('[[[database.]schema.]table ]' [, 'recalculate'] )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: Identifies the table to analyze. If you omit specifying a schema, Vertica uses the current schema search path. If set to an empty string, Vertica analyzes all tables in the current schema.
recalculate: Boolean that specifies whether to analyze correlated columns that were previously analyzed.

Note
Column correlation analysis typically needs to be done only once.

Default:false

Privileges

One of the following:

Superuser
User with USAGE privilege on the design schema

Examples

In the following example, ANALYZE_CORRELATIONS analyzes column correlations for all tables in the public schema, even if they currently exist:

=> SELECT ANALYZE_CORRELATIONS ('public.*', 'true');
 ANALYZE_CORRELATIONS
----------------------
                    0
(1 row)

3 - COPY_TABLE

Copies one table to another.

Copies one table to another. This lightweight, in-memory function copies the DDL and all user-created projections from the source table. Projection statistics for the source table are also copied. Thus, the source and target tables initially have identical definitions and share the same storage.

Note

Although they share storage space, Vertica regards the tables as discrete objects for license capacity purposes. For example, a single-terabyte table and its copy initially consume only one TB of space. However, your Vertica license regards them as separate objects that consume two TB of space.

After the copy operation is complete, the source and copy tables are independent of each other, so you can perform DML operations on one table without impacting the other. These operations can increase the overall storage required for both tables.

Caution

If you create multiple copies of the same table concurrently, one or more of the copy operations is liable to fail. Instead, copy tables sequentially.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

COPY_TABLE (
    '[[{namespace. | database. }]schema.]source-table',
    '[[{namespace. | database. }]schema.]target-table'
)

Parameters

{ namespace. | database. }

Name of the database or namespace that contains table:

Database name: If specified, it must be the current database. In Eon Mode databases, the database name is valid only if the object is in the default namespace.
Namespace name (Eon Mode only): You must specify the namespace of objects in non-default namespaces. If no namespace is provided, Vertica assumes the object is in the default namespace.

You cannot specify both a database and namespace name.

For Eon Mode databases, the namespaces of staging-table and target-table must have the same shard count.

schema

Name of the schema, by default public. If you specify the namespace or database name, you must provide the schema name, even if the schema is public.

source-table: The source table to copy. Vertica copies all data from this table to the target table.
target-table: The target table of the source table. If the target table already exists, Vertica appends the source to the existing table.
If the table does not exist, Vertica creates a table from the source table's definition, by calling CREATE TABLE with LIKE and INCLUDING PROJECTIONS clause. The new table inherits ownership from the source table. For details, see Replicating a table.

Privileges

Non-superuser:

Source table: SELECT
Target schema/table (new): CREATE
Target table (existing): INSERT

Table attribute requirements

The following attributes of both tables must be identical:

Column definitions, including NULL/NOT NULL constraints
Segmentation
Partitioning expression
Number of projections
Projection sort order
Primary and unique key constraints. However, the key constraints do not have to be identically enabled.

Note
If the target table has primary or unique key constraints enabled and moving the partitions will insert duplicate key values into the target table, Vertica rolls back the operation. Enforcing constraints requires disk reads and can slow the copy process.
Number and definitions of text indices.
If the destination table already exists, the source and destination tables must have identical access policies.

Additionally, If access policies exist on the source table, the following must be true:

Access policies on both tables must be identical.
One of the following must be true:
- The executing user owns the source table.
- AccessPolicyManagementSuperuserOnly is set to true. See Managing access policies for details.

Table restrictions

The following restrictions apply to the source and target tables:

If the source and target partitions are in different storage tiers, Vertica returns a warning but the operation proceeds. The partitions remain in their existing storage tier.
If the source table contains a sequence, Vertica converts the sequence to an integer before copying it to the target table. If the target table contains IDENTITY or named sequence columns, Vertica cancels the copy and displays an error message.
The following tables cannot be used as sources or targets:
- Temporary tables
- Virtual tables
- System tables
- External tables

Examples

If you call COPY_TABLE and the target table does not exist, the function creates the table automatically. In the following example, COPY_TABLE creates the target table public.newtable. Vertica also copies all the constraints associated with the source table public.product_dimension except foreign key constraints:

=> SELECT COPY_TABLE ( 'public.product_dimension', 'public.newtable');
-[ RECORD 1 ]--------------------------------------------------
copy_table | Created table public.newtable.
Copied table public.product_dimension to public.newtable

4 - DISABLE_DUPLICATE_KEY_ERROR

Disables error messaging when Vertica finds duplicate primary or unique key values at run time (for use with key constraints that are not automatically enabled).

Disables error messaging when Vertica finds duplicate primary or unique key values at run time (for use with key constraints that are not automatically enabled). Queries execute as though no constraints are defined on the schema. Effects are session scoped.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

DISABLE_DUPLICATE_KEY_ERROR();

Privileges

Superuser

Examples

When you call DISABLE_DUPLICATE_KEY_ERROR, Vertica issues warnings letting you know that duplicate values will be ignored, and incorrect results are possible. DISABLE_DUPLICATE_KEY_ERROR is for use only for key constraints that are not automatically enabled.

=> select DISABLE_DUPLICATE_KEY_ERROR();
WARNING 3152:  Duplicate values in columns marked as UNIQUE will now be ignored for the remainder of your session or until reenable_duplicate_key_error() is called
WARNING 3539:  Incorrect results are possible. Please contact Vertica Support if unsure
 disable_duplicate_key_error
------------------------------
 Duplicate key error disabled
(1 row)

5 - INFER_EXTERNAL_TABLE_DDL

This function is deprecated and will be removed in a future release.

Deprecated

This function is deprecated and will be removed in a future release. Instead, use INFER_TABLE_DDL.

Inspects a file in Parquet, ORC, or Avro format and returns a CREATE EXTERNAL TABLE AS COPY statement that can be used to read the file. This statement might be incomplete. It could also contain more columns or columns with longer names than what Vertica supports; this function does not enforce Vertica system limits. Always inspect the output and address any issues before using it to create a table.

This function supports partition columns for the Parquet, ORC, and Avro formats, inferred from the input path. Because partitioning is done through the directory structure, there might not be enough information to infer the type of partition columns. In this case, this function shows these columns with a data type of UNKNOWN and emits a warning.

The function handles most data types, including complex types. If an input type is not supported in Vertica, the function emits a warning.

By default, the function uses strong typing for complex types. You can instead treat the column as a flexible complex type by setting the vertica_type_for_complex_type parameter to LONG VARBINARY.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

INFER_EXTERNAL_TABLE_DDL( path USING PARAMETERS param=value[,...] )

Arguments

path: Path to a file or directory. Any path that is valid for COPY and uses a file format supported by this function is valid.

Parameters

format: Input format (string), one of 'Parquet', 'ORC', or 'Avro'. This parameter is required.
table_name: The name of the external table to create. This parameter is required.
Do not include a schema name as part of the table name; use the table_schema parameter.
table_schema: The schema in which to create the external table. If omitted, the function does not include a schema in the output.
vertica_type_for_complex_type: Type used to represent all columns of complex types, if you do not want to expand them fully. The only supported value is LONG VARBINARY. For more information, see Flexible complex types.

Privileges

Non-superuser: READ privileges on the USER-accessible storage location.

Examples

In the following example, the input file contains data for a table with two integer columns. The table definition can be fully inferred, and you can use the returned SQL statement as-is.

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/orders/*.orc'
        USING PARAMETERS format = 'orc', table_name = 'orders');

                INFER_EXTERNAL_TABLE_DDL
--------------------------------------------------------------------------------------------------
create external table "orders" (
  "id" int,
  "quantity" int
) as copy from '/data/orders/*.orc' orc;
(1 row)

To create a table in a schema, use the table_schema parameter. Do not add it to the table name; the function treats it as a name with a period in it, not a schema.

The following example shows output with complex types. You can use the definition as-is or modify the VARCHAR sizes:

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/people/*.parquet'
        USING PARAMETERS format = 'parquet', table_name = 'employees');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                    INFER_EXTERNAL_TABLE_DDL
-------------------------------------------------------------------------
 create external table "employees"(
  "employeeID" int,
  "personal" Row(
    "name" varchar,
    "address" Row(
      "street" varchar,
      "city" varchar,
      "zipcode" int
    ),
    "taxID" int
  ),
  "department" varchar
 ) as copy from '/data/people/*.parquet' parquet;
(1 row)

In the following example, the input file contains a map in the "prods" column. You can read a map as an array of rows:

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/orders.parquet'
    USING PARAMETERS format='parquet', table_name='orders');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                INFER_EXTERNAL_TABLE_DDL
------------------------------------------------------------------------
 create external table "orders"(
  "orderkey" int,
  "custkey" int,
  "prods" Array[Row(
    "key" varchar,
    "value" numeric(12,2)
  )],
  "orderdate" date
 ) as copy from '/data/orders.parquet' parquet;
(1 row)

In the following example, the data is partitioned by region. The function was not able to infer the data type and reports UNKNOWN:

=> SELECT INFER_EXTERNAL_TABLE_DDL('/data/sales/*/*
        USING PARAMETERS format = 'parquet', table_name = 'sales');
WARNING 9262: This generated statement is incomplete because of one or more unknown column types.
Fix these data types before creating the table
                INFER_EXTERNAL_TABLE_DDL
------------------------------------------------------------------------
 create external table "sales"(
  "tx_id" int,
  "date" date,
  "region" UNKNOWN
) as copy from '/data/sales/*/*' PARTITION COLUMNS region parquet;
(1 row)

For VARCHAR and VARBINARY columns, this function does not specify a length. The Vertica default length for these types is 80 bytes. If the data values are longer, using this table definition unmodified could cause data to be truncated. Always review VARCHAR and VARBINARY columns to determine if you need to specify a length. This function emits a warning if the input file contains columns of these types:

WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80

6 - INFER_TABLE_DDL

Inspects a file in Parquet, ORC, JSON, or Avro format and returns a CREATE TABLE or CREATE EXTERNAL TABLE statement based on its contents.

The returned statement might be incomplete if the input data contains ambiguous or unknown data types. It could also contain more columns or columns with longer names than what Vertica supports; this function does not enforce Vertica system limits. Always inspect the output and address any issues before using it to create a table.

This function supports partition columns, inferred from the input path. Because partitioning is done through the directory structure, there might not be enough information to infer the type of partition columns. In this case, this function shows these columns with a data type of UNKNOWN and emits a warning.

The function handles most data types, including complex types. If an input type is not supported in Vertica, the function emits a warning.

For VARCHAR and VARBINARY columns, this function does not specify a length. The Vertica default length for these types is 80 bytes. If the data values are longer, using the returned table definition unmodified could cause data to be truncated. Always review VARCHAR and VARBINARY columns to determine if you need to specify a length. This function emits a warning if the input file contains columns of these types:

WARNING 9311: This generated statement contains one or more varchar/varbinary columns which default to length 80

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

INFER_TABLE_DDL( path USING PARAMETERS param=value[,...] )

Arguments

path: Path to a file or glob. Any path that is valid for COPY and uses a file format supported by this function is valid. For all formats except JSON, if a glob specifies more than one file, this function reads a single, arbitrarily-chosen file. For JSON, the function might read more than one file. See JSON.

Parameters

format: Input format (string), one of 'Parquet', 'ORC', 'Avro', or 'JSON'. This parameter is required.
table_name: The name of the table to create. This parameter is required.
Do not include a schema name as part of the table name; use the table_schema parameter.
table_schema: The schema in which to create the table. If omitted, the function does not include a schema in the output.
table_type: The type of table to create, either 'native' or 'external'.
Default: 'native'
with_copy_statement: For native tables, whether to include a COPY statement in addition to the CREATE TABLE statement.
Default: false
one_line_result: Whether to return the DDL as a single line instead of pretty-printing. The single-line format might be easier to copy into SQL scripts.
Default: false (pretty-print)
max_files: (JSON only.) Maximum number of files in path to inspect, if path is a glob. Use this parameter to increase the amount of data the function considers, for example if you suspect variation among files. Files are chosen arbitrarily from the glob. For details, see JSON.
Default: 1
max_candidates: (JSON only.) Number of candidate table definitions to show. The function generates only one candidate per file, so if you increase max_candidates, also increase max_files. For details, see JSON.
Default: 1

Privileges

Non-superuser: READ privileges on the USER-accessible storage location.

JSON

JSON, unlike the other supported formats, does not embed a schema in data files. This function infers JSON table DDL by instead inspecting the raw data. Because raw data can be ambiguous or inconsistent, the function takes a different approach for this format.

For each input file, the function iterates through records to develop a candidate table definition. A top-level field that appears in any record is included as a column, even if not all records use it. If the same field appears in the file with different types, the function chooses a type that is consistent with all observed occurrences.

Consider a file with data about restaurants:

{
    "name" : "Pizza House",
    "cuisine" : "Italian",
    "location_city" : [],
    "chain" : true,
    "hours" : [],
    "menu" : [{"item" : "cheese pizza", "price" : 7.99},
              {"item" : "spinach pizza", "price" : 8.99},
              {"item" : "garlic bread", "price" : 4.99}]
}
{
    "name" : "Sushi World",
    "cuisine" : "Asian",
    "location_city" : ["Pittsburgh"],
    "chain" : false,
    "menu" : [{"item" : "maki platter", "price" : "21.95"},
              {"item" : "tuna roll", "price" : "4.95"}]
}

The first record contains two empty arrays, so there is not enough information to determine the element types. The second record has a string value for one of them, so the function can infer a type of VARCHAR for it. The other array element type remains unknown.

In the first record menu prices are numbers, but in the second they are strings. Both FLOAT and the string can be coerced to NUMERIC, so the function returns NUMERIC:

=> SELECT INFER_TABLE_DDL ('/data/restaurants.json'
    USING PARAMETERS table_name='restaurants', format='json');
WARNING 0:  This generated statement contains one or more varchar/varbinary types which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 Candidate matched 1 out of 1 total files:
  create table "restaurants"(
    "chain" bool,
    "cuisine" varchar,
    "hours" Array[UNKNWON],
    "location_city" Array[varchar],
    "menu" Array[Row(
      "item" varchar,
      "price" numeric
    )],
    "name" varchar
  );

(1 row)

All scalar types can be coerced to VARCHAR, so if a conflict cannot be resolved more specifically (as in the NUMERIC example), the function can still return a type. Complex types, however, cannot always be resolved in this way. In the following example, records in a file have conflicting definitions of the hours field:

{
    "name" : "Sushi World",
    "cuisine" : "Asian",
    "location_city" : ["Pittsburgh"],
    "chain" : false,
    "hours" : {"open" : "11:00", "close" : "22:00" }
}
{
    "name" : "Greasy Spoon",
    "cuisine" : "American",
    "location_city" : [],
    "chain" : "false",
    "hours" : {"open" : ["11:00","12:00"], "close" : ["21:00","22:00"] },
}

In the first record the value is a ROW with two TIME fields. In the second record the value is a ROW with two ARRAY[TIME] fields (representing weekday and weekend hours). These types are incompatible, so the function suggests a flexible complex type by using LONG VARBINARY:

=> SELECT INFER_TABLE_DDL ('/data/restaurants.json'
    USING PARAMETERS table_name='restaurants', format='json');
WARNING 0:  This generated statement contains one or more varchar/varbinary types which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 Candidate matched 1 out of 1 total files:
  create table "restaurants"(
    "chain" bool,
    "cuisine" varchar,
    "hours" long varbinary,
    "location_city" Array[varchar],
    "name" varchar
  );

(1 row)

If you call the function with a glob, by default it reads one file. Set max_files to a higher number to inspect more data. The function calculates one candidate table definition per file and returns the definition that covers the largest number of files.

Increasing the number of files does not, by itself, increase the number of candidates the function returns. With more files the function can consider more candidates, but by default it returns the single candidate that represents the largest number of files. To see more than one possible table definition, also set max_candidates. There is no benefit to setting max_candidates to a larger number than max_files.

In the following example, the glob contains two files that differ in the structure of the menu column. In the first file, the menu field has two fields:

{
    "name" : "Bob's pizzeria",
    "cuisine" : "Italian",
    "location_city" : ["Cambridge", "Pittsburgh"],
    "menu" : [{"item" : "cheese pizza", "price" : 8.25},
              {"item" : "spinach pizza", "price" : 10.50}]
}

In the second file, the menu has different offerings at different times of day:

{
    "name" : "Greasy Spoon",
    "cuisine" : "American",
    "location_city" : [],
    "menu" : [{"time" : "breakfast",
           "items" :
           [{"item" : "scrambled eggs", "price" : "3.99"}]
          },
          {"time" : "lunch",
           "items" :
           [{"item" : "grilled cheese", "price" : "3.95"},
        {"item" : "tuna melt", "price" : "5.95"},
        {"item" : "french fries", "price" : "1.99"}]}]
}

To see both candidates, raise both max_files and max_candidates:

=> SELECT INFER_TABLE_DDL ('/data/*.json'
    USING PARAMETERS table_name='restaurants', format='json',
max_files=3, max_candidates=3);
WARNING 0:  This generated statement contains one or more float types which might lose precision
WARNING 0:  This generated statement contains one or more varchar/varbinary types which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 Candidate matched 1 out of 2 total files:
  create table "restaurants"(
    "cuisine" varchar,
    "location_city" Array[varchar],
    "menu" Array[Row(
      "item" varchar,
      "price" float
    )],
    "name" varchar
  );
Candidate matched 1 out of 2 total files:
  create table "restaurants"(
    "cuisine" varchar,
    "location_city" Array[varchar],
    "menu" Array[Row(
      "items" Array[Row(
        "item" varchar,
        "price" numeric
      )],
      "time" varchar
    )],
    "name" varchar
  );

(1 row)

Examples

In the following example, the input path contains data for a table with two integer columns. The external table definition can be fully inferred, and you can use the returned SQL statement as-is. The function reads one file from the input path:

=> SELECT INFER_TABLE_DDL('/data/orders/*.orc'
    USING PARAMETERS format = 'orc', table_name = 'orders', table_type = 'external');

                INFER_TABLE_DDL
------------------------------------------------------------------------
create external table "orders" (
  "id" int,
  "quantity" int
) as copy from '/data/orders/*.orc' orc;
(1 row)

To create a table in a schema, use the table_schema parameter. Do not add it to the table name; the function treats it as a name with a period in it, not a schema.

The following example shows output with complex types. You can use the definition as-is or modify the VARCHAR sizes:

=> SELECT INFER_TABLE_DDL('/data/people/*.parquet'
    USING PARAMETERS format = 'parquet', table_name = 'employees');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 create table "employees"(
  "employeeID" int,
  "personal" Row(
    "name" varchar,
    "address" Row(
      "street" varchar,
      "city" varchar,
      "zipcode" int
    ),
    "taxID" int
  ),
  "department" varchar
 );
(1 row)

In the following example, the input file contains a map in the "prods" column. You can read a map as an array of rows:

=> SELECT INFER_TABLE_DDL('/data/orders.parquet'
    USING PARAMETERS format='parquet', table_name='orders');
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80
                INFER_TABLE_DDL
------------------------------------------------------------------------
 create table "orders"(
  "orderkey" int,
  "custkey" int,
  "prods" Array[Row(
    "key" varchar,
    "value" numeric(12,2)
  )],
  "orderdate" date
 );
(1 row)

The following example returns the definition of a native table and the COPY statement, putting the table definition on a single line to simplify cutting and pasting into a script:

=> SELECT INFER_TABLE_DDL('/data/orders/*.orc'
    USING PARAMETERS format = 'orc', table_name = 'orders',
                         table_type = 'native', with_copy_statement = true, one_line_result=true);

                INFER_TABLE_DDL
-----------------------------------------------------------------------
create table "orders" ("id" int, "quantity" int);
copy "orders" from '/data/orders/*.orc' orc;
(1 row)

In the following example, the data is partitioned by region. The function was not able to infer the data type and reports UNKNOWN:

=> SELECT INFER_TABLE_DDL('/data/sales/*/*
    USING PARAMETERS format = 'orc', table_name = 'sales', table_type = 'external');
WARNING 9262:  This generated statement is incomplete because of one or more unknown column types. Fix these data types before creating the table
WARNING 9311:  This generated statement contains one or more varchar/varbinary columns which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 create external table "sales"(
  "orderkey" int,
  "custkey" int,
  "prodkey" Array[varchar],
  "orderprices" Array[numeric(12,2)],
  "orderdate" date,
  "region" UNKNOWN
 ) as copy from '/data/sales/*/*' PARTITION COLUMNS region orc;
(1 row)

In the following example, the function reads multiple JSON files and they differ in how they represent the menu column:

=> SELECT INFER_TABLE_DDL ('/data/*.json'
    USING PARAMETERS table_name='restaurants', format='json',
max_files=3, max_candidates=3);
WARNING 0:  This generated statement contains one or more float types which might lose precision
WARNING 0:  This generated statement contains one or more varchar/varbinary types which default to length 80

                INFER_TABLE_DDL
------------------------------------------------------------------------
 Candidate matched 1 out of 2 total files:
  create table "restaurants"(
    "cuisine" varchar,
    "location_city" Array[varchar],
    "menu" Array[Row(
      "item" varchar,
      "price" float
    )],
    "name" varchar
  );
Candidate matched 1 out of 2 total files:
  create table "restaurants"(
    "cuisine" varchar,
    "location_city" Array[varchar],
    "menu" Array[Row(
      "items" Array[Row(
        "item" varchar,
        "price" numeric
      )],
      "time" varchar
    )],
    "name" varchar
  );

(1 row)

7 - LAST_INSERT_ID

Returns the last value of an IDENTITY column.

Returns the last value of an IDENTITY column. If multiple sessions concurrently load the same table with an IDENTITY column, the function returns the last value generated for that column.

Note

This function works only with IDENTITY columns. It does not work with named sequences.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

LAST_INSERT_ID()

Privileges

Table owner
USAGE privileges on the table schema

Examples

See IDENTITY sequences.

8 - PURGE_TABLE

This function was formerly named PURGE_TABLE_PROJECTIONS().

Note

This function was formerly named PURGE_TABLE_PROJECTIONS(). Vertica still supports the former function name.

Permanently removes deleted data from physical storage so disk space can be reused. You can purge historical data up to and including the Ancient History Mark epoch.

Purges all projections of the specified table. You cannot use this function to purge temporary tables.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

PURGE_TABLE ( '[[database.]schema.]table' )

Parameters

[database.]schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table: The table to purge.

Privileges

Table owner
USAGE privilege on schema

Caution

PURGE_TABLE could temporarily take up significant disk space while the data is being purged.

Examples

The following example purges all projections for the store sales fact table located in the Vmart schema:

=> SELECT PURGE_TABLE('store.store_sales_fact');

9 - REBALANCE_TABLE

Synchronously rebalances data in the specified table.

A rebalance operation performs the following tasks:

Distributes data based on:
- User-defined fault groups, if specified
- Large cluster automatic fault groups
Redistributes database projection data across all nodes.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

REBALANCE_TABLE('[[database.]schema.]table-name')

Parameters

schema: Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name: The table to rebalance.

Privileges

Superuser

When to rebalance

Rebalancing is useful or even necessary after you perform the following tasks:

Mark one or more nodes as ephemeral in preparation of removing them from the cluster.
Add one or more nodes to the cluster so that Vertica can populate the empty nodes with data.
Change the scaling factor of an elastic cluster, which determines the number of storage containers used to store a projection across the database.
Set the control node size or realign control nodes on a large cluster layout
Add nodes to or remove nodes from a fault group.

Tip

By default, before performing a rebalance, Vertica queries system tables to compute the size of all projections involved in the rebalance task. This query can add significant overhead to the rebalance operation. To disable this query, set projection configuration parameter RebalanceQueryStorageContainers to 0.

Examples

The following command shows how to rebalance data on the specified table.

=> SELECT REBALANCE_TABLE('online_sales.online_sales_fact');
REBALANCE_TABLE
-------------------
 REBALANCED
(1 row)

10 - REENABLE_DUPLICATE_KEY_ERROR

Restores the default behavior of error reporting by reversing the effects of DISABLE_DUPLICATE_KEY_ERROR.

Restores the default behavior of error reporting by reversing the effects of DISABLE_DUPLICATE_KEY_ERROR. Effects are session-scoped.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

REENABLE_DUPLICATE_KEY_ERROR();

Privileges

Superuser

Examples

=> SELECT REENABLE_DUPLICATE_KEY_ERROR();
 REENABLE_DUPLICATE_KEY_ERROR
------------------------------
 Duplicate key error enabled
(1 row)