SYNC_WITH_HCATALOG_SCHEMA_TABLE

Copies the structure of a single table in a Hive database schema available through the HCatalog Connector to a Vertica table.

Copies the structure of a single table in a Hive database schema available through the HCatalog Connector to a Vertica table.

This function can synchronize the HCatalog schema directly. In this case, call it with the same schema name for the vertica_schema and hcatalog_schema parameters. The function can also synchronize a different schema to the HCatalog schema.

This is a meta-function. You must call meta-functions in a top-level SELECT statement.

Behavior type

Volatile

Syntax

SYNC_WITH_HCATALOG_SCHEMA_TABLE( vertica_schema, hcatalog_schema, table_name )

Parameters

vertica_schema
The existing Vertica schema to store the copied HCatalog schema's metadata. This can be the same schema as hcatalog_schema, or it can be a separate one created with CREATE SCHEMA.
hcatalog_schema
The HCatalog schema to copy, created with CREATE HCATALOG SCHEMA.
table_name
The table in hcatalog_schema to copy. If table_name already exists in vertica_schema, the function overwrites it.

Privileges

Non-superuser: CREATE privileges on vertica_schema.

Users also require access to Hive data, one of the following:

  • USAGE permissions on hcat_schema, if Hive does not use an authorization service to manage access.

  • Permission through an authorization service (Sentry or Ranger), and access to the underlying files in HDFS. (Sentry can provide that access through ACL synchronization.)

  • dbadmin user privileges, with or without an authorization service.

Data type matching

Hive STRING and BINARY data types are matched, in Vertica, to the VARCHAR(65000) and VARBINARY(65000) types. Adjust the data types with ALTER TABLE as needed after creating the schema. The maximum size of a VARCHAR or VARBINARY in Vertica is 65000, but you can use LONG VARCHAR and LONG VARBINARY to specify larger values.

Hive and Vertica define string length in different ways. In Hive the length is the number of characters; in Vertica it is the number of bytes. Thus, a character encoding that uses more than one byte, such as Unicode, can cause mismatches between the two. To avoid data truncation, set values in Vertica based on bytes, not characters.

If data size exceeds the column size, Vertica logs an event at read time in the QUERY_EVENTS system table.

Examples

The following example uses SYNC_WITH_HCATALOG_SCHEMA_TABLE to synchronize the "nation" table:

=> CREATE SCHEMA 'hcat_local';
CREATE SCHEMA

=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' HCATALOG_SCHEMA='hcat'
   HCATALOG_USER='hcatuser';
CREATE SCHEMA

=> SELECT sync_with_hcatalog_schema_table('hcat_local', 'hcat', 'nation');
sync_with_hcatalog_schema_table
-----------------------------------------------------------------------------
    Schema hcat_local synchronized with hcat for table nation
    table nation is created in schema hcat_local
    (1 row)

The following example shows the behavior if the "nation" table already exists in the local schema:

=> SELECT sync_with_hcatalog_schema_table('hcat_local','hcat','nation');
sync_with_hcatalog_schema_table
-----------------------------------------------------------------------------
    Schema hcat_local synchronized with hcat for table nation
    table nation is altered in schema hcat_local
    (1 row)