This section contains functions to manage interactions with Hadoop.
Hadoop functions
- 1: CLEAR_HDFS_CACHES
- 2: EXTERNAL_CONFIG_CHECK
- 3: GET_METADATA
- 4: HADOOP_IMPERSONATION_CONFIG_CHECK
- 5: HASH_EXTERNAL_TOKEN
- 6: HCATALOGCONNECTOR_CONFIG_CHECK
- 7: HDFS_CLUSTER_CONFIG_CHECK
- 8: KERBEROS_HDFS_CONFIG_CHECK
- 9: SYNC_WITH_HCATALOG_SCHEMA
- 10: SYNC_WITH_HCATALOG_SCHEMA_TABLE
- 11: VERIFY_HADOOP_CONF_DIR
1 - CLEAR_HDFS_CACHES
Clears the configuration information copied from HDFS and any cached connections.
This function affects reads using the hdfs scheme in the following ways:
- This function flushes information loaded from configuration files copied from Hadoop (such as core-site.xml). These files are found on the path set by the HadoopConfDir configuration parameter.
- This function flushes information about which NameNode is active in a High Availability (HA) Hadoop cluster. Therefore, the first request to Hadoop after calling this function is slower than expected.
- Vertica maintains a cache of open connections to NameNodes to reduce latency. This function flushes that cache.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
CLEAR_HDFS_CACHES ( )
Privileges
Superuser
Examples
The following example clears the Hadoop configuration information:
=> SELECT CLEAR_HDFS_CACHES();
CLEAR_HDFS_CACHES
--------------
Cleared
(1 row)
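Because the function flushes both the copied configuration information and the HA NameNode state, a reasonable pattern (a suggested sketch, not a documented requirement) is to clear the caches after updating the files in HadoopConfDir and then verify the new configuration:
=> SELECT CLEAR_HDFS_CACHES();
=> SELECT VERIFY_HADOOP_CONF_DIR();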
See also
Apache Hadoop parameters
2 - EXTERNAL_CONFIG_CHECK
Tests the Hadoop configuration of a Vertica cluster. This function tests HDFS configuration files, HCatalog Connector configuration, and Kerberos configuration.
This function calls the following functions:
- KERBEROS_CONFIG_CHECK
- HADOOP_IMPERSONATION_CONFIG_CHECK
- HDFS_CLUSTER_CONFIG_CHECK
- HCATALOGCONNECTOR_CONFIG_CHECK
If you call this function with an argument, it passes that argument through to any of these functions that also take an argument.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
EXTERNAL_CONFIG_CHECK( ['what_to_test' ] )
Arguments
what_to_test
- A string specifying the authorities, nameservices, and/or HCatalog schemas to test. The format is a comma-separated list of "key=value" pairs, where keys are "authority", "nameservice", and "schema". The value is passed to all of the sub-functions; see those reference pages for details on how values are interpreted.
Privileges
This function does not require privileges.
Examples
The following example tests the configuration of only the nameservice named "ns1". Output has been omitted due to length.
=> SELECT EXTERNAL_CONFIG_CHECK('nameservice=ns1');
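To test everything that is configured, omit the argument. You can also combine keys in one comma-separated string, in the format described under Arguments; the names below are hypothetical:
=> SELECT EXTERNAL_CONFIG_CHECK();
=> SELECT EXTERNAL_CONFIG_CHECK('nameservice=ns1,schema=hcat1');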
3 - GET_METADATA
Returns the metadata of a Parquet file. Metadata includes the number and sizes of row groups, column names, and information about chunks and compression. Metadata is returned as JSON.
This function inspects a single file. Parquet data usually spans many files in a single directory; choose one file to inspect. The function does not accept a directory name as an argument.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
GET_METADATA( 'filename' )
Arguments
filename
- The name of a Parquet file. Any path that is valid for COPY is valid for this function. This function does not operate on files in other formats.
Privileges
Superuser, or non-superuser with READ privileges on the USER-accessible storage location (see GRANT (storage location)).
Examples
You must call this function with a single file, not a directory or glob:
=> SELECT GET_METADATA('/data/emp-row.parquet');
GET_METADATA
----------------------------------------------------------------------------------------------------
schema:
required group field_id=-1 spark_schema {
optional int32 field_id=-1 employeeID;
optional group field_id=-1 personal {
optional binary field_id=-1 name (String);
optional group field_id=-1 address {
optional binary field_id=-1 street (String);
optional binary field_id=-1 city (String);
optional int32 field_id=-1 zipcode;
}
optional int32 field_id=-1 taxID;
}
optional binary field_id=-1 department (String);
}
data page version:
data page v1
metadata:
{
"FileName": "/data/emp-row.parquet",
"FileFormat": "Parquet",
"Version": "1.0",
"CreatedBy": "parquet-mr version 1.10.1 (build a89df8f9932b6ef6633d06069e50c9b7970bebd1)",
"TotalRows": "4",
"NumberOfRowGroups": "1",
"NumberOfRealColumns": "3",
"NumberOfColumns": "7",
"Columns": [
{ "Id": "0", "Name": "employeeID", "PhysicalType": "INT32", "ConvertedType": "NONE", "LogicalType": {"Type": "None"} },
{ "Id": "1", "Name": "personal.name", "PhysicalType": "BYTE_ARRAY", "ConvertedType": "UTF8", "LogicalType": {"Type": "String"} },
{ "Id": "2", "Name": "personal.address.street", "PhysicalType": "BYTE_ARRAY", "ConvertedType": "UTF8", "LogicalType": {"Type": "String"} },
{ "Id": "3", "Name": "personal.address.city", "PhysicalType": "BYTE_ARRAY", "ConvertedType": "UTF8", "LogicalType": {"Type": "String"} },
{ "Id": "4", "Name": "personal.address.zipcode", "PhysicalType": "INT32", "ConvertedType": "NONE", "LogicalType": {"Type": "None"} },
{ "Id": "5", "Name": "personal.taxID", "PhysicalType": "INT32", "ConvertedType": "NONE", "LogicalType": {"Type": "None"} },
{ "Id": "6", "Name": "department", "PhysicalType": "BYTE_ARRAY", "ConvertedType": "UTF8", "LogicalType": {"Type": "String"} }
],
"RowGroups": [
{
"Id": "0", "TotalBytes": "642", "TotalCompressedBytes": "0", "Rows": "4",
"ColumnChunks": [
{"Id": "0", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "51513", "Min": "17103" },
"Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "67", "CompressedSize": "69" },
{"Id": "1", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "Sheldon Cooper", "Min": "Howard Wolowitz" },
"Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "142", "CompressedSize": "145" },
{"Id": "2", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "52 Broad St", "Min": "100 Main St Apt 4A" },
"Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "139", "CompressedSize": "123" },
{"Id": "3", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "Pasadena", "Min": "Pasadena" },
"Compression": "SNAPPY", "Encodings": "RLE PLAIN_DICTIONARY BIT_PACKED ", "UncompressedSize": "95", "CompressedSize": "99" },
{"Id": "4", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "91021", "Min": "91001" },
"Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "68", "CompressedSize": "70" },
{"Id": "5", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "4", "DistinctValues": "0", "Max": "0", "Min": "0" },
"Compression": "SNAPPY", "Encodings": "PLAIN RLE BIT_PACKED ", "UncompressedSize": "28", "CompressedSize": "30" },
{"Id": "6", "Values": "4", "StatsSet": "True", "Stats": {"NumNulls": "0", "DistinctValues": "0", "Max": "Physics", "Min": "Astronomy" },
"Compression": "SNAPPY", "Encodings": "RLE PLAIN_DICTIONARY BIT_PACKED ", "UncompressedSize": "103", "CompressedSize": "107" }
]
}
]
}
(1 row)
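Because any path that is valid for COPY is also valid here, you can inspect a file on HDFS directly; the path below is hypothetical:
=> SELECT GET_METADATA('hdfs:///data/emp-row.parquet');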
4 - HADOOP_IMPERSONATION_CONFIG_CHECK
Reports the delegation tokens Vertica will use when accessing Kerberized data in HDFS. The HadoopImpersonationConfig configuration parameter specifies one or more authorities, nameservices, and HCatalog schemas and their associated tokens. For each tested value, the function reports what doAs user or delegation token Vertica will use for access. Use this function to confirm that you have defined your delegation tokens as you intended.
You can call this function with an argument to specify the authority, nameservice, or HCatalog schema to test, or without arguments to test all configured values.
This function does not check that you can use these delegation tokens to access HDFS.
See Proxy users and delegation tokens for more about impersonation.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
HADOOP_IMPERSONATION_CONFIG_CHECK( ['what_to_test' ] )
Arguments
what_to_test
- A string specifying the authorities, nameservices, and/or HCatalog schemas to test. For example, a value of 'nameservice=ns1' means the function tests only access to the nameservice "ns1" and ignores any other authorities and schemas. A value of 'nameservice=ns1, schema=hcat1' means the function tests one nameservice and one HCatalog schema.
If you do not specify this argument, the function tests all authorities, nameservices, and schemas defined in HadoopImpersonationConfig.
Privileges
This function does not require privileges.
Examples
Consider the following definition of HadoopImpersonationConfig:
[{
"nameservice": "ns1",
"token": "RANDOM-TOKEN-STRING"
},
{
"nameservice": "*",
"doAs": "Paul"
},
{
"schema": "hcat1",
"doAs": "Fred"
}
]
The following query tests only the "ns1" name service:
=> SELECT HADOOP_IMPERSONATION_CONFIG_CHECK('nameservice=ns1');
-- hadoop_impersonation_config_check --
Connections to nameservice [ns1] will use a delegation token with hash [b3dd9e71cd695d91]
This function returns a hash of the token for security reasons. You can call HASH_EXTERNAL_TOKEN with the expected value and compare that hash to the one in this function's output.
A query with no argument tests all values:
=> SELECT HADOOP_IMPERSONATION_CONFIG_CHECK();
-- hadoop_impersonation_config_check --
Connections to nameservice [ns1] will use a delegation token with hash [b3dd9e71cd695d91]
JDBC connections for HCatalog schema [hcat1] will doAs [Fred]
[!] hadoop_impersonation_config_check : [PASS]
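To put a definition like the one above in place, you can set the parameter at the database level. A minimal sketch using SET_CONFIG_PARAMETER (the token string is a placeholder, as above):
=> SELECT SET_CONFIG_PARAMETER('HadoopImpersonationConfig', '[{"nameservice": "ns1", "token": "RANDOM-TOKEN-STRING"}]');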
5 - HASH_EXTERNAL_TOKEN
Returns a hash of a string token, for use with HADOOP_IMPERSONATION_CONFIG_CHECK. Call HASH_EXTERNAL_TOKEN with the delegation token you expect Vertica to use and compare it to the hash in the output of HADOOP_IMPERSONATION_CONFIG_CHECK.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
HASH_EXTERNAL_TOKEN( 'token' )
Arguments
token
- A string specifying the token to hash. The token is configured in the HadoopImpersonationConfig parameter.
Privileges
This function does not require privileges.
Examples
The following query tests the expected value shown in the example on the HADOOP_IMPERSONATION_CONFIG_CHECK reference page.
=> SELECT HASH_EXTERNAL_TOKEN('RANDOM-TOKEN-STRING');
hash_external_token
---------------------
b3dd9e71cd695d91
(1 row)
6 - HCATALOGCONNECTOR_CONFIG_CHECK
Tests the configuration of a Vertica cluster that uses the HCatalog Connector to access Hive data. The function first verifies that the HCatalog Connector is properly installed and reports on the values of several related configuration parameters. It then tests the connection using HiveServer2. This function does not support the WebHCat server.
If you specify an HCatalog schema, and if you have defined a delegation token for that schema, this function uses the delegation token. Otherwise, the function uses the default endpoint without a delegation token.
See Proxy users and delegation tokens for more about delegation tokens.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
HCATALOGCONNECTOR_CONFIG_CHECK( ['what_to_test' ] )
Arguments
what_to_test
- A string specifying the HCatalog schemas to test. For example, a value of 'schema=hcat1' means the function tests only the "hcat1" schema and ignores any others that are found.
Privileges
This function does not require privileges.
Examples
The following query tests with the default endpoint and no delegation token.
=> SELECT HCATALOGCONNECTOR_CONFIG_CHECK();
-- hcatalogconnector_config_check --
HCatalogConnectorUseHiveServer2 : [1]
EnableHCatImpersonation : [1]
HCatalogConnectorUseORCReader : [1]
HCatalogConnectorUseParquetReader : [1]
HCatalogConnectorUseTxtReader : [0]
[INFO] Vertica is not configured to use its internal parsers for delimited files.
[INFO] This is off by default, but will be changed in a future release.
HCatalogConnectorUseLibHDFSPP : [1]
[OK] HCatalog connector library is properly installed.
[INFO] Creating JDBC connection as session user.
[OK] Successful JDBC connection to HiveServer2 as user [USER].
[!] hcatalogconnector_config_check : [PASS]
To test with the configured delegation token, pass the schema as an argument:
=> SELECT HCATALOGCONNECTOR_CONFIG_CHECK('schema=hcat1');
7 - HDFS_CLUSTER_CONFIG_CHECK
Tests the configuration of a Vertica cluster that uses HDFS. The function scans the Hadoop configuration files found in HadoopConfDir and performs configuration checks on each cluster it finds. If you have more than one cluster configured, you can specify which one to test instead of testing all of them.
For each Hadoop cluster, it reports properties including:
- Nameservice name and associated NameNodes
- High-availability status
- RPC encryption status
- Kerberos authentication status
- HTTP(S) status
It then tests connections using the http(s), hdfs, and webhdfs URL schemes. It tests the latter two as both the Vertica user and the session user.
See Configuring HDFS access for information about configuration files and HadoopConfDir.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
HDFS_CLUSTER_CONFIG_CHECK( ['what_to_test' ] )
Arguments
what_to_test
- A string specifying the authorities or nameservices to test. For example, a value of 'nameservice=ns1' means the function tests only the "ns1" cluster. If you specify both an authority and a nameservice, the authority must be a NameNode in the specified nameservice for the check to pass.
If you do not specify this argument, the function tests all cluster configurations found in HadoopConfDir.
Privileges
This function does not require privileges.
Examples
The following example tests all clusters.
=> SELECT HDFS_CLUSTER_CONFIG_CHECK();
-- hdfs_cluster_config_check --
Hadoop Conf Path : [/conf/hadoop_conf]
[OK] HadoopConfDir verified on all nodes
Connection Timeout (seconds) : [60]
Token Refresh Frequency (seconds) : [0]
HadoopFSBlockSizeBytes (MiB) : [64]
[OK] Found [1] hadoop cluster configurations
------------- Cluster 1 -------------
Is DefaultFS : [true]
Nameservice : [vmns]
Namenodes : [node1.example.com:8020, node2.example.com:8020]
High Availability : [true]
RPC Encryption : [false]
Kerberos Authentication : [true]
HTTPS Only : [false]
[INFO] Checking connections to [hdfs:///]
vertica : [OK]
dbuser : [OK]
[INFO] Checking connections to [http://node1.example.com:50070]
[INFO] Node is in standby
[INFO] Checking connections to [http://node2.example.com:50070]
[OK] Can make authenticated external curl connection
[INFO] Checking webhdfs
vertica : [OK]
USER : [OK]
[!] hdfs_cluster_config_check : [PASS]
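To check a single cluster instead of all of them, pass the nameservice shown in the output above as an argument:
=> SELECT HDFS_CLUSTER_CONFIG_CHECK('nameservice=vmns');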
8 - KERBEROS_HDFS_CONFIG_CHECK
Deprecated
This function is deprecated and will be removed in a future release. Instead, use EXTERNAL_CONFIG_CHECK.
Tests the Kerberos configuration of a Vertica cluster that uses HDFS. The function succeeds if it can use both the Vertica keytab file and the session user to access HDFS, and reports errors otherwise. This function is a more specific version of KERBEROS_CONFIG_CHECK.
If the current session is not Kerberized, this function will not be able to use secured HDFS connections and will fail.
You can call this function with arguments to specify an HDFS configuration to test, or without arguments. If you call it with no arguments, this function reads the HDFS configuration files and fails if it does not find them. See Configuring HDFS access. If it finds configuration files, it tests all configured nameservices.
The function performs the following tests, in order:
- Are Kerberos services available?
- Does a keytab file exist and are the Kerberos and HDFS configuration parameters set in the database?
- Can Vertica read and invoke kinit with the keys to authenticate to HDFS and obtain the database Kerberos ticket?
- Can Vertica perform hdfs and webhdfs operations using both the database Kerberos ticket and user-forwardable tickets for the current session?
- Can Vertica connect to HiveServer2? (This function does not support WebHCat.)
If any test fails, the function returns a descriptive error message.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
KERBEROS_HDFS_CONFIG_CHECK( ['hdfsHost:hdfsPort',
'webhdfsHost:webhdfsPort', 'webhcatHost' ] )
Arguments
hdfsHost, hdfsPort
- The hostname or IP address and port of the HDFS NameNode. Vertica uses this server to access data that is specified with hdfs URLs. If the value is ' ', the function skips this part of the check.
webhdfsHost, webhdfsPort
- The hostname or IP address and port of the WebHDFS server. Vertica uses this server to access data that is specified with webhdfs URLs. If the value is ' ', the function skips this part of the check.
webhcatHost
- Pass any value in this position. WebHCat is deprecated and this value is ignored but must be present.
Privileges
This function does not require privileges.
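Examples
The following sketch checks both the hdfs and webhdfs ports of one NameNode; the host names are hypothetical, and the third argument is ignored but must be present:
=> SELECT KERBEROS_HDFS_CONFIG_CHECK('namenode.example.com:8020', 'namenode.example.com:50070', 'none');
To skip the hdfs portion of the check, pass ' ' in the first position instead.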
9 - SYNC_WITH_HCATALOG_SCHEMA
Copies the structure of a Hive database schema available through the HCatalog Connector to a Vertica schema. If the HCatalog schema and the target Vertica schema have matching table names, SYNC_WITH_HCATALOG_SCHEMA overwrites the Vertica tables.
This function can synchronize the HCatalog schema directly. In this case, call it with the same schema name for the vertica_schema and hcatalog_schema parameters. The function can also synchronize a different schema to the HCatalog schema.
If you change the settings of HCatalog Connector configuration parameters, you must call this function again.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
SYNC_WITH_HCATALOG_SCHEMA( vertica_schema, hcatalog_schema, [drop_non_existent] )
Parameters
vertica_schema
- The target Vertica schema to store the copied HCatalog schema's metadata. This can be the same schema as hcatalog_schema, or it can be a separate one created with CREATE SCHEMA.
Caution: Do not use the Vertica schema to store other data.
hcatalog_schema
- The HCatalog schema to copy, created with CREATE HCATALOG SCHEMA.
drop_non_existent
- If true, drop any tables in vertica_schema that do not correspond to a table in hcatalog_schema.
Privileges
Non-superuser: CREATE privileges on vertica_schema.
Users also require access to Hive data through one of the following:
- USAGE permissions on hcatalog_schema, if Hive does not use an authorization service to manage access.
- Permission through an authorization service (Sentry or Ranger), and access to the underlying files in HDFS. (Sentry can provide that access through ACL synchronization.)
- dbadmin user privileges, with or without an authorization service.
Data type matching
Hive STRING and BINARY data types are mapped, in Vertica, to the VARCHAR(65000) and VARBINARY(65000) types. Adjust the data types with ALTER TABLE as needed after creating the schema. The maximum size of a VARCHAR or VARBINARY in Vertica is 65000, but you can use LONG VARCHAR and LONG VARBINARY to specify larger values.
Hive and Vertica define string length in different ways. In Hive the length is the number of characters; in Vertica it is the number of bytes. Thus, a character encoding that uses more than one byte, such as Unicode, can cause mismatches between the two. To avoid data truncation, set values in Vertica based on bytes, not characters.
If data size exceeds the column size, Vertica logs an event at read time in the QUERY_EVENTS system table.
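To look for such truncation events after a read, you can inspect the QUERY_EVENTS system table; the query below is illustrative, and you should check the event types recorded in your own database:
=> SELECT event_type, event_description FROM query_events ORDER BY event_timestamp DESC LIMIT 10;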
Examples
The following example uses SYNC_WITH_HCATALOG_SCHEMA to synchronize an HCatalog schema named hcat:
=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' HCATALOG_SCHEMA='default'
HCATALOG_USER='hcatuser';
CREATE SCHEMA
=> SELECT sync_with_hcatalog_schema('hcat', 'hcat');
sync_with_hcatalog_schema
----------------------------------------
Schema hcat synchronized with hcat
tables in hcat = 56
tables altered in hcat = 0
tables created in hcat = 56
stale tables in hcat = 0
table changes erred in hcat = 0
(1 row)
=> -- Use vsql's \d command to describe a table in the synced schema
=> \d hcat.messages
List of Fields by Tables
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
-----------+----------+---------+----------------+-------+---------+----------+-------------+-------------
hcat | messages | id | int | 8 | | f | f |
hcat | messages | userid | varchar(65000) | 65000 | | f | f |
hcat | messages | "time" | varchar(65000) | 65000 | | f | f |
hcat | messages | message | varchar(65000) | 65000 | | f | f |
(4 rows)
The following example uses SYNC_WITH_HCATALOG_SCHEMA followed by ALTER TABLE to adjust a column value:
=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' HCATALOG_SCHEMA='default'
-> HCATALOG_USER='hcatuser';
CREATE SCHEMA
=> SELECT sync_with_hcatalog_schema('hcat', 'hcat');
...
=> ALTER TABLE hcat.t ALTER COLUMN a1 SET DATA TYPE long varchar(1000000);
=> ALTER TABLE hcat.t ALTER COLUMN a2 SET DATA TYPE long varbinary(1000000);
The following example uses SYNC_WITH_HCATALOG_SCHEMA with a local (non-HCatalog) schema:
=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' HCATALOG_SCHEMA='default'
-> HCATALOG_USER='hcatuser';
CREATE SCHEMA
=> CREATE SCHEMA hcat_local;
CREATE SCHEMA
=> SELECT sync_with_hcatalog_schema('hcat_local', 'hcat');
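To also drop tables in the local schema that no longer correspond to tables in the HCatalog schema, pass true for drop_non_existent (a sketch based on the syntax above):
=> SELECT sync_with_hcatalog_schema('hcat_local', 'hcat', true);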
10 - SYNC_WITH_HCATALOG_SCHEMA_TABLE
Copies the structure of a single table in a Hive database schema available through the HCatalog Connector to a Vertica table.
This function can synchronize the HCatalog schema directly. In this case, call it with the same schema name for the vertica_schema and hcatalog_schema parameters. The function can also synchronize a different schema to the HCatalog schema.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
SYNC_WITH_HCATALOG_SCHEMA_TABLE( vertica_schema, hcatalog_schema, table_name )
Parameters
vertica_schema
- The existing Vertica schema to store the copied HCatalog schema's metadata. This can be the same schema as hcatalog_schema, or it can be a separate one created with CREATE SCHEMA.
hcatalog_schema
- The HCatalog schema to copy, created with CREATE HCATALOG SCHEMA.
table_name
- The table in hcatalog_schema to copy. If table_name already exists in vertica_schema, the function overwrites it.
Privileges
Non-superuser: CREATE privileges on vertica_schema.
Users also require access to Hive data through one of the following:
- USAGE permissions on hcatalog_schema, if Hive does not use an authorization service to manage access.
- Permission through an authorization service (Sentry or Ranger), and access to the underlying files in HDFS. (Sentry can provide that access through ACL synchronization.)
- dbadmin user privileges, with or without an authorization service.
Data type matching
Hive STRING and BINARY data types are mapped, in Vertica, to the VARCHAR(65000) and VARBINARY(65000) types. Adjust the data types with ALTER TABLE as needed after creating the schema. The maximum size of a VARCHAR or VARBINARY in Vertica is 65000, but you can use LONG VARCHAR and LONG VARBINARY to specify larger values.
Hive and Vertica define string length in different ways. In Hive the length is the number of characters; in Vertica it is the number of bytes. Thus, a character encoding that uses more than one byte, such as Unicode, can cause mismatches between the two. To avoid data truncation, set values in Vertica based on bytes, not characters.
If data size exceeds the column size, Vertica logs an event at read time in the QUERY_EVENTS system table.
Examples
The following example uses SYNC_WITH_HCATALOG_SCHEMA_TABLE to synchronize the "nation" table:
=> CREATE SCHEMA hcat_local;
CREATE SCHEMA
=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost' HCATALOG_SCHEMA='hcat'
HCATALOG_USER='hcatuser';
CREATE SCHEMA
=> SELECT sync_with_hcatalog_schema_table('hcat_local', 'hcat', 'nation');
sync_with_hcatalog_schema_table
-----------------------------------------------------------------------------
Schema hcat_local synchronized with hcat for table nation
table nation is created in schema hcat_local
(1 row)
The following example shows the behavior if the "nation" table already exists in the local schema:
=> SELECT sync_with_hcatalog_schema_table('hcat_local','hcat','nation');
sync_with_hcatalog_schema_table
-----------------------------------------------------------------------------
Schema hcat_local synchronized with hcat for table nation
table nation is altered in schema hcat_local
(1 row)
11 - VERIFY_HADOOP_CONF_DIR
Verifies that the Hadoop configuration that is used to access HDFS is valid on all Vertica nodes. The configuration is valid if:
- all required configuration files are found on the path defined by the HadoopConfDir configuration parameter
- all properties needed by Vertica are set in those files
This function does not attempt to validate the settings of those properties; it only verifies that they have values.
It is possible for the Hadoop configuration to be valid on some nodes and invalid on others. The function reports a validation failure if the configuration is invalid on any node; the rest of the output reports the details.
This is a meta-function. You must call meta-functions in a top-level SELECT statement.
Behavior type
Volatile
Syntax
VERIFY_HADOOP_CONF_DIR( )
Parameters
This function has no parameters.
Privileges
This function does not require privileges.
Examples
The following example shows the results when the Hadoop configuration is valid.
=> SELECT VERIFY_HADOOP_CONF_DIR();
verify_hadoop_conf_dir
-------------------------------------------------------------------
Validation Success
v_vmart_node0001: HadoopConfDir [PG_TESTOUT/config] is valid
v_vmart_node0002: HadoopConfDir [PG_TESTOUT/config] is valid
v_vmart_node0003: HadoopConfDir [PG_TESTOUT/config] is valid
v_vmart_node0004: HadoopConfDir [PG_TESTOUT/config] is valid
(1 row)
In the following example, the Hadoop configuration is valid on one node, but on other nodes a needed value is missing.
=> SELECT VERIFY_HADOOP_CONF_DIR();
verify_hadoop_conf_dir
-------------------------------------------------------------------
Validation Failure
v_vmart_node0001: HadoopConfDir [PG_TESTOUT/test_configs/config] is valid
v_vmart_node0002: No fs.defaultFS parameter found in config files in [PG_TESTOUT/config]
v_vmart_node0003: No fs.defaultFS parameter found in config files in [PG_TESTOUT/config]
v_vmart_node0004: No fs.defaultFS parameter found in config files in [PG_TESTOUT/config]
(1 row)
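If the failure is caused by HadoopConfDir pointing to the wrong directory, one way to recover (a sketch; the path is hypothetical) is to reset the parameter and re-run the check:
=> SELECT SET_CONFIG_PARAMETER('HadoopConfDir', '/etc/hadoop/conf');
=> SELECT VERIFY_HADOOP_CONF_DIR();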