Using the HCatalog Connector
The Vertica HCatalog Connector lets you access data stored in the Apache Hive data warehouse software the same way you access data in a native Vertica table.
If your files are in the Optimized Row Columnar (ORC) or Parquet format and do not use complex types, the HCatalog Connector creates an external table and uses the ORC or Parquet reader instead of the Java SerDe. See ORC (parser) and PARQUET (parser) for more information about these readers.
The HCatalog Connector performs predicate pushdown to improve query performance. Instead of reading all data across the network to evaluate a query, the HCatalog Connector moves the evaluation of predicates closer to the data. Predicate pushdown applies to Hive partition pruning, ORC stripe pruning, and Parquet row-group pruning. The HCatalog Connector supports predicate pushdown for predicates that use the following comparison operators: >, >=, =, <>, <=, <.
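As a minimal sketch of how this looks in practice, the statements below define an HCatalog schema and then run a query whose WHERE clause uses only the comparison operators listed above, so its predicates are candidates for pushdown. The host name, the schema name hcat, the Hive user, and the sales table with its sale_date and region columns are hypothetical placeholders, and the example assumes the Hive table is partitioned by sale_date. Port 9083 is the usual default for the Hive metastore, but your cluster may differ.

```sql
-- Hypothetical values: replace the host, port, Hive schema, and user
-- with the ones for your own Hive metastore.
CREATE HCATALOG SCHEMA hcat
    WITH HOSTNAME='hcathost.example.com'
         PORT=9083
         HCATALOG_SCHEMA='default'
         HCATALOG_USER='hive';

-- Assumes a Hive table named sales, partitioned by sale_date.
-- Both predicates use operators eligible for pushdown (>= and =),
-- so partitions, stripes, or row groups that cannot match may be skipped.
SELECT region, sale_date, amount
FROM hcat.sales
WHERE sale_date >= '2023-01-01'
  AND region = 'EMEA';
```

Whether a given predicate is actually pushed down depends on the file format and on how the Hive table is partitioned, so treat this query as an illustration rather than a guarantee of pruning.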
In this section
- Overview
- How the HCatalog Connector works
- HCatalog Connector requirements
- Installing the Java runtime on your Vertica cluster
- Configuring Vertica for HCatalog
- Configuring security
- Defining a schema using the HCatalog Connector
- Querying Hive tables using HCatalog Connector
- Viewing Hive schema and table metadata
- Synchronizing an HCatalog schema or table with a local schema or table
- Data type conversions from Hive to Vertica
- Using nonstandard SerDes
- Troubleshooting HCatalog Connector problems