查看 Hive 架构和表元数据

使用 Hive 时,您可以通过执行用 HiveQL(Hive 版本的 SQL)编写的语句(如 SHOW TABLES),访问架构和表元数据。使用 HCatalog 连接器时,您可以通过几个 Vertica 系统表,获取 Hive 数据库中表的元数据。

您可以通过 HCatalog 连接器访问四个包含表元数据的系统表:

  • HCATALOG_SCHEMATA 列出使用 HCatalog 连接器定义的所有架构。

  • HCATALOG_TABLE_LIST 包含使用 HCatalog 连接器定义的所有架构中所有可用表的概览。此表仅显示查询表的用户可以访问的表。针对使用 HCatalog 连接器定义的每个架构均调用一次 HiveServer2 即可检索此表中的信息,这意味着查询此表时的开销非常小。

  • HCATALOG_TABLES 包含比 HCATALOG_TABLE_LIST 更深入的信息。

  • HCATALOG_COLUMNS 列出通过 HCatalog 连接器获得的所有表中所有列的元数据。至于 HCATALOG_TABLES,查询此表会导致对每个表均调用一次 HiveServer2,因此可能需要花费一段时间才能完成。

下面的示例演示了,如何查询包含通过 HCatalog 连接器获取的表元数据的系统表。

=> CREATE HCATALOG SCHEMA hcat WITH hostname='hcathost'
-> HCATALOG_SCHEMA='default' HCATALOG_DB='default' HCATALOG_USER='hcatuser';
CREATE SCHEMA
=> SELECT * FROM HCATALOG_SCHEMATA;
-[ RECORD 1 ]--------+-----------------------------
schema_id            | 45035996273864536
schema_name          | hcat
schema_owner_id      | 45035996273704962
schema_owner         | dbadmin
create_time          | 2013-11-05 10:19:54.70965-05
hostname             | hcathost
port                 | 9083
webservice_hostname  | hcathost
webservice_port      | 50111
hcatalog_schema_name | default
hcatalog_user_name   | hcatuser
metastore_db_name    | hivemetastoredb

=> SELECT * FROM HCATALOG_TABLE_LIST;
-[ RECORD 1 ]------+------------------
table_schema_id    | 45035996273864536
table_schema       | hcat
hcatalog_schema    | default
table_name         | hcatalogtypes
hcatalog_user_name | hcatuser
-[ RECORD 2 ]------+------------------
table_schema_id    | 45035996273864536
table_schema       | hcat
hcatalog_schema    | default
table_name         | tweets
hcatalog_user_name | hcatuser
-[ RECORD 3 ]------+------------------
table_schema_id    | 45035996273864536
table_schema       | hcat
hcatalog_schema    | default
table_name         | messages
hcatalog_user_name | hcatuser
-[ RECORD 4 ]------+------------------
table_schema_id    | 45035996273864536
table_schema       | hcat
hcatalog_schema    | default
table_name         | msgjson
hcatalog_user_name | hcatuser

=> -- Get detailed description of a specific table
=> SELECT * FROM HCATALOG_TABLES WHERE table_name = 'msgjson';
-[ RECORD 1 ]---------+-----------------------------------------------------------
table_schema_id       | 45035996273864536
table_schema          | hcat
hcatalog_schema       | default
table_name            | msgjson
hcatalog_user_name    | hcatuser
min_file_size_bytes   |
total_number_files    | 10
location              | hdfs://hive.example.com:8020/user/exampleuser/msgjson
last_update_time      |
output_format         | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
last_access_time      |
max_file_size_bytes   |
is_partitioned        | f
partition_expression  |
table_owner           |
input_format          | org.apache.hadoop.mapred.TextInputFormat
total_file_size_bytes | 453534
hcatalog_group        |
permission            |

=> -- Get list of columns in a specific table
=> SELECT * FROM HCATALOG_COLUMNS WHERE table_name = 'hcatalogtypes'
-> ORDER BY ordinal_position;
-[ RECORD 1 ]------------+-----------------
table_schema             | hcat
hcatalog_schema          | default
table_name               | hcatalogtypes
is_partition_column      | f
column_name              | intcol
hcatalog_data_type       | int
data_type                | int
data_type_id             | 6
data_type_length         | 8
character_maximum_length |
numeric_precision        |
numeric_scale            |
datetime_precision       |
interval_precision       |
ordinal_position         | 1
-[ RECORD 2 ]------------+-----------------
table_schema             | hcat
hcatalog_schema          | default
table_name               | hcatalogtypes
is_partition_column      | f
column_name              | floatcol
hcatalog_data_type       | float
data_type                | float
data_type_id             | 7
data_type_length         | 8
character_maximum_length |
numeric_precision        |
numeric_scale            |
datetime_precision       |
interval_precision       |
ordinal_position         | 2
-[ RECORD 3 ]------------+-----------------
table_schema             | hcat
hcatalog_schema          | default
table_name               | hcatalogtypes
is_partition_column      | f
column_name              | doublecol
hcatalog_data_type       | double
data_type                | float
data_type_id             | 7
data_type_length         | 8
character_maximum_length |
numeric_precision        |
numeric_scale            |
datetime_precision       |
interval_precision       |
ordinal_position         | 3
-[ RECORD 4 ]------------+-----------------
table_schema             | hcat
hcatalog_schema          | default
table_name               | hcatalogtypes
is_partition_column      | f
column_name              | charcol
hcatalog_data_type       | string
data_type                | varchar(65000)
data_type_id             | 9
data_type_length         | 65000
character_maximum_length | 65000
numeric_precision        |
numeric_scale            |
datetime_precision       |
interval_precision       |
ordinal_position         | 4
-[ RECORD 5 ]------------+-----------------
table_schema             | hcat
hcatalog_schema          | default
table_name               | hcatalogtypes
is_partition_column      | f
column_name              | varcharcol
hcatalog_data_type       | string
data_type                | varchar(65000)
data_type_id             | 9
data_type_length         | 65000
character_maximum_length | 65000
numeric_precision        |
numeric_scale            |
datetime_precision       |
interval_precision       |
ordinal_position         | 5
-[ RECORD 6 ]------------+-----------------
table_schema             | hcat
hcatalog_schema          | default
table_name               | hcatalogtypes
is_partition_column      | f
column_name              | boolcol
hcatalog_data_type       | boolean
data_type                | boolean
data_type_id             | 5
data_type_length         | 1
character_maximum_length |
numeric_precision        |
numeric_scale            |
datetime_precision       |
interval_precision       |
ordinal_position         | 6
-[ RECORD 7 ]------------+-----------------
table_schema             | hcat
hcatalog_schema          | default
table_name               | hcatalogtypes
is_partition_column      | f
column_name              | timestampcol
hcatalog_data_type       | string
data_type                | varchar(65000)
data_type_id             | 9
data_type_length         | 65000
character_maximum_length | 65000
numeric_precision        |
numeric_scale            |
datetime_precision       |
interval_precision       |
ordinal_position         | 7
-[ RECORD 8 ]------------+-----------------
table_schema             | hcat
hcatalog_schema          | default
table_name               | hcatalogtypes
is_partition_column      | f
column_name              | varbincol
hcatalog_data_type       | binary
data_type                | varbinary(65000)
data_type_id             | 17
data_type_length         | 65000
character_maximum_length | 65000
numeric_precision        |
numeric_scale            |
datetime_precision       |
interval_precision       |
ordinal_position         | 8
-[ RECORD 9 ]------------+-----------------
table_schema             | hcat
hcatalog_schema          | default
table_name               | hcatalogtypes
is_partition_column      | f
column_name              | bincol
hcatalog_data_type       | binary
data_type                | varbinary(65000)
data_type_id             | 17
data_type_length         | 65000
character_maximum_length | 65000
numeric_precision        |
numeric_scale            |
datetime_precision       |
interval_precision       |
ordinal_position         | 9