Monitoring external tables
Vertica records information about external tables in system tables. You can use these tables to track your external data and queries against it.
The TABLES system table contains data about all tables, both native and external. The TABLE_DEFINITION column is specific to external tables. You can query this column to see all external data sources currently in use, as in the following example:
=> SELECT table_name, create_time, table_definition FROM tables WHERE table_definition != '';
table_name | create_time | table_definition
---------------+-------------------------------+-----------------------------------------------------------
customers_orc | 2018-03-21 11:07:30.159442-04 | COPY from '/home/dbadmin/sample_orc_files/0*' ORC
miscprod | 2018-06-26 17:40:04.012121-04 | copy from '/home/dbadmin/data/prod.csv'
students | 2018-06-26 17:46:50.695024-04 | copy from '/home/dbadmin/students.csv'
numbers | 2018-06-26 17:53:52.407441-04 | copy from '/home/dbadmin/tt.dat'
catalog | 2018-06-26 18:12:28.598519-04 | copy from '/home/dbadmin/data/prod.csv' delimiter ','
inventory | 2018-06-26 18:13:06.951802-04 | copy from '/home/dbadmin/data/stores.csv' delimiter ','
test | 2018-06-27 16:31:39.170866-04 | copy from '/home/dbadmin/data/stores.csv' delimiter ','
(7 rows)
The EXTERNAL_TABLE_DETAILS table provides more details, including file sizes. Vertica computes the values in this table at query time, which is potentially expensive, so consider restricting the query by schema or table.
=> SELECT table_name, source_format, total_file_size_bytes FROM external_table_details;
table_name | source_format | total_file_size_bytes
---------------+---------------+-----------------------
customers_orc | ORC | 619080883
miscprod | DELIMITED | 254
students | DELIMITED | 763
numbers | DELIMITED | 30
catalog | DELIMITED | 254
inventory | DELIMITED | 74
test | DELIMITED | 74
(7 rows)
If the size of an external table changes significantly over time, you should rerun ANALYZE_EXTERNAL_ROW_COUNT() to gather updated statistics. See Improving Query Performance for External Tables.
The LOAD_SOURCES table shows information for loads currently in progress. This table does not record information about loads of ORC or Parquet data.