Table types

Vertica has several types of tables.

Vertica has several types of tables. During initial exploration, your focus should be on whatever is most expedient for you; you will modify or replace the tables you build at this stage before you put them in production. As you get closer to production-ready definitions, pay more attention to the specific considerations for each table type.

Native (ROS) tables

Native tables store their data in ROS format in the database. After you have finalized table definitions, you can optimize the performance of your most important or frequent queries by creating projections.

If you are exploring data that is in multiple locations, using a native table for exploration can be easier than using external tables. With a native table, you can load data from many sources into the database. If multiple people are exploring the same data together, you do not need to grant additional permissions for an external file system or object store. You only need one person to have permission to read the files being loaded.

On the other hand, if you make a change to a native table definition that is not compatible with existing data, the change fails:

=> SELECT name FROM catalog;
                name
------------------------------------
 4k 48in TV
 home theatre system
 professional kitchen appliance set
(3 rows)

=> ALTER TABLE catalog ALTER COLUMN name SET DATA TYPE VARCHAR(10);
ROLLBACK 2378:  Cannot convert column "name" to type "varchar(10)"
HINT:  Verify that the data in the column conforms to the new type

To make an incompatible change, you can create a new table and then use INSERT with SELECT to transform and copy existing data to the new table. For example, you can cast existing data to new types:

=> CREATE TABLE catalog2(
  id INT, msrp NUMERIC(8,2), name VARCHAR(10),
  vendors ARRAY[VARCHAR(50), 2],
  prices ARRAY[NUMERIC(10,2), 2] );

--- cannot directly insert:
=> INSERT INTO catalog2 SELECT * FROM catalog;
ERROR 4800:  String of 19 octets is too long for type Varchar(10)

-- violating array bounds is also an error:
=> INSERT INTO catalog2
SELECT id, msrp, name::VARCHAR(10), vendors, prices FROM catalog;
ERROR 9227:  Output array isn't big enough
DETAIL:  Type limit is 2 elements, but value has 3 elements

--- cast to truncate:
=> INSERT INTO catalog2
SELECT id, msrp, name::VARCHAR(10),
       vendors::ARRAY[VARCHAR(50), 2],
       prices::ARRAY[NUMERIC(10,2), 2] FROM catalog;
 OUTPUT
--------
      3
(1 row)

=> SELECT * FROM catalog2;
  id  |   msrp   |    name    |             vendors             |        prices

------+----------+------------+---------------------------------+-----------------------
 1064 |   329.99 | 4k 48in TV | ["Amazon","Best Buy"]           | ["299.99","289.99"]
 1183 | 11999.99 | home theat | ["Amazon","Bob's Hardware"]     | ["8949.99","9999.99"]
 1271 |  8999.99 | profession | ["Amazon","Designer Solutions"] | ["8299.99","8999.99"]
(3 rows)

External tables

With an external table, Vertica reads data at query time instead of storing it in the database. External tables are a good choice when you do not need the performance of native tables, for example for older data or partitions that you query less frequently.

External tables are particularly useful when the external data is growing or changing, because you do not need to reload data yourself. However, this means you discover structural changes such as changed column types or new columns at query time rather than at load time.

You can drop an external table and recreate it to change its definition. Because the database does not store the data, replacing a table is not a destructive operation. During data exploration, you can treat external tables as lightweight and disposable.

Because the data is read at query time, you must ensure that your users have and retain permission to read the data in its original location. Depending on where the data is stored, you might need to take additional steps to manage access, such as creating AWS IAM roles on S3.

For more information, see How external tables differ from native tables.

Flex tables

Flex tables are designed for data with an unknown or varying schema. They are a good choice for exploring data with varying fields or columns. JSON and delimited formats, in particular, do not embed a schema, so these formats are particularly prone to variation.

You can use flex tables as either native tables (CREATE FLEX TABLE) or external tables (CREATE FLEX EXTERNAL TABLE).

Flex tables trade flexibility for reduced performance and are not a good choice for production environments. You can use flex tables to explore data and then create standard (non-flex) native or external tables to deploy.