Materializing flex tables
After creating flex tables, you can change the table structure to promote virtual columns to materialized (real) columns. If your table is already a hybrid table, you can change existing real columns and promote other important virtual columns. This section describes some key aspects of promoting columns, adding columns, specifying constraints, and declaring default values. It also presents some differences when loading flex or hybrid tables, compared with columnar tables.
Note
Materializing virtual columns by promoting them to real columns can significantly improve query performance. Vertica recommends that you materialize important virtual columns before running large and complex queries. Promoted columns cause a small decrease in load performance.Adding columns to flex tables
Add columns to your flex tables to promote virtual columns:
-
Add a real column with the same name as a virtual column (
user.name
):=> ALTER TABLE darkdata1 ADD COLUMN "user.name" VARCHAR; ALTER TABLE
-
Load some data into the table.
=> COPY darkdata1 FROM '/vertica/flextable/DATA/tweets_12.json' PARSER fjsonparser(); Rows Loaded ------------- 12 (1 row)
-
Query the materialized column. Notice that loading data populates the column automatically. Blank rows indicate no values or NULLs:
=> SELECT "user.name" FROM darkdata1; user.name --------------------- I'm Toaster⥠Flu Beach seydo shi The End Uptown gentleman. ~G A B R I E L A ⿠Avita Desai laughing at clouds. (12 rows)
Adding columns with default values
The section Bulk loading data into flex tables describes the use of default values, and how Vertica evaluates them during loading. As with all tables, using COPY to load data ignores any column default values.
Note
Adding a table column default expression to a flex table requires casting the column to an explicit data type.-
Create a
darkdata1
table with a column definition. The following example uses a column name (talker
) that does not correspond to a virtual column name. Assign a default value with a virtual column name. In this example, the default value for the columntalker
is ("user.lang
"). Sinceuser.lang
is a virtual column in theLONG VARBINARY __raw__
column, you must cast its value toVARCHAR
to match thetalker
column definition:=> CREATE FLEX TABLE darkdata1(talker VARCHAR default "user.lang"::VARCHAR); CREATE TABLE
-
Load some JSON data, specifying the
__raw__
column:=> COPY darkdata1 (__raw__) FROM '/test/vertica/flextable/DATA/tweets_12.json' PARSER fjsonparser(); Rows Loaded ------------- 12 (1 row)
-
Query the
talker
column. Notice that Vertica used the default column value ("user.lang
"), because you specified__raw__
. Blank rows indicate no values or NULLs:=> SELECT "talker" FROM darkdata1; talker -------- it en es en en es tr en (12 rows)
-
Alter the table to add a column with a known virtual column name (
user.name
), assigning the key name as the default value (recommended), and casting it to a VARCHAR:=> ALTER TABLE darkdata1 ADD COLUMN "user.name" VARCHAR default "user.name"::VARCHAR; ALTER TABLE
-
Load data again, this time without
__raw__
:=> COPY darkdata1 FROM '/test/vertica/flextable/DATA/tweets_12.json' PARSER fjsonparser();
-
Query the two real columns. Notice that
talker
has no values, because you did not specifiy the__raw__
column. Theuser.lang
column contains values from theuser.name
virtual column:=> SELECT "talker", "user.name" FROM darkdata1; talker | user.name --------+--------------------- | laughing at clouds. | Avita Desai | I'm Toaster⥠| | | | Uptown gentleman. | ~G A B R I E L A ⿠| Flu Beach | | seydo shi | The End (12 rows)
-
Load data once more, this time specifying a COPY statement with a default value expression for
user.name
:=> COPY darkdata1 (__raw__, "user.name" as 'QueenElizabeth'::varchar) FROM '/test/vertica/flextable/DATA/tweets_12.json' PARSER fjsonparser(); Rows Loaded ------------- 12 (1 row)
-
Query once more. Notice that the real column
talker
has its default values (you used__raw__
). As specified inCOPY
, the"user.name" as 'QueenElizabeth'
expression overrode theuser.name
default column value:=> SELECT "talker", "user.name" FROM darkdata1; talker | user.name --------+---------------- it | QueenElizabeth en | QueenElizabeth es | QueenElizabeth | QueenElizabeth | QueenElizabeth | QueenElizabeth en | QueenElizabeth en | QueenElizabeth es | QueenElizabeth | QueenElizabeth tr | QueenElizabeth en | QueenElizabeth (12 rows)
To summarize, you can set a default column value as part of the ALTER TABLE...ADD COLUMN
operation. For materializing columns, the default value should reference the key name of the virtual column (as in "user.lang"
). Subsequently loading data with a COPY
value expression overrides the default value of the column definition.
Changing the __raw__ column size
You can change the default size of the __raw__
column for flex tables you plan to create, the current size of an existing flex table, or both.
To change the default size for the __raw__
column in new flex tables, set the FlexTableRawSize configuration parameter:
=> ALTER DATABASE DEFAULT SET FlexTableRawSize = 120000;
Changing the configuration parameter affects all flex tables you create after making this change.
To change the size of the _raw_
column in an existing flex table, use the ALTER TABLE statement to change the definition of the __raw__ column:
=> ALTER TABLE tester ALTER COLUMN __raw__ SET DATA TYPE LONG VARBINARY(120000);
ALTER TABLE
Note
An error occurs if you try to reduce the__raw__
column size to a value smaller than any data the column contains.
Changing flex table real columns
You can make the following changes to the flex table real columns (__raw__
and __identity__
), but not to any virtual columns:
Actions | raw | identity |
---|---|---|
Change NOT NULL constraints (default) | Yes | Yes |
Add primary key and foreign key (PK/FK) constraints | No | Yes |
Create projections | No | Yes |
Segment | No | Yes |
Partition | No | Yes |
Specify a user-defined scalar function (UDSF) as a default column expression in ALTER TABLE x ADD COLUMN y statement |
No | No |
Note
While segmenting and partitioning the__raw__
column is permitted, it is not recommended due to its long data type. By default, if you not define any real columns, flex tables are segmented on the __identity__
column.
Dropping flex table columns
There are two considerations about dropping columns:
-
You cannot drop the last column in your flex table's sort order.
-
If you have not created a flex table with any real columns, or materialized any columns, you cannot drop the
__identity__
column.
Updating flex table views
Adding new columns to a flex table that has an associated view does not update the view's result set, even if the view uses a wildcard (*) to represent all table columns. To incorporate new columns, you must re-create the view.