After you define a table, you can use
ALTER TABLE
to modify existing table columns. You can perform the following operations on a column:
This is the multi-page printable view of this section. Click here to print.
Managing table columns
1 - Renaming columns
You rename a column with ALTER TABLE
as follows:
ALTER TABLE [schema.]table-name RENAME [ COLUMN ] column-name TO new-column-name
The following example renames a column in the Retail.Product_Dimension
table from Product_description
to Item_description
:
=> ALTER TABLE Retail.Product_Dimension
RENAME COLUMN Product_description TO Item_description;
If you rename a column that is referenced by a view, the column does not appear in the result set of the view even if the view uses the wild card (*) to represent all columns in the table. Recreate the view to incorporate the column's new name.
2 - Changing column data type
In general, you can change a column's data type with ALTER TABLE if doing so does not require storage reorganization. After you modify a column's data type, data that you load conforms to the new definition.
The sections that follow describe requirements and restrictions associated with changing a column's data type.
Supported data type conversions
Vertica supports conversion for the following data types:
Data Types | Supported Conversions |
---|---|
Binary | Expansion and contraction. |
Character | All conversions between CHAR, VARCHAR, and LONG VARCHAR. |
Exact numeric |
All conversions between the following numeric data types: integer data types—INTEGER, INT, BIGINT, TINYINT, INT8, SMALLINT—and NUMERIC values of scale <=18 and precision 0. You cannot modify the scale of NUMERIC data types; however, you can change precision in the ranges (0-18), (19-37), and so on. |
Collection |
The following conversions are supported:
For details, see Changing Collection Columns. |
Unsupported data type conversions
Vertica does not allow data type conversion on types that require storage reorganization:
-
Boolean
-
DATE/TIME
-
Approximate numeric type
-
BINARY to VARBINARY and vice versa
You also cannot change a column's data type if the column is one of the following:
-
Primary key
-
Foreign key
-
Included in the SEGMENTED BY clause of any projection for that table.
-
Complex type column. One exception applies: in external tables, you can change a primitive column type to a complex type.
You can work around some of these restrictions. For details, see Working with column data conversions.
2.1 - Changing column width
You can expand columns within the same class of data type. Doing so is useful for storing larger items in a column. Vertica validates the data before it performs the conversion.
In general, you can also reduce column widths within the data type class. This is useful to reclaim storage if the original declaration was longer than you need, particularly with strings. You can reduce column width only if the following conditions are true:
-
Existing column data is no greater than the new width.
-
All nodes in the database cluster are up.
Otherwise, Vertica returns an error and the conversion fails. For example, if you try to convert a column from varchar(25)
to varchar(10)
Vertica allows the conversion as long as all column data is no more than 10 characters.
In the following example, columns y
and z
are initially defined as VARCHAR data types, and loaded with values 12345
and 654321
, respectively. The attempt to reduce column z
's width to 5 fails because it contains six-character data. The attempt to reduce column y
's width to 5 succeeds because its content conforms with the new width:
=> CREATE TABLE t (x int, y VARCHAR, z VARCHAR);
CREATE TABLE
=> CREATE PROJECTION t_p1 AS SELECT * FROM t SEGMENTED BY hash(x) ALL NODES;
CREATE PROJECTION
=> INSERT INTO t values(1,'12345','654321');
OUTPUT
--------
1
(1 row)
=> SELECT * FROM t;
x | y | z
---+-------+--------
1 | 12345 | 654321
(1 row)
=> ALTER TABLE t ALTER COLUMN z SET DATA TYPE char(5);
ROLLBACK 2378: Cannot convert column "z" to type "char(5)"
HINT: Verify that the data in the column conforms to the new type
=> ALTER TABLE t ALTER COLUMN y SET DATA TYPE char(5);
ALTER TABLE
Changing collection columns
If a column is a collection data type, you can use ALTER TABLE to change either its bounds or its maximum binary size. These properties are set at table creation time and can then be altered.
You can make a collection bounded, setting its maximum number of elements, as in the following example.
=> ALTER TABLE test.t1 ALTER COLUMN arr SET DATA TYPE array[int,10];
ALTER TABLE
=> \d test.t1
List of Fields by Tables
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
--------+-------+--------+-----------------+------+---------+----------+-------------+-------------
test | t1 | arr | array[int8, 10] | 80 | | f | f |
(1 row)
Alternatively, you can set the binary size for the entire collection instead of setting bounds. Binary size is set either explicitly or from the DefaultArrayBinarySize configuration parameter. The following example creates an array column from the default, changes the default, and then uses ALTER TABLE to change it to the new default.
=> SELECT get_config_parameter('DefaultArrayBinarySize');
get_config_parameter
----------------------
100
(1 row)
=> CREATE TABLE test.t1 (arr array[int]);
CREATE TABLE
=> \d test.t1
List of Fields by Tables
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
--------+-------+--------+-----------------+------+---------+----------+-------------+-------------
test | t1 | arr | array[int8](96) | 96 | | f | f |
(1 row)
=> ALTER DATABASE DEFAULT SET DefaultArrayBinarySize=200;
ALTER DATABASE
=> ALTER TABLE test.t1 ALTER COLUMN arr SET DATA TYPE array[int];
ALTER TABLE
=> \d test.t1
List of Fields by Tables
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
--------+-------+--------+-----------------+------+---------+----------+-------------+-------------
test | t1 | arr | array[int8](200)| 200 | | f | f |
(1 row)
Alternatively, you can set the binary size explicitly instead of using the default value.
=> ALTER TABLE test.t1 ALTER COLUMN arr SET DATA TYPE array[int](300);
Purging historical data
You cannot reduce a column's width if Vertica retains any historical data that exceeds the new width. To reduce the column width, first remove that data from the table:
-
Advance the AHM to an epoch more recent than the historical data that needs to be removed from the table.
-
Purge the table of all historical data that precedes the AHM with the function
PURGE_TABLE
.
For example, given the previous example, you can update the data in column t.z
as follows:
=> UPDATE t SET z = '54321';
OUTPUT
--------
1
(1 row)
=> SELECT * FROM t;
x | y | z
---+-------+-------
1 | 12345 | 54321
(1 row)
Although no data in column z now exceeds 5 characters, Vertica retains the history of its earlier data, so attempts to reduce the column width to 5 return an error:
=> ALTER TABLE t ALTER COLUMN z SET DATA TYPE char(5);
ROLLBACK 2378: Cannot convert column "z" to type "char(5)"
HINT: Verify that the data in the column conforms to the new type
You can reduce the column width by purging the table's historical data as follows:
=> SELECT MAKE_AHM_NOW();
MAKE_AHM_NOW
-------------------------------
AHM set (New AHM Epoch: 6350)
(1 row)
=> SELECT PURGE_TABLE('t');
PURGE_TABLE
----------------------------------------------------------------------------------------------------------------------
Task: purge operation
(Table: public.t) (Projection: public.t_p1_b0)
(Table: public.t) (Projection: public.t_p1_b1)
(1 row)
=> ALTER TABLE t ALTER COLUMN z SET DATA TYPE char(5);
ALTER TABLE
2.2 - Working with column data conversions
Vertica conforms to the SQL standard by disallowing certain data conversions for table columns. However, you sometimes need to work around this restriction when you convert data from a non-SQL database. The following examples describe one such workaround, using the following table:
=> CREATE TABLE sales(id INT, price VARCHAR) UNSEGMENTED ALL NODES;
CREATE TABLE
=> INSERT INTO sales VALUES (1, '$50.00');
OUTPUT
--------
1
(1 row)
=> INSERT INTO sales VALUES (2, '$100.00');
OUTPUT
--------
1
(1 row)
=> COMMIT;
COMMIT
=> SELECT * FROM SALES;
id | price
----+---------
1 | $50.00
2 | $100.00
(2 rows)
To convert the price
column's existing data type from VARCHAR to NUMERIC, complete these steps:
-
Add a new column for temporary use. Assign the column a NUMERIC data type, and derive its default value from the existing price column.
Add a new column for temporary use
-
Add a column
temp_price
to tablesales
. You can use the new column temporarily, setting its data type to what you want (NUMERIC), and deriving its default value from theprice
column. Cast the default value for the new column to a NUMERIC data type and query the table:=> ALTER TABLE sales ADD COLUMN temp_price NUMERIC(10,2) DEFAULT SUBSTR(sales.price, 2)::NUMERIC; ALTER TABLE => SELECT * FROM SALES; id | price | temp_price ----+---------+------------ 1 | $50.00 | 50.00 2 | $100.00 | 100.00 (2 rows)
-
Use ALTER TABLE to drop the default expression from the new column
temp_price
. Vertica retains the values stored in this column:=> ALTER TABLE sales ALTER COLUMN temp_price DROP DEFAULT; ALTER TABLE
Drop the original price column
Drop the extraneous price
column. Before doing so, you must first advance the AHM to purge historical data that would otherwise prevent the drop operation:
-
Advance the AHM:
=> SELECT MAKE_AHM_NOW(); MAKE_AHM_NOW ------------------------------- AHM set (New AHM Epoch: 6354) (1 row)
-
Drop the original price column:
=> ALTER TABLE sales DROP COLUMN price CASCADE; ALTER COLUMN
Rename the new column to the original column
You can now rename the temp_price
column to price
:
-
Use
ALTER TABLE
to rename the column:=> ALTER TABLE sales RENAME COLUMN temp_price to price;
-
Query the sales table again:
=> SELECT * FROM sales; id | price ----+-------- 1 | 50.00 2 | 100.00 (2 rows)
3 - Defining column values
You can define a column so Vertica automatically sets its value from an expression through one of the following clauses:
DEFAULT default-expression |
Sets column values to
NoteAltering an existing table column to specify a DEFAULT expression has no effect on existing values in that column. Vertica applies the DEFAULT expression only on new rows when they are added to the table, through load operations such as INSERT and COPY. To refresh all values in a column with the column's DEFAULT expression, update the column as shown above. |
SET USING using-expression |
Sets the column value to using-expression when the function REFRESH_COLUMNS is invoked on that column. This approach is useful for large denormalized (flattened) tables, where multiple columns get their values by querying other tables. |
DEFAULT USING expression |
Sets DEFAULT and SET USING constraints on a column, equivalent to setting separate DEFAULT and SET USING constraints on the same column, where each constraint specifies the same expression. For example, the following column definitions are effectively identical:
DEFAULT USING columns support the same expressions as SET USING columns, and are subject to the same restrictions. |
Supported expressions
DEFAULT and SET USING generally support the same expressions. These include:
-
Queries
-
Other columns in the same table
-
Literals (constants)
-
All operators supported by Vertica
-
The following categories of functions:
Expression restrictions
The following restrictions apply to DEFAULT and SET USING expressions:
-
The return value data type must match or be cast to the column data type.
-
The expression must return a value that conforms to the column bounds. For example, a column that is defined as a
VARCHAR(1)
cannot be set to a default string ofabc
. -
In a temporary table, DEFAULT and SET USING do not support subqueries. If you try to create a temporary table where DEFAULT or SET USING use subquery expressions, Vertica returns an error.
-
A column's SET USING expression cannot specify another column in the same table that also sets its value with SET USING. Similarly, a column's DEFAULT expression cannot specify another column in the same table that also sets its value with DEFAULT, or whose value is automatically set to a sequence. However, a column's SET USING expression can specify another column that sets its value with DEFAULT.
Note
You can set a column's DEFAULT expression from another column in the same table that sets its value with SET USING. However, the DEFAULT column is typically set toNULL
, as it is only set on load operations that initially set the SET USING column toNULL
. -
DEFAULT and SET USING expressions only support one SELECT statement; attempts to include multiple SELECT statements in the expression return an error. For example, given table
t1
:=> SELECT * FROM t1; a | b ---+--------- 1 | hello 2 | world (2 rows)
Attempting to create table
t2
with the following DEFAULT expression returns with an error:=> CREATE TABLE t2 (aa int, bb varchar(30) DEFAULT (SELECT 'I said ')||(SELECT b FROM t1 where t1.a = t2.aa)); ERROR 9745: Expressions with multiple SELECT statements cannot be used in 'set using' query definitions
DEFAULT restrictions
DEFAULT expressions cannot specify volatile functions with ALTER TABLE...ADD COLUMN. To specify volatile functions, use CREATE TABLE or ALTER TABLE...ALTER COLUMN statements.
SET USING restrictions
The following restrictions apply to SET USING expressions:
-
Volatile functions are not allowed.
-
The expression cannot specify a sequence.
-
Vertica limits the use of several meta-functions that copy table data: COPY_TABLE, COPY_PARTITIONS_TO_TABLE, MOVE_PARTITIONS_TO_TABLE, and SWAP_PARTITIONS_BETWEEN_TABLES. The following table describes these limitations:
SET USING columns in... Limitation Source and target table All functions allowed only if each SET USING column in source table has a corresponding SET USING column in target table. Source table only SWAP_PARTITIONS_BETWEEN_TABLES disallowed. Target table only All functions disallowed.
Important
Several restrictions apply to Vertica's ability to refresh a SET USING column with REFRESH_COLUMNS.Disambiguating predicate columns
If a SET USING or DEFAULT query expression joins two columns with the same name, the column names must include their table names. Otherwise, Vertica assumes that both columns reference the dimension table, and the predicate always evaluates to true.
For example, tables orderFact and custDim both include column cid. Flattened table orderFact defines column cust_name with a SET USING query expression. Because the query predicate references columns cid from both tables, the column names are fully qualified:
=> CREATE TABLE public.orderFact
(
...
cid int REFERENCES public.custDim(cid),
cust_name varchar(20) SET USING (
SELECT name FROM public.custDim WHERE (custDIM.cid = orderFact.cid)),
...
)
Examples
Derive a column's default value from another column
- Create table
t
with two columns,date
andstate
, and insert a row of data:
=> CREATE TABLE t (date DATE, state VARCHAR(2));
CREATE TABLE
=> INSERT INTO t VALUES (CURRENT_DATE, 'MA');
OUTPUT
--------
1
(1 row)
=> COMMIT;
COMMMIT
SELECT * FROM t;
date | state
------------+-------
2017-12-28 | MA
(1 row)
- Use ALTER TABLE to add a third column that extracts the integer month value from column
date
:
=> ALTER TABLE t ADD COLUMN month INTEGER DEFAULT date_part('month', date);
ALTER TABLE
- When you query table
t
, Vertica returns the number of the month in columndate
:
=> SELECT * FROM t;
date | state | month
------------+-------+-------
2017-12-28 | MA | 12
(1 row)
Update default column values
- Update table
t
by subtracting 30 days fromdate
:
=> UPDATE t SET date = date-30;
OUTPUT
--------
1
(1 row)
=> COMMIT;
COMMIT
=> SELECT * FROM t;
date | state | month
------------+-------+-------
2017-11-28 | MA | 12
(1 row)
The value in month
remains unchanged.
- Refresh the default value in
month
from columndate
:
=> UPDATE t SET month=DEFAULT;
OUTPUT
--------
1
(1 row)
=> COMMIT;
COMMIT
=> SELECT * FROM t;
date | state | month
------------+-------+-------
2017-11-28 | MA | 11
(1 row)
Derive a default column value from user-defined scalar function
This example shows a user-defined scalar function that adds two integer values. The function is called add2ints
and takes two arguments.
-
Develop and deploy the function, as described in Scalar functions (UDSFs).
-
Create a sample table,
t1
, with two integer columns:
=> CREATE TABLE t1 ( x int, y int );
CREATE TABLE
- Insert some values into t1:
=> insert into t1 values (1,2);
OUTPUT
--------
1
(1 row)
=> insert into t1 values (3,4);
OUTPUT
--------
1
(1 row)
- Use ALTER TABLE to add a column to
t1
, with the default column value derived from the UDSFadd2ints
:
alter table t1 add column z int default add2ints(x,y);
ALTER TABLE
- List the new column:
select z from t1;
z
----
3
7
(2 rows)
Table with a SET USING column that queries another table for its values
- Define tables
t1
andt2
. Columnt2.b
is defined to get its data from columnt1.b
, through the query in its SET USING clause:
=> CREATE TABLE t1 (a INT PRIMARY KEY ENABLED, b INT);
CREATE TABLE
=> CREATE TABLE t2 (a INT, alpha VARCHAR(10),
b INT SET USING (SELECT t1.b FROM t1 WHERE t1.a=t2.a))
ORDER BY a SEGMENTED BY HASH(a) ALL NODES;
CREATE TABLE
Important
The definition for table t2
includes SEGMENTED BY and ORDER BY clauses that exclude SET USING column b
. If these clauses are omitted, Vertica creates an auto-projection for this table that specifies column b
in its SEGMENTED BY and ORDER BY clauses . Inclusion of a SET USING column in any projection's segmentation or sort order prevents function REFRESH_COLUMNS from populating this column. Instead, it returns with an error.
For details on this and other restrictions, see REFRESH_COLUMNS.
- Populate the tables with data:
=> INSERT INTO t1 VALUES(1,11);
=> INSERT INTO t1 VALUES(2,22);
=> INSERT INTO t1 VALUES(3,33);
=> INSERT INTO t1 VALUES(4,44);
=> INSERT INTO t2 VALUES(1,'aa');
=> INSERT INTO t2 VALUES(2,'bb');
=> COMMIT;
COMMIT
- View the data in table
t2
: Column in SET USING columnb
is empty, pending invocation of Vertica function REFRESH_COLUMNS:
=> SELECT * FROM t2;
a | alpha | b
---+-------+---
1 | aa |
2 | bb |
(2 rows)
- Refresh the column data in table
t2
by calling function REFRESH_COLUMNS:
=> SELECT REFRESH_COLUMNS ('t2','b', 'REBUILD');
REFRESH_COLUMNS
---------------------------
refresh_columns completed
(1 row)
In this example, REFRESH_COLUMNS is called with the optional argument REBUILD. This argument specifies to replace all data in SET USING column b
. It is generally good practice to call REFRESH_COLUMNS with REBUILD on any new SET USING column. For details, see REFRESH_COLUMNS.
- View data in refreshed column
b
, whose data is obtained from tablet1
as specified in the column's SET USING query:
=> SELECT * FROM t2 ORDER BY a;
a | alpha | b
---+-------+----
1 | aa | 11
2 | bb | 22
(2 rows)
Expressions with correlated subqueries
DEFAULT and SET USING expressions support subqueries that can obtain values from other tables, and use those with values in the current table to compute column values. The following example adds a column gmt_delivery_time
to fact table customer_orders
. The column specifies a DEFAULT expression to set values in the new column as follows:
- Calls meta-function NEW_TIME, which performs the following tasks:
-
Uses customer keys in
customer_orders
to query thecustomers
dimension table for customer time zones. -
Uses the queried time zone data to convert local delivery times to GMT.
- Populates the
gmt_delivery_time
column with the converted values.
=> CREATE TABLE public.customers(
customer_key int,
customer_name varchar(64),
customer_address varchar(64),
customer_tz varchar(5),
...);
=> CREATE TABLE public.customer_orders(
customer_key int,
order_number int,
product_key int,
product_version int,
quantity_ordered int,
store_key int,
date_ordered date,
date_shipped date,
expected_delivery_date date,
local_delivery_time timestamptz,
...);
=> ALTER TABLE customer_orders ADD COLUMN gmt_delivery_time timestamp
DEFAULT NEW_TIME(customer_orders.local_delivery_time,
(SELECT c.customer_tz FROM customers c WHERE (c.customer_key = customer_orders.customer_key)),
'GMT');