Loading UDxs
User-defined extensions (UDxs) are contained in libraries. A library can contain multiple UDxs. To add UDxs to Vertica, you must:
-
Deploy the library (once per library).
-
Create each UDx (once per UDx).
If you are using UDxs written in Java, you must also set up a Java runtime environment. See Installing Java on Vertica hosts.
Deploying libraries
To deploy a library to your Vertica database:
-
Copy the UDx shared library file (
.so
), Python file, Java JAR file, or R functions file that contains your function to a node on your Vertica cluster. You do not need to copy it to every node. -
Connect to the node where you copied the library (for example, using vsql).
-
Add your library to the database catalog using the CREATE LIBRARY statement.
=> CREATE LIBRARY libname AS '/path_to_lib/filename' LANGUAGE 'language';
libname
is the name you want to use to reference the library.path_to_lib/filename
is the fully-qualified path to the library or JAR file you copied to the host.language
is the implementation language.For example, if you created a JAR file named
TokenizeStringLib.jar
and copied it to the dbadmin account's home directory, you would use this command to load the library:=> CREATE LIBRARY tokenizelib AS '/home/dbadmin/TokenizeStringLib.jar' LANGUAGE 'Java';
You can load any number of libraries into Vertica.
Privileges
Superusers can create, modify, and drop any library. Users with the UDXDEVELOPER role or explicit grants can also act on libraries, as shown in the following table:
Operation | Requires |
---|---|
CREATE LIBRARY | UDXDEVELOPER |
Replace a library (CREATE OR REPLACE LIBRARY) |
UDXDEVELOPER and one of:
|
DROP LIBRARY |
UDXDEVELOPER and one of:
|
ALTER LIBRARY | UDXDEVELOPER and owner |
Creating UDx functions
After the library is loaded, define individual UDxs using SQL statements such as CREATE FUNCTION and CREATE SOURCE. These statements assign SQL function names to the extension classes in the library. They add the UDx to the database catalog and remain available after a database restart.
The statement you use depends on the type of UDx you are declaring, as shown in the following table:
UDx Type | SQL Statement |
---|---|
Aggregate Function (UDAF) | CREATE AGGREGATE FUNCTION |
Analytic Function (UDAnF) | CREATE ANALYTIC FUNCTION |
Scalar Function (UDSF) | CREATE FUNCTION (scalar) |
Transform Function (UDTF) | CREATE TRANSFORM FUNCTION |
Load (UDL): Source | CREATE SOURCE |
Load (UDL): Filter | CREATE FILTER |
Load (UDL): Parser | CREATE PARSER |
If a UDx of the given name already exists, you can replace it or instruct Vertica to not replace it. To replace it, use the OR REPLACE syntax, as in the following example:
=> CREATE OR REPLACE TRANSFORM FUNCTION tokenize
AS LANGUAGE 'C++' NAME 'TokenFactory' LIBRARY TransformFunctions;
CREATE TRANSFORM FUNCTION
You might want to replace an existing function to change between fenced and unfenced modes.
Alternatively, you can use IF NOT EXISTS to prevent the function from being created again if it already exists. You might want to use this in upgrade or test scripts that require, and therefore load, UDxs. By using IF NOT EXISTS, you preserve the original definition including fenced status. The following example shows this syntax:
--- original creation:
=> CREATE TRANSFORM FUNCTION tokenize
AS LANGUAGE 'C++' NAME 'TokenFactory' LIBRARY TransformFunctions NOT FENCED;
CREATE TRANSFORM FUNCTION
--- function is not replaced (and is still unfenced):
=> CREATE TRANSFORM FUNCTION IF NOT EXISTS tokenize
AS LANGUAGE 'C++' NAME 'TokenFactory' LIBRARY TransformFunctions FENCED;
CREATE TRANSFORM FUNCTION
After you add the UDx to the database, you can use your extension within SQL statements. The database superuser can grant access privileges to the UDx for users. See GRANT (user defined extension) for details.
When you call a UDx, Vertica creates an instance of the UDx class on each node in the cluster and provides it with the data it needs to process.