TransformFunctionFactory class

The TransformFunctionFactory class tells Vertica metadata about your UDTF: its number of parameters and their data types, as well as function properties and the data type of the return value.

The TransformFunctionFactory class tells Vertica metadata about your UDTF: its number of parameters and their data types, as well as function properties and the data type of the return value. It also instantiates a subclass of TransformFunction.

You must implement the following methods in your TransformFunctionFactory:

  • getPrototype() returns two ColumnTypes objects that describe the columns your UDTF takes as input and returns as output.

  • getReturnType() tells Vertica details about the output values: the width of variable-sized data types (such as VARCHAR) and the precision of data types that have settable precision (such as TIMESTAMP). You can also set the names of the output columns using this function. While this method is optional for UDxs that return single values, you must implement it for UDTFs.

  • createTransformFunction() instantiates your TransformFunction subclass.

For UDTFs written in C++ and Python, you can implement the getTransformFunctionProperties() method to set transform function class properties, including:

  • isExploder: By default False, indicates whether a single-phase UDTF performs a transform from one input row to a result set of N rows, often called a one-to-many transform. If set to True, each partition to the UDTF must consist of exactly one input row. When a UDTF is labeled as one-to-many, Vertica is able to optimize query plans and users can write SELECT queries that include any expression and do not require an OVER clause. For more information about UDTF partitioning options and instructions on how to set this class property, see Partitioning options for UDTFs. See Python example: explode for an in-depth example detailing a one-to-many UDTF.

For transform functions written in C++, you can provide information that can help with query optimization. See Improving query performance (C++ only).

API

The TransformFunctionFactory API provides the following methods for extension by subclasses:

virtual TransformFunction *
    createTransformFunction (ServerInterface &srvInterface)=0;

virtual void getPrototype(ServerInterface &srvInterface,
            ColumnTypes &argTypes, ColumnTypes &returnType)=0;

virtual void getReturnType(ServerInterface &srvInterface,
            const SizedColumnTypes &argTypes,
            SizedColumnTypes &returnType)=0;

virtual void getParameterType(ServerInterface &srvInterface,
            SizedColumnTypes &parameterTypes);

virtual void getTransformFunctionProperties(ServerInterface &srvInterface,
            const SizedColumnTypes &argTypes,
            Properties &properties);

The TransformFunctionFactory API provides the following methods for extension by subclasses:

public abstract TransformFunction createTransformFunction(ServerInterface srvInterface);

public abstract void getPrototype(ServerInterface srvInterface, ColumnTypes argTypes, ColumnTypes returnType);

public abstract void getReturnType(ServerInterface srvInterface, SizedColumnTypes argTypes,
        SizedColumnTypes returnType) throws UdfException;

public void getParameterType(ServerInterface srvInterface, SizedColumnTypes parameterTypes);

The TransformFunctionFactory API provides the following methods for extension by subclasses:

def createTransformFunction(self, srv)

def getPrototype(self, srv_interface, arg_types, return_type)

def getReturnType(self, srv_interface, arg_types, return_type)

def getParameterType(self, server_interface, parameterTypes)

def getTransformFunctionProperties(self, server_interface, arg_types)

Implement the Factory function API to define a transform function factory:

FunctionNameFactory <- function() {
  list(name    = FunctionName,
       udxtype = c("scalar"),
       intype  = c("int"),
       outtype = c("int"))
}