Creating a polymorphic UDx

Polymorphic UDxs accept any number and type of argument that the user supplies.

Polymorphic UDxs accept any number and type of argument that the user supplies. Transform functions (UDTFs), analytic functions (UDAnFs), and aggregate functions (UDAFs) can define their output return types at runtime, usually based on the input arguments. For example, a UDTF that adds two numbers could return an integer or a float, depending on the input types.

Vertica does not check the number or types of argument that the user passes to the UDx—it just passes the UDx all of the arguments supplied by the user. It is up to your polymorphic UDx's main processing function (for example, processBlock() in user-defined scalar functions) to examine the number and types of arguments it received and determine if it can handle them. UDxs support up to 9800 arguments.

Polymorphic UDxs are more flexible than using multiple factory classes for your function (see Overloading your UDx). They also allow you to write more concise code, instead of writing versions for each data type. The tradeoff is that your polymorphic function needs to perform more work to determine whether it can process its arguments.

Your polymorphic UDx declares that it accepts any number of arguments in its factory's getPrototype() function by calling the addAny() function on the ColumnTypes object that defines its arguments, as follows:

    // C++ example
    void getPrototype(ServerInterface &srvInterface,
                      ColumnTypes &argTypes,
                      ColumnTypes &returnType)
    {
        argTypes.addAny(); // Must be only argument type.
        returnType.addInt(); // or whatever the function returns
    }

This "any parameter" argument type is the only one that your function can declare. You cannot define required arguments and then call addAny() to declare the rest of the signature as optional. If your function has requirements for the arguments it accepts, your process() function must enforce them.

The getPrototype() example shown previously accepts any type and declares that it returns an integer. The following example shows a version of the method that defers resolving the return type until runtime. You can only use the "any" return type for transform and analytic functions.

    void getPrototype(ServerInterface &srvInterface,
                      ColumnTypes &argTypes,
                      ColumnTypes &returnType)
    {
        argTypes.addAny();
        returnType.addAny(); // type determined at runtime
    }

If you use polymorphic return types, you must also define getReturnType() in your factory. This function is called at runtime to determine the actual return type. See C++ example: PolyNthValue for an example.

Polymorphic UDxs and schema search paths

If a user does not supply a schema name as part of a UDx call, Vertica searches each schema in the schema search path for a function whose name and signature match the function call. See Setting search paths for more information about schema search paths.

Because polymorphic UDxs do not have specific signatures associated with them, Vertica initially skips them when searching for a function to handle the function call. If none of the schemas in the search path contain a UDx whose name and signature match the function call, Vertica searches the schema search path again for a polymorphic UDx whose name matches the function name in the function call.

This behavior gives precedence to a UDx whose signature exactly matches the function call. It allows you to create a "catch-all" polymorphic UDx that Vertica calls only when none of the non-polymorphic UDxs with the same name have matching signatures.

This behavior may cause confusion if your users expect the first polymorphic function in the schema search path to handle a function call. To avoid confusion, you should:

  • Avoid using the same name for different UDxs. You should always uniquely name UDxs unless you intend to create an overloaded UDx with multiple signatures.

  • When you cannot avoid having UDxs with the same name in different schemas, always supply the schema name as part of the function call. Using the schema name prevents ambiguity and ensures that Vertica uses the correct UDx to process your function calls.