AnalyticFunction class

The AnalyticFunction class performs the analytic processing.

The AnalyticFunction class performs the analytic processing. Your subclass must define the processPartition() method to perform the operation. It may define methods to set up and tear down the function.

Performing the operation

The processPartition() method reads a partition of data, performs some sort of processing, and outputs a single value for each input row.

Vertica calls processPartition() once for each partition of data. It supplies the partition using an AnalyticPartitionReader object from which you read its input data. In addition, there is a unique method on this object named isNewOrderByKey(), which returns a Boolean value indicating whether your function has seen a row with the same ORDER BY key (or keys). This method is very useful for analytic functions (such as the example RANK function) which need to handle rows with identical ORDER BY keys differently than rows with different ORDER BY keys.

Note

You can specify multiple ORDER BY columns in the SQL query you use to call your UDAnF. The isNewOrderByKey method returns true if any of the ORDER BY keys are different than the previous row.

Once your method has finished processing the row of data, you advance it to the next row of input by calling next() on AnalyticPartitionReader.

Your method writes its output value using an AnalyticPartitionWriter object that Vertica supplies as a parameter to processPartition(). This object has data-type-specific methods to write the output value (such as setInt()). After setting the output value, call next() on AnalyticPartitionWriter to advance to the next row in the output.

Note

You must be sure that your function produces a row of output for each row of input in the partition. You must also not output more rows than are in the partition, otherwise the zygote size process (if running in Fenced and unfenced modes) or Vertica itself could generate an out of bounds error.

Setting up and tearing down

The AnalyticFunction class defines two additional methods that you can optionally implement to allocate and free resources: setup() and destroy(). You should use these methods to allocate and deallocate resources that you do not allocate through the UDx API (see Allocating resources for UDxs for details).

The AnalyticFunction API provides the following methods for extension by subclasses:

virtual void setup(ServerInterface &srvInterface,
        const SizedColumnTypes &argTypes);

virtual void processPartition (ServerInterface &srvInterface,
        AnalyticPartitionReader &input_reader,
        AnalyticPartitionWriter &output_writer)=0;

virtual void cancel(ServerInterface &srvInterface);

virtual void destroy(ServerInterface &srvInterface, const SizedColumnTypes &argTypes);

The AnalyticFunction API provides the following methods for extension by subclasses:

public void setup(ServerInterface srvInterface, SizedColumnTypes argTypes);

public abstract void processPartition (ServerInterface srvInterface,
        AnalyticPartitionReader input_reader, AnalyticPartitionWriter output_writer)
        throws UdfException, DestroyInvocation;

protected void cancel(ServerInterface srvInterface);

public void destroy(ServerInterface srvInterface, SizedColumnTypes argTypes);