FilterFactory class
If you write a filter, you must also write a filter factory to produce filter instances. To do so, subclass the FilterFactory class.
Your subclass performs the initial validation and planning of the function execution and instantiates UDFilter objects on each host that will be filtering data.
Filter factories are singletons. Your subclass must be stateless, with no fields containing data. The subclass also must not modify any global variables.
FilterFactory methods
The FilterFactory class defines the following methods. Your subclass must override the prepare() method. It may override the other methods.
Setting up
Vertica calls plan() once on the initiator node, to perform the following tasks:
- Check any parameters that have been passed from the function call in the COPY statement, and provide error messages if there are any issues. You read the parameters by getting a ParamReader object from the instance of ServerInterface passed into your plan() method.
- Store any information that the individual hosts need in order to filter data in the PlanContext instance passed as a parameter. For example, you could store details of the input format that the filter will read and the output format that the filter should produce. The plan() method runs only on the initiator node, and the prepare() method runs on each host reading from a data source. Therefore, this object is the only means of communication between them.
  You store data in the PlanContext by getting a ParamWriter object from the getWriter() method. You then write parameters by calling methods on the ParamWriter such as setString.
  Note: ParamWriter offers only the ability to store simple data types. For complex types, you need to serialize the data in some manner and store it as a string or long string.
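The note about complex types can be illustrated with a small, self-contained Python sketch. It does not use the Vertica SDK: `StringOnlyStore` is a hypothetical stand-in for a store that accepts only strings, and the `input_format` name and the settings layout are invented for illustration. The idea is simply to serialize a structured value to a JSON string during planning and parse it back on the host that filters data.

```python
import json


class StringOnlyStore:
    """Hypothetical stand-in for a store that accepts only simple values.

    This is NOT the Vertica ParamWriter/ParamReader API; it only mimics the
    restriction that stored values must be simple types such as strings.
    """

    def __init__(self):
        self._values = {}

    def set_string(self, name, value):
        if not isinstance(value, str):
            raise TypeError("only strings can be stored")
        self._values[name] = value

    def get_string(self, name):
        return self._values[name]


def store_format(store, input_format):
    """Planning side: serialize a complex value to a JSON string."""
    store.set_string("input_format", json.dumps(input_format, sort_keys=True))


def load_format(store):
    """Filtering side: parse the JSON string back into a structure."""
    return json.loads(store.get_string("input_format"))


# Invented example settings for a filter.
settings = {"encoding": "utf-16", "strip_bom": True, "columns": ["a", "b"]}
store = StringOnlyStore()
store_format(store, settings)
restored = load_format(store)
```

JSON is only one possible encoding; any format that round-trips through a string would serve the same purpose.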
Creating filters
Vertica calls prepare() to create and initialize your filter. It calls this method once on each node that will perform filtering. Vertica automatically selects the best nodes to complete the work based on available resources. You cannot specify the nodes on which the work is done.
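The division of labor between plan() and prepare() can be sketched in plain Python, without the Vertica SDK. Everything here is a hypothetical stand-in: `SketchFilterFactory`, `SketchFilter`, the dictionary used as the plan context, the `mode` parameter, and the node names. The sketch only shows one planning call followed by one prepare call per node, with the factory itself holding no data.

```python
class SketchFilter:
    """Hypothetical filter that upper-cases text or passes it through."""

    def __init__(self, uppercase):
        self.uppercase = uppercase

    def process(self, text):
        return text.upper() if self.uppercase else text


class SketchFilterFactory:
    """Hypothetical stateless factory: no instance fields, no globals."""

    def plan(self, params, plan_context):
        # Validate caller-supplied parameters once, on the initiator.
        mode = params.get("mode", "pass")
        if mode not in ("pass", "upper"):
            raise ValueError("unknown mode: " + mode)
        # The plan context is the only channel from plan() to prepare().
        plan_context["mode"] = mode

    def prepare(self, plan_context):
        # Build a fresh filter from the planned settings.
        return SketchFilter(uppercase=(plan_context["mode"] == "upper"))


factory = SketchFilterFactory()
plan_context = {}
factory.plan({"mode": "upper"}, plan_context)  # once, on the initiator

# One prepare() call per node; each node gets its own copy of the context.
filters = {
    node: factory.prepare(dict(plan_context)) for node in ("node1", "node2")
}
results = {node: f.process("abc") for node, f in filters.items()}
```

Because the factory keeps no fields, every filter it creates depends only on what was written to the plan context.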
Defining parameters
Implement getParameterType() to define the names and types of parameters that your filter uses. Vertica uses this information to warn callers about unknown or missing parameters. Vertica ignores unknown parameters and uses default values for missing parameters. While you should define the types and parameters for your function, you are not required to override this method.
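As a rough illustration of the behavior described above, the following plain-Python sketch resolves caller-supplied parameters against a declared set. It is not how Vertica implements the check, and it does not use the SDK: `DECLARED`, `resolve_parameters`, the parameter names, and the default values are all hypothetical.

```python
import warnings

# Hypothetical declaration: parameter name -> (type, default value).
DECLARED = {
    "delimiter": (str, ","),
    "skip_rows": (int, 0),
}


def resolve_parameters(supplied):
    """Return declared parameters, warning about unknown or missing names."""
    resolved = {}
    # Unknown parameters produce a warning and are otherwise ignored.
    for name in supplied:
        if name not in DECLARED:
            warnings.warn("unknown parameter ignored: " + name)
    # Missing parameters produce a warning and fall back to the default.
    for name, (expected_type, default) in DECLARED.items():
        if name in supplied:
            value = supplied[name]
            if not isinstance(value, expected_type):
                raise TypeError(name + " must be " + expected_type.__name__)
            resolved[name] = value
        else:
            warnings.warn("missing parameter, using default: " + name)
            resolved[name] = default
    return resolved


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    params = resolve_parameters({"delimiter": "|", "colour": "red"})
```

In this sketch the call succeeds despite the unknown `colour` parameter and the missing `skip_rows` parameter; both are reported as warnings only.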
API
In C++, the FilterFactory API provides the following methods for extension by subclasses:
virtual void plan(ServerInterface &srvInterface, PlanContext &planCtxt);
virtual UDFilter * prepare(ServerInterface &srvInterface, PlanContext &planCtxt)=0;
virtual void getParameterType(ServerInterface &srvInterface, SizedColumnTypes &parameterTypes);
After creating your FilterFactory, you must register it with the RegisterFactory macro.
In Java, the FilterFactory API provides the following methods for extension by subclasses:
public void plan(ServerInterface srvInterface, PlanContext planCtxt)
throws UdfException;
public abstract UDFilter prepare(ServerInterface srvInterface, PlanContext planCtxt)
throws UdfException;
public void getParameterType(ServerInterface srvInterface, SizedColumnTypes parameterTypes);
In Python, the FilterFactory API provides the following methods for extension by subclasses:
class PyFilterFactory(vertica_sdk.FilterFactory):
    def __init__(self):
        pass
    def plan(self):
        pass
    def prepare(self, planContext):
        # Implement this method to create and return the filter.
        pass