Scalar functions (UDSFs)
A user-defined scalar function (UDSF) returns a single value for each row of data it reads. You can use a UDSF anywhere you can use a built-in Vertica function. You usually develop a UDSF to perform data manipulations that are too complex or too slow to perform using SQL statements and functions. UDSFs also let you use analytic functions provided by third-party libraries within Vertica while still maintaining high performance.
A UDSF returns a single column. You can automatically return multiple values in a ROW. A ROW is a group of property-value pairs. In the following example, div_with_rem is a UDSF that performs a division operation, returning the quotient and remainder as integers:
=> SELECT div_with_rem(18,5);
div_with_rem
------------------------------
{"quotient":3,"remainder":3}
(1 row)
A ROW returned from a UDSF cannot be used as an argument to COUNT.
Alternatively, you can construct a complex return value yourself, as described in Complex Types as Arguments.
Your UDSF must return a value for every input row (unless it generates an error; see Handling errors for details). Failure to return a value for an input row results in incorrect results and potentially destabilizes the Vertica server if not run in Fenced and unfenced modes.
A UDSF can have up to 9800 arguments.
In this section
- ScalarFunction class
- ScalarFunctionFactory class
- Setting null input and volatility behavior
- Improving query performance (C++ only)
- C++ example: Add2Ints
- Python example: currency_convert
- Python example: validate_url
- Python example: matrix multiplication
- R example: SalesTaxCalculator
- R example: kmeans
- C++ example: using complex types
- C++ example: returning multiple values
- C++ example: calling a UDSF from a check constraint