Resource use for C++ UDxs
Your UDxs consume at least a small amount of memory by instantiating classes and creating local variables. This basic memory usage by UDxs is small enough that you do not need to be concerned about it.
If your UDx needs to allocate more than one or two megabytes of memory for data structures, or requires access to additional resources such as files, you must inform Vertica about its resource use. Vertica can then ensure that the resources your UDx requires are available before running a query that uses it. Even moderate memory use (10MB per invocation of a UDx, for example) can become an issue if there are many simultaneous queries that call it.
If your UDx allocates its own memory, you must make absolutely sure it properly frees it. Failing to free even a single byte of allocated memory can have significant consequences at scale. Instead of having your code allocate its own memory, you should use the C++ vt_alloc macro, which uses Vertica's own memory manager to allocate and track memory. This memory is guaranteed to be properly disposed of when your UDx completes execution. See Allocating resources for UDxs for more information.
1 - Allocating resources for UDxs
You have two options for allocating memory and file handles for your user-defined extensions (UDxs):
Use Vertica SDK macros to allocate resources. This is the best method, since it uses Vertica's own resource manager, and guarantees that resources used by your UDx are reclaimed. See Allocating resources with the SDK macros.
While not the recommended option, you can allocate resources in your UDxs yourself using standard C++ methods (instantiating objects using new, allocating memory blocks using malloc(), etc.). You must manually free these resources before your UDx exits.
You must be extremely careful if you choose to allocate your own resources in your UDx. Failing to free resources properly has a significant negative impact, especially if your UDx is running in unfenced mode.
Whichever method you choose, you usually allocate resources in a function named setup() in your UDx class. This function is called after your UDx function object is instantiated, but before Vertica calls it to process data.
If you allocate memory on your own in the setup() function, you must free it in a corresponding function named destroy(). This function is called after your UDx has performed all of its processing. This function is also called if your UDx returns an error (see Handling errors).
Always use the setup() and destroy() functions to allocate and free resources instead of your own constructors and destructors. The memory for your UDx object is allocated from one of Vertica's own memory pools. Vertica always calls your UDx's destroy() function before it deallocates the object's memory. There is no guarantee that your UDx's destructor will be called before the object is deallocated. Using the destroy() function ensures that your UDx has a chance to free its allocated resources before it is destroyed.
The following code fragment demonstrates allocating and freeing memory using a setup() and destroy() function.
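The setup()/destroy() pairing described above can be illustrated with a minimal sketch that does not depend on the SDK. The class name MemoryAllocationSketch and its holdsMemory() and size() helpers are hypothetical stand-ins. A real UDx derives from an SDK base class, and its setup() and destroy() functions take SDK-defined arguments, so treat this only as an outline of where allocation and release belong.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a UDx function class (not an SDK type).
// It mirrors only the rule from the text: allocate in setup(), free in destroy().
class MemoryAllocationSketch {
public:
    // The constructor deliberately allocates nothing.
    MemoryAllocationSketch() : buffer_(nullptr), size_(0) {}

    // Acquire resources here, before any data is processed.
    // Returns false if the allocation fails.
    bool setup(std::size_t bytes) {
        assert(buffer_ == nullptr);  // setup() must not run twice without destroy()
        buffer_ = static_cast<char*>(std::malloc(bytes));
        size_ = (buffer_ != nullptr) ? bytes : 0;
        return buffer_ != nullptr;
    }

    // Release everything setup() acquired. Safe to call more than once,
    // and safe to call after a failed setup().
    void destroy() {
        std::free(buffer_);  // free(nullptr) is a no-op
        buffer_ = nullptr;
        size_ = 0;
    }

    bool holdsMemory() const { return buffer_ != nullptr; }
    std::size_t size() const { return size_; }

private:
    char* buffer_;
    std::size_t size_;
};
```

Because destroy() resets the pointer after freeing it, running it on both the normal path and the error path does not free the buffer twice.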
2 - Allocating resources with the SDK macros
The Vertica SDK provides three macros to allocate memory:
vt_alloc allocates a block of memory to fit a specific data type (vint, struct, etc.).
vt_allocArray allocates a block of memory to hold an array of a specific data type.
vt_allocSize allocates an arbitrarily-sized block of memory.
All of these macros allocate their memory from memory pools managed by Vertica. The main benefit of allowing Vertica to manage your UDx's memory is that the memory is automatically reclaimed after your UDx has finished. This ensures there are no memory leaks in your UDx.
Because Vertica frees this memory automatically, do not attempt to free any of the memory you allocate through any of these macros. Attempting to free this memory results in run-time errors.
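As a rough mental model of why pool-managed memory must not be freed by the caller, consider the toy pool below. It is not the Vertica allocator, and it says nothing about how the SDK macros are implemented. The ToyPool class and its member names are hypothetical. The sketch only shows the ownership pattern: the pool keeps every block it hands out and releases them all together, so a caller that also freed a block would release it twice.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy pool, for illustration only (not an SDK type).
class ToyPool {
public:
    // Hands out a zero-initialized block that the pool continues to own.
    // The caller must never free or delete the returned pointer.
    void* allocate(std::size_t bytes) {
        blocks_.push_back(std::make_unique<unsigned char[]>(bytes));
        return blocks_.back().get();
    }

    // Number of blocks the pool currently owns.
    std::size_t liveBlocks() const { return blocks_.size(); }

    // Releases every block at once, the way a pool is torn down
    // when the work that used it has finished.
    void releaseAll() { blocks_.clear(); }

private:
    std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};
```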
3 - Informing Vertica of resource requirements
When you run your UDx in fenced mode, Vertica monitors its use of memory and file handles. If your UDx uses more than a few megabytes of memory or any file handles, it should tell Vertica about its resource requirements. Knowing the resource requirements of your UDx allows Vertica to determine whether it can run the UDx immediately or needs to queue the request until enough resources become available to run it.
Determining how much memory your UDx requires can be difficult in some cases. For example, if your UDx extracts unique data elements from a data set, there is potentially no bound on the number of data items. In this case, a useful technique is to run your UDx in a test environment and monitor its memory use on a node as it handles several differently-sized queries, then extrapolate its memory use based on the worst-case scenario it may face in your production environment. Whichever approach you use, it is a good idea to add a safety margin to the amount of memory you tell Vertica your UDx uses.
The information on your UDx's resource needs that you pass to Vertica is used when planning the query execution. There is no way to change the amount of resources your UDx requests from Vertica while the UDx is actually running.
Your UDx informs Vertica of its resource needs by implementing the getPerInstanceResources() function in its factory class (see Vertica::UDXFactory::getPerInstanceResources() in the SDK documentation). If your UDx's factory class implements this function, Vertica calls it to determine the resources your UDx requires.
The getPerInstanceResources() function receives an instance of the Vertica::VResources struct. This struct contains fields that set the amount of memory and the number of file handles your UDx needs. By default, the Vertica server allocates zero bytes of memory and 100 file handles for each instance of your UDx.
Your implementation of the getPerInstanceResources() function sets the fields in the VResources struct based on the maximum resources your UDx may consume for each instance of the UDx function. So, if your UDx's processBlock() function creates a data structure that uses at most 100MB of memory, your UDx must set the VResources.scratchMemory field to at least 104857600 (the number of bytes in 100MB). Leave yourself a safety margin by increasing the number beyond what your UDx should normally consume. In this example, allocating 115000000 bytes (just under 110MB) is a good idea.
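The byte figures above follow from treating 1MB as 1024 × 1024 bytes. The sketch below checks that arithmetic at compile time. The helper names megabytesToBytes and withMargin are hypothetical and are not part of the SDK.

```cpp
#include <cstdint>

// Bytes in one megabyte, using the binary convention (1MB = 1024 * 1024 bytes).
constexpr std::int64_t kBytesPerMB = 1024LL * 1024LL;

// Converts a size in megabytes to bytes.
constexpr std::int64_t megabytesToBytes(std::int64_t mb) {
    return mb * kBytesPerMB;
}

// Hypothetical helper: pads an estimate by a whole-number percentage margin.
constexpr std::int64_t withMargin(std::int64_t bytes, int percent) {
    return bytes + bytes * percent / 100;
}

// 100MB is 104857600 bytes, as stated in the text.
static_assert(megabytesToBytes(100) == 104857600, "100MB in bytes");
// 115000000 bytes sits between 100MB and 110MB (115343360 bytes).
static_assert(115000000 > megabytesToBytes(100), "above the 100MB estimate");
static_assert(115000000 < megabytesToBytes(110), "just under 110MB");
```

A 10 percent margin on the 100MB estimate, withMargin(megabytesToBytes(100), 10), gives exactly 110MB, which is slightly more than the 115000000 bytes used in the example.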
The following ScalarFunctionFactory class demonstrates implementing getPerInstanceResources() to inform Vertica about the memory requirements of the MemoryAllocationExample class shown in Allocating resources for UDxs. It tells Vertica that the UDSF requires 510MB of memory (which is a bit more than the UDSF actually allocates, to be on the safe side).
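The shape of such a factory can be sketched without the SDK. Everything below is a mock: MockVResources and MemoryAllocationExampleFactorySketch are hypothetical names, and the fileHandles field name is an assumption, since only scratchMemory is named in the text. The real function is a member of a factory class derived from the SDK's factory types, and its exact signature should be taken from the SDK documentation.

```cpp
#include <cstdint>

// Hypothetical stand-in for Vertica::VResources (not the SDK definition).
struct MockVResources {
    std::int64_t scratchMemory = 0;  // bytes; the text gives zero as the default
    int fileHandles = 100;           // assumed field name; 100 is the stated default
};

// Hypothetical stand-in for a UDSF factory class.
class MemoryAllocationExampleFactorySketch {
public:
    // Reports the worst-case memory one instance of the function may use:
    // 510MB, a little above the 500MB the example function allocates.
    void getPerInstanceResources(MockVResources& res) const {
        res.scratchMemory = 510LL * 1024 * 1024;
    }
};
```

The value is computed with a 64-bit literal (510LL) so that the multiplication cannot overflow a 32-bit int for larger sizes.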
4 - Setting memory limits for fenced-mode UDxs
Vertica calls a fenced-mode UDx's implementation of Vertica::UDXFactory::getPerInstanceResources() to determine if there are enough free resources to run the query containing the UDx (see Informing Vertica of resource requirements). Since these reports are not generated by actual memory use, they can be inaccurate. Once started by Vertica, a UDx could allocate far more memory or file handles than it reported it needs.
The FencedUDxMemoryLimitMB configuration parameter lets you create an absolute memory limit for UDxs. Any attempt by a UDx to allocate more memory than this limit results in a bad_alloc exception. For an example of setting FencedUDxMemoryLimitMB, see How resource limits are enforced.
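Handling that exception follows the usual C++ pattern of catching std::bad_alloc around the allocation. The sketch below simulates the limit in ordinary code so that it can run anywhere. The functions allocateWithLimit and trySetup are hypothetical, and the limit check stands in for the enforcement Vertica applies to the fenced-mode process. A real UDx would report the failure through the SDK's error-reporting mechanism rather than return a string.

```cpp
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Hypothetical stand-in for a memory ceiling: throws std::bad_alloc when the
// request exceeds limitBytes, otherwise returns a buffer of the requested size.
inline std::vector<char> allocateWithLimit(std::size_t bytes, std::size_t limitBytes) {
    if (bytes > limitBytes) {
        throw std::bad_alloc();
    }
    return std::vector<char>(bytes);
}

// Attempts the allocation a setup() function would perform.
// Returns an empty string on success, or an error message on failure.
inline std::string trySetup(std::size_t bytes, std::size_t limitBytes) {
    try {
        std::vector<char> buffer = allocateWithLimit(bytes, limitBytes);
        (void)buffer;  // a real UDx would keep the buffer for later processing
        return "";
    } catch (const std::bad_alloc&) {
        // Convert the exception into a message instead of letting it escape.
        return "Couldn't allocate memory";
    }
}
```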
5 - How resource limits are enforced
Before running a query, Vertica determines how much memory it requires to run. If the query contains a fenced-mode UDx which implements the getPerInstanceResources() function in its factory class, Vertica calls it to determine the amount of memory the UDx needs and adds this to the total required for the query. Based on these requirements, Vertica decides how to handle the query:
If the total amount of memory required (including the amount that the UDxs report that they need) is larger than the session's MEMORYCAP or resource pool's MAXMEMORYSIZE setting, Vertica rejects the query. For more information about resource pools, see Resource pool architecture.
If the amount of memory is within the session and resource pool limits, but there is currently not enough free memory to run the query, Vertica queues it until enough resources become available.
If there are enough free resources to run the query, Vertica executes it.
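The three outcomes above can be summarized as a small decision function. This is a simplified sketch of the rules as stated here, not Vertica's resource manager, which takes more factors into account. The Admission enum, the decide function, and its parameter names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical names for the three outcomes described in the text.
enum class Admission { Reject, Queue, Run };

// requiredKB:   memory the query needs, including what its UDxs report
// sessionCapKB: the session's MEMORYCAP
// poolMaxKB:    the resource pool's MAXMEMORYSIZE
// freeKB:       memory currently free in the pool
inline Admission decide(std::int64_t requiredKB, std::int64_t sessionCapKB,
                        std::int64_t poolMaxKB, std::int64_t freeKB) {
    // A request larger than either limit can never be satisfied: reject it.
    if (requiredKB > sessionCapKB || requiredKB > poolMaxKB) {
        return Admission::Reject;
    }
    // Within the limits, but not enough memory is free right now: wait.
    if (requiredKB > freeKB) {
        return Admission::Queue;
    }
    // Enough memory is free: execute the query.
    return Admission::Run;
}
```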
Vertica has no way to determine the amount of resources a UDx requires other than the values it reports using the getPerInstanceResources() function. A UDx could use more resources than it claims, which could cause performance issues for other queries that are denied resources. You can set an absolute limit on the amount of memory UDxs can allocate. See Setting memory limits for fenced-mode UDxs for more information.
If the process executing your UDx attempts to allocate more memory than the limit set by the FencedUDxMemoryLimitMB configuration parameter, it receives a bad_alloc exception. For more information about FencedUDxMemoryLimitMB, see Setting memory limits for fenced-mode UDxs.
Below is the output of loading a UDSF that consumes 500MB of memory, then changing the memory settings to cause out-of-memory errors. The MemoryAllocationExample UDSF in the following example is just the Add2Ints UDSF example altered as shown in Allocating resources for UDxs and Informing Vertica of resource requirements to allocate 500MB of RAM.
=> CREATE LIBRARY mylib AS '/home/dbadmin/';
=> CREATE FUNCTION usemem AS NAME 'MemoryAllocationExampleFactory' LIBRARY mylib;
=> SELECT usemem(1,2);
(1 row)
The following statements demonstrate setting the session's MEMORYCAP to lower than the amount of memory that the UDSF reports it uses. This causes Vertica to return an error before it executes the UDSF.
=> SELECT usemem(1,2);
ERROR 3596: Insufficient resources to execute plan on pool sysquery
[Request exceeds session memory cap: 520328KB > 102400KB]
The resource pool can also prevent a UDx from running if it requires more memory than is available in the pool. The following statements demonstrate the effect of creating and using a resource pool that has too little memory for the UDSF to run. Similar to the session's MEMORYCAP limit, the pool's MAXMEMORYSIZE setting prevents Vertica from executing the query containing the UDSF.
=> CREATE TABLE ExampleTable(a int, b int);
=> INSERT /*+direct*/ INTO ExampleTable VALUES (1,2);
(1 row)
=> SELECT usemem(a, b) FROM ExampleTable;
ERROR 3596: Insufficient resources to execute plan on pool small
[Request Too Large:Memory(KB) Exceeded: Requested = 523136, Free = 102400 (Limit = 102400, Used = 0)]
=> DROP RESOURCE POOL small; --Dropping the pool resets the session's pool
Finally, setting the FencedUDxMemoryLimitMB configuration parameter lower than the amount of memory the UDx actually allocates results in the UDx throwing an exception. This case differs from the previous two examples because the query actually executes. The UDx's code needs to catch and handle the exception. In this example, it uses the vt_report_error macro to report the error back to Vertica and exit.
=> SELECT usemem(1,2);
ERROR 3412: Failure in UDx RPC call InvokeSetup(): Error calling setup() in
User Defined Object [usemem] at [MemoryAllocationExample.cpp:32], error code:
1, message: Couldn't allocate memory :[std::bad_alloc]
=> SELECT usemem(1,2);
(1 row)