Setting up a Python development environment
To avoid problems when loading and executing your UDxs, develop your UDxs using the same version of Python that Vertica uses. To do this without changing your environment for projects that might require other Python versions, you can use a Python virtual environment (venv). You can install libraries that your UDx depends on into your venv and use that path when you create your UDx library with CREATE LIBRARY.
Setting up venv
Set up venv using the Python version bundled with Vertica. If you have direct access to a database node, you can use that Python binary directly to create your venv:
$ /opt/vertica/sbin/python3 -m venv /path/to/new/environment
The result is a directory with a default environment, including a site-packages directory:
$ ls venv/lib/
python3.9
$ ls venv/lib/python3.9/
site-packages
If your UDx depends on libraries that are not packaged with Vertica, install them into this directory:
$ source venv/bin/activate
(venv) $ pip install numpy
...
The lib/python3.9/site-packages directory now contains the installed library. The change affects only your virtual environment.
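Before installing more packages, it can help to confirm that the active interpreter is the one you expect and to see which site-packages directory it uses. The following is a minimal sketch that relies only on the Python standard library. The function name describe_environment is hypothetical, and the expected version (3, 9) is an assumption inferred from the python3.9 directory shown above; adjust it to match your Vertica installation.

```python
import sys
import sysconfig

# Assumption: the Vertica-bundled interpreter is Python 3.9, as the
# lib/python3.9 directory suggests. Change this for your installation.
EXPECTED_VERSION = (3, 9)


def describe_environment(expected=EXPECTED_VERSION):
    """Return facts about the running interpreter that matter for UDx development."""
    version = sys.version_info[:2]
    return {
        # (major, minor) of the interpreter running this script
        "version": version,
        # True when the interpreter matches the version you expect Vertica to use
        "version_matches": version == tuple(expected),
        # Inside a venv, sys.prefix points at the venv and differs from sys.base_prefix
        "in_venv": sys.prefix != sys.base_prefix,
        # Directory where pip installs pure-Python packages for this interpreter
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run the script with the venv activated. If version_matches or in_venv is False, you are probably using a different interpreter than the one you created the venv with.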
UDx imports
In addition to any libraries you add, your UDx code must import the vertica_sdk library:
# always required:
import vertica_sdk
# other libs:
import numpy as np
# ...
The vertica_sdk library is included as a part of the Vertica server. You do not need to add it to site-packages or declare it as a dependency.
Deployment
For libraries you add, you must declare dependencies when using CREATE LIBRARY. This declaration allows Vertica to find the libraries and distribute them to all database nodes. You can supply a path instead of enumerating the libraries:
=> CREATE OR REPLACE LIBRARY pylib AS
'/path/to/udx/add2ints.py'
DEPENDS '/path/to/new/environment/lib/python3.9/site-packages/*'
LANGUAGE 'Python';
=> CREATE OR REPLACE FUNCTION add2ints AS LANGUAGE 'Python'
NAME 'add2ints_factory' LIBRARY pylib;
CREATE LIBRARY copies the UDx and the contents of the DEPENDS path and stores them with the database. Vertica then distributes copies to all database nodes.
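If you maintain several UDx libraries, you might prefer to derive the DEPENDS path from the venv location instead of typing it by hand. The following is a minimal sketch, not part of Vertica: the helper name create_library_sql and its parameters are hypothetical, it assumes the lib/python3.9/site-packages layout shown earlier, and it does no quoting or validation of its inputs.

```python
from pathlib import PurePosixPath


def create_library_sql(lib_name, udx_path, venv_path, python_dir="python3.9"):
    """Build a CREATE LIBRARY statement whose DEPENDS clause points at a venv.

    python_dir is an assumption: it must match the directory name under
    <venv>/lib/ for the interpreter that created the venv.
    """
    site_packages = PurePosixPath(venv_path) / "lib" / python_dir / "site-packages"
    return (
        f"CREATE OR REPLACE LIBRARY {lib_name} AS\n"
        f"'{udx_path}'\n"
        # The trailing /* supplies a path rather than enumerating each library
        f"DEPENDS '{site_packages}/*'\n"
        f"LANGUAGE 'Python';"
    )


if __name__ == "__main__":
    print(create_library_sql(
        "pylib",
        "/path/to/udx/add2ints.py",
        "/path/to/new/environment",
    ))
```

The helper only produces text. Review the generated statement before you run it in vsql or another client.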