Python SDK

The OpenText™ Analytics Database SDK supports writing UDxs of some types in Python 3.

The Python SDK does not require any additional system configuration or header files. This low overhead allows you to develop and deploy new capabilities to your database cluster in a short amount of time.

The following workflow is typical for the Python SDK:

Because Python is an interpreted language, you do not have to compile your program before loading the UDx into the database. However, expect to debug your code after you create your function and begin testing it in the database.

When the database calls your UDx, it starts a side process that manages the interaction between the server and the Python interpreter.

This section covers Python-specific topics that apply to all UDx types. For information that applies to all languages, see the following topics: Arguments and return values; UDx parameters; Errors, warnings, and logging; Handling cancel requests; and the sections for specific UDx types. For full API documentation, see the Python SDK.

1 - Setting up a Python development environment

To avoid problems when loading and executing your UDxs, develop your UDxs using the same version of Python that OpenText™ Analytics Database uses. To do this without changing your environment for projects that might require other Python versions, you can use a Python virtual environment (venv). You can install libraries that your UDx depends on into your venv and use that path when you create your UDx library with CREATE LIBRARY.

Setting up venv

Set up venv using the Python version bundled with OpenText™ Analytics Database. If you have direct access to a database node, you can use that Python binary directly to create your venv:

$ /opt/vertica/sbin/python3 -m venv /path/to/new/environment

The result is a directory with a default environment, including a site-packages directory. The following listing assumes the environment was created in a directory named venv:

$ ls venv/lib/
python3.9
$ ls venv/lib/python3.9/
site-packages

If your UDx depends on libraries that are not packaged with the database, install them into this directory:

$ source venv/bin/activate
(venv) $ pip install numpy
...

The lib/python3.9/site-packages directory now contains the installed library. The change affects only your virtual environment.
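Before installing libraries, it can help to confirm that the interpreter inside the venv matches the version the database uses. The following is a minimal sketch, not part of the SDK: the expected version (3, 9) is an assumption inferred from the python3.9 directory name in the listing above, and the helper name matches_expected is hypothetical.

```python
import sys

# Assumption: the database bundles Python 3.9, inferred from the
# python3.9 directory in the venv listing. Adjust for your installation.
EXPECTED_VERSION = (3, 9)


def matches_expected(version_info=sys.version_info, expected=EXPECTED_VERSION):
    """Return True if the major.minor part of version_info equals expected."""
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    if matches_expected():
        print("Interpreter version matches the expected version.")
    else:
        # Report the mismatch so it can be fixed before the UDx is deployed.
        print(
            f"Warning: running Python {sys.version_info[0]}.{sys.version_info[1]}, "
            f"expected {EXPECTED_VERSION[0]}.{EXPECTED_VERSION[1]}"
        )
```

Running this script with the venv activated reports whether the development interpreter and the assumed database version agree.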

UDx imports

In addition to any libraries you add, your UDx code must import the vertica_sdk library:

# always required:
import vertica_sdk
# other libs:
import numpy as np
# ...

The vertica_sdk library is included as a part of the database server. You do not need to add it to site-packages or declare it as a dependency.
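To show where the import fits, here is a minimal sketch of a scalar UDx that adds two integers. The names add2ints and add2ints_factory match the CREATE FUNCTION example in the Deployment section. The class and method names (ScalarFunction, ScalarFunctionFactory, processBlock, getPrototype, getReturnType, and the reader and writer calls) reflect the SDK's scalar-function interface as commonly documented, so verify them against the Python SDK API documentation. The ImportError fallback and the FakeReader and FakeWriter classes are hypothetical stand-ins, included only so that the logic can be exercised outside the database.

```python
try:
    import vertica_sdk  # available only inside the database's UDx process
except ImportError:
    # Local stand-in (assumption) so the logic can run outside the
    # database. These placeholders are NOT the real SDK classes.
    import types
    vertica_sdk = types.SimpleNamespace(
        ScalarFunction=object, ScalarFunctionFactory=object
    )


class add2ints(vertica_sdk.ScalarFunction):
    """Adds the two integer arguments of every input row."""

    def processBlock(self, server_interface, arg_reader, res_writer):
        # Process the current row, then advance until the reader is exhausted.
        while True:
            res_writer.setInt(arg_reader.getInt(0) + arg_reader.getInt(1))
            res_writer.next()
            if not arg_reader.next():
                break


class add2ints_factory(vertica_sdk.ScalarFunctionFactory):
    def createScalarFunction(self, srv):
        return add2ints()

    def getPrototype(self, srv_interface, arg_types, return_type):
        # Two INTEGER arguments, one INTEGER result.
        arg_types.addInt()
        arg_types.addInt()
        return_type.addInt()

    def getReturnType(self, srv_interface, arg_types, return_type):
        return_type.addInt()


class FakeReader:
    """Hypothetical test double that mimics the reader calls used above."""

    def __init__(self, rows):
        self._rows, self._i = rows, 0

    def getInt(self, col):
        return self._rows[self._i][col]

    def next(self):
        self._i += 1
        return self._i < len(self._rows)


class FakeWriter:
    """Hypothetical test double that collects written values."""

    def __init__(self):
        self.values = []

    def setInt(self, value):
        self.values.append(value)

    def next(self):
        pass
```

Outside the database, you can call processBlock with a FakeReader and a FakeWriter to check the arithmetic before you load the library. Inside the database, the real vertica_sdk import succeeds and the stand-ins are never used.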

Deployment

For libraries you add, you must declare dependencies when using CREATE LIBRARY. This declaration allows the database to find the libraries and distribute them to all database nodes. You can supply a path instead of enumerating the libraries:

=> CREATE OR REPLACE LIBRARY pylib AS
   '/path/to/udx/add2ints.py'
   DEPENDS '/path/to/new/environment/lib/python3.9/site-packages/*'
   LANGUAGE 'Python';

=> CREATE OR REPLACE FUNCTION add2ints AS LANGUAGE 'Python'
   NAME 'add2ints_factory' LIBRARY pylib;

CREATE LIBRARY copies the UDx and the contents of the DEPENDS path and stores them with the database. The database then distributes copies to all database nodes.
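If you script deployments, the DEPENDS path can be derived from the venv root rather than typed by hand. The sketch below only renders the statement text; it does not connect to a database. The helper names sql_quote and create_library_sql are hypothetical, and the default python3.9 directory name is an assumption taken from the example above.

```python
from pathlib import PurePosixPath


def sql_quote(text):
    """Wrap text in single quotes, doubling any embedded single quotes."""
    return "'" + str(text).replace("'", "''") + "'"


def create_library_sql(library, udx_file, venv_root, python_dir="python3.9"):
    """Render a CREATE LIBRARY statement whose DEPENDS clause points at
    the venv's site-packages directory (wildcard form)."""
    depends = PurePosixPath(venv_root) / "lib" / python_dir / "site-packages" / "*"
    return (
        f"CREATE OR REPLACE LIBRARY {library} AS\n"
        f"   {sql_quote(udx_file)}\n"
        f"   DEPENDS {sql_quote(depends)}\n"
        f"   LANGUAGE 'Python';"
    )


if __name__ == "__main__":
    print(create_library_sql(
        "pylib", "/path/to/udx/add2ints.py", "/path/to/new/environment"
    ))
```

The library name is interpolated without quoting, so pass only trusted identifiers to this helper.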

2 - Python and OpenText Analytics Database data types

The OpenText™ Analytics Database Python SDK converts native database data types into the appropriate Python data types. The following table describes some of the data type conversions. Consult the Python SDK for a complete list, as well as lists of helper functions to convert and manipulate these data types.

For information about SDK support for complex data types, see Complex Types as Arguments and Return Values.

OpenText™ Analytics Database Data Type | Python Data Type
-------------------------------------- | ----------------
INTEGER | int
FLOAT | float
NUMERIC | decimal.Decimal
DATE | datetime.date
CHAR, VARCHAR, LONG VARCHAR | string (UTF-8 encoded)
BINARY, VARBINARY, LONG VARBINARY | binary
TIMESTAMP | datetime.datetime
TIME | datetime.time
ARRAY | list (Note: Nested ARRAY types are also converted into lists.)
ROW | collections.OrderedDict (Note: Nested ROW types are also converted into collections.OrderedDicts.)
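To make the conversions concrete, the following sketch builds a Python value shaped the way a nested ROW might arrive in a UDx according to the table above, then flattens it. It uses only the Python standard library and makes no SDK calls. The sample data and the flatten helper are hypothetical.

```python
import datetime
import decimal
from collections import OrderedDict

# Hypothetical ROW value, using the Python types listed in the table.
order = OrderedDict([
    ("id", 42),                                # INTEGER -> int
    ("total", decimal.Decimal("19.99")),       # NUMERIC -> decimal.Decimal
    ("placed", datetime.date(2024, 1, 15)),    # DATE -> datetime.date
    ("tags", ["gift", "rush"]),                # ARRAY -> list
    ("ship_to", OrderedDict([                  # nested ROW -> OrderedDict
        ("city", "Oslo"),
        ("zip", "0150"),
    ])),
])


def flatten(row, prefix=""):
    """Flatten nested ROW values into dotted keys; lists are left intact."""
    flat = OrderedDict()
    for key, value in row.items():
        name = f"{prefix}{key}"
        if isinstance(value, OrderedDict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


if __name__ == "__main__":
    for name, value in flatten(order).items():
        print(name, type(value).__name__, value)
```

Because NUMERIC values arrive as decimal.Decimal, arithmetic on them stays exact. Avoid converting them to float unless your UDx tolerates rounding.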