Python example: validate_url

The validate_url scalar function reads a string from a table, a URL.

The validate_url scalar function reads a string from a table, a URL. It then validates if the URL is responsive, returning a status code or a string indicating the attempt failed.

You can find more UDx examples in the Vertica Github repository,

UDSF Python code

import vertica_sdk
import urllib.request
import time

class validate_url(vertica_sdk.ScalarFunction):
    """Validates HTTP requests.

    Returns the status code of a webpage. Pages that cannot be accessed return
    "Failed to load page."


    def __init__(self):

    def setup(self, server_interface, col_types):

    def processBlock(self, server_interface, arg_reader, res_writer):
        # Writes a string to the UDx log file.
        server_interface.log("Validating webpage accessibility - UDx")

            url = arg_reader.getString(0)
                status = urllib.request.urlopen(url).getcode()
                # Avoid overwhelming web servers -- be nice.
            except (ValueError, urllib.error.HTTPError, urllib.error.URLError):
                status = 'Failed to load page'
            if not
                # Stop processing when there are no more input rows.

    def destroy(self, server_interface, col_types):

class validate_url_factory(vertica_sdk.ScalarFunctionFactory):

    def createScalarFunction(self, srv):
        return validate_url()

    def getPrototype(self, srv_interface, arg_types, return_type):

    def getReturnType(self, srv_interface, arg_types, return_type):

Load the function and library

Create the library and the function.

=> CREATE OR REPLACE LIBRARY pylib AS 'webpage_tester/' LANGUAGE 'Python';
=> CREATE OR REPLACE FUNCTION validate_url AS LANGUAGE 'Python' NAME 'validate_url_factory' LIBRARY pylib fenced;

Querying data with the function

The following query shows how you can run a query with the UDSF.

=> SELECT url, validate_url(url) AS url_status FROM webpages;
                     url                       |      url_status
-----------------------------------------------+---------------------- | 200                        | 200                      | Failed to load page                           | 200
 http://blah.blah.blah.blah                    | Failed to load page                       | 200
(6 rows)