C++ example: Rank

The Rank analytic function ranks rows based on how they are ordered. A Java version of this UDx is included in /opt/vertica/sdk/examples.

Loading and using the example

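The following example shows how to load the function into Vertica. It assumes that the AnalyticFunctions.so library that contains the function has been copied to the database administrator user's home directory on the initiator node.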

=> CREATE LIBRARY AnalyticFunctions AS '/home/dbadmin/AnalyticFunctions.so';
CREATE LIBRARY
=> CREATE ANALYTIC FUNCTION an_rank AS LANGUAGE 'C++'
   NAME 'RankFactory' LIBRARY AnalyticFunctions;
CREATE ANALYTIC FUNCTION

The following example uses this rank function, which is named an_rank:

=> SELECT * FROM hits;
      site       |    date    | num_hits
-----------------+------------+----------
 www.example.com | 2012-01-02 |       97
 www.vertica.com | 2012-01-01 |   343435
 www.example.com | 2012-01-01 |      123
 www.example.com | 2012-01-04 |      112
 www.vertica.com | 2012-01-02 |   503695
 www.vertica.com | 2012-01-03 |   490387
 www.example.com | 2012-01-03 |      123
(7 rows)
=> SELECT site,date,num_hits,an_rank()
   OVER (PARTITION BY site ORDER BY num_hits DESC)
   AS an_rank FROM hits;
      site       |    date    | num_hits | an_rank
-----------------+------------+----------+---------
 www.example.com | 2012-01-03 |      123 |       1
 www.example.com | 2012-01-01 |      123 |       1
 www.example.com | 2012-01-04 |      112 |       3
 www.example.com | 2012-01-02 |       97 |       4
 www.vertica.com | 2012-01-02 |   503695 |       1
 www.vertica.com | 2012-01-03 |   490387 |       2
 www.vertica.com | 2012-01-01 |   343435 |       3
(7 rows)

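As with the built-in RANK analytic function, rows that have the same value in the ORDER BY column (num_hits in this example) share the same rank. The rank still advances for every row, however, so the next row with a different ORDER BY key gets a rank based on the number of rows that precede it. This is why the www.example.com partition has ranks 1, 1, 3, and 4, with no rank 2.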
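The tie-handling rule can be tried outside Vertica. The following standalone C++ is a minimal sketch, not part of the SDK: the function name rankSorted is hypothetical, and it assumes the values of one partition are already sorted in the order that the ORDER BY clause requests.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the Vertica SDK): assigns RANK-style
// values to one partition whose values are already in ORDER BY order.
std::vector<long> rankSorted(const std::vector<long> &sortedValues)
{
    std::vector<long> ranks;
    ranks.reserve(sortedValues.size());
    long rank = 1;
    for (std::size_t i = 0; i < sortedValues.size(); ++i) {
        // A row whose value differs from the previous row starts a new
        // ORDER BY key, so its rank is its 1-based position in the partition.
        if (i > 0 && sortedValues[i] != sortedValues[i - 1]) {
            rank = static_cast<long>(i) + 1;
        }
        // Rows that tie with the previous row keep the previous rank.
        ranks.push_back(rank);
    }
    return ranks;
}
```

For the www.example.com partition shown earlier, calling rankSorted({123, 123, 112, 97}) gives the ranks 1, 1, 3, 4, which matches the an_rank column.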

AnalyticFunction implementation

The following code defines an AnalyticFunction subclass named Rank. It is based on the example code distributed in the examples directory of the SDK.

/**
 * User-defined analytic function: Rank - works mostly the same as SQL-99 rank
 * with the ability to define as many order by columns as desired
 *
 */
class Rank : public AnalyticFunction
{
    virtual void processPartition(ServerInterface &srvInterface,
                                  AnalyticPartitionReader &inputReader,
                                  AnalyticPartitionWriter &outputWriter)
    {
        // Always use a top-level try-catch block to prevent exceptions from
        // leaking back to Vertica or the fenced-mode side process.
        try {
            rank = 1; // The rank to assign a row
            rowCount = 0; // Number of rows processed so far
            do {
                rowCount++;
                // Do we have a new order by row?
                if (inputReader.isNewOrderByKey()) {
                    // Yes, so set rank to the total number of rows that have been
                    // processed. Otherwise, the rank remains the same value as
                    // the previous iteration.
                    rank = rowCount;
                }
                // Write the rank
                outputWriter.setInt(0, rank);
                // Move to the next row of the output
                outputWriter.next();
            } while (inputReader.next()); // Loop until no more input
        } catch (const std::exception &e) {
            // Standard exception. Quit.
            vt_report_error(0, "Exception while processing partition: %s", e.what());
        }
    }
private:
    vint rank, rowCount;
};

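In this example, the processPartition() method never reads column data from the input rows; it only advances through them. All it needs is a count of the rows processed so far and whether the current row has the same ORDER BY key as the previous row. If the current row starts a new ORDER BY key, the rank is set to the total number of rows processed. If the current row has the same ORDER BY value as the previous row, the rank stays the same.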
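To show that the loop needs no column data, the sketch below replays the same do-while logic against a mock reader. It is an illustration only: MockReader and rankPartition are hypothetical names, the mock imitates just the two AnalyticPartitionReader calls used in processPartition() (isNewOrderByKey() and next()), and it assumes the partition contains at least one row.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the two reader calls used by processPartition().
// Each flag says whether the corresponding row starts a new ORDER BY key.
class MockReader
{
public:
    explicit MockReader(std::vector<bool> newKeyFlags)
        : flags(std::move(newKeyFlags)), pos(0) {}

    bool isNewOrderByKey() const { return flags[pos]; }

    // Advance to the next row; returns false when the partition is exhausted.
    bool next() { return ++pos < flags.size(); }

private:
    std::vector<bool> flags;
    std::size_t pos;
};

// Same control flow as Rank::processPartition(), but the ranks are collected
// in a vector instead of being written to an AnalyticPartitionWriter.
// Assumes the reader is positioned on the first row of a non-empty partition.
std::vector<long> rankPartition(MockReader &reader)
{
    std::vector<long> ranks;
    long rank = 1;      // The rank to assign to the current row
    long rowCount = 0;  // Number of rows processed so far
    do {
        rowCount++;
        if (reader.isNewOrderByKey()) {
            rank = rowCount;
        }
        ranks.push_back(rank);
    } while (reader.next());
    return ranks;
}
```

Feeding the mock the flags true, false, true, true (the pattern of the www.example.com partition) produces the ranks 1, 1, 3, 4 without the loop ever inspecting num_hits.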

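Note that this function has a top-level try-catch block. Every UDx function should have one, so that stray exceptions are not passed back to Vertica (if the function runs in unfenced mode) or to the side process.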
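The guard pattern itself can be sketched in plain C++. The wrapper below is hypothetical and not part of the SDK: runGuarded is an illustrative name, and it returns an error string where a real UDx would call vt_report_error(). It also adds a catch-all clause, which is an assumption beyond the Rank example, to show how exceptions that do not derive from std::exception could be contained as well.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: runs the per-partition work and converts any
// exception into an error message instead of letting it escape the caller.
// An empty return value means the work completed without an exception.
std::string runGuarded(const std::function<void()> &work)
{
    try {
        work();
        return "";
    } catch (const std::exception &e) {
        // Standard exception: report it with its message.
        return std::string("Exception while processing partition: ") + e.what();
    } catch (...) {
        // Anything else: there is no message to report, but it is still contained.
        return "Unknown exception while processing partition";
    }
}
```

Catching by const reference avoids slicing the exception object, and placing the try block around the whole body means that code added to the function later is covered automatically.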

AnalyticFunctionFactory implementation

The following code defines the AnalyticFunctionFactory that corresponds to the Rank analytic function.

class RankFactory : public AnalyticFunctionFactory
{
    virtual void getPrototype(ServerInterface &srvInterface,
                                ColumnTypes &argTypes, ColumnTypes &returnType)
    {
        returnType.addInt();
    }
    virtual void getReturnType(ServerInterface &srvInterface,
                               const SizedColumnTypes &inputTypes,
                               SizedColumnTypes &outputTypes)
    {
        outputTypes.addInt();
    }
    virtual AnalyticFunction *createAnalyticFunction(ServerInterface
                                                        &srvInterface)
    { return vt_createFuncObj(srvInterface.allocator, Rank); }
};

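The first method defined by the RankFactory subclass, getPrototype(), sets the data type of the return value. Because the Rank UDAnF does not read its input, it does not define any arguments: it calls no methods on the ColumnTypes object passed in the argTypes parameter.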

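The next method is getReturnType(). If your function returns a data type that needs a defined width or precision, your implementation of getReturnType() calls a method on the SizedColumnTypes object passed in as a parameter to tell Vertica that width or precision. Rank returns a fixed-width data type (INTEGER), so it does not need to set the precision or width of its output; it just calls addInt() to report its output data type.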

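Finally, RankFactory defines the createAnalyticFunction() method, which returns an instance of the AnalyticFunction class that Vertica can call. This code is mostly boilerplate. All you need to do is add the name of your analytic function class to the call to vt_createFuncObj(), which takes care of allocating the object for you.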