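R example: kmeansPoly
The following example shows an implementation of a transform function (UDTF) that performs k-means clustering on one or more input columns.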
kmeansPoly <- function(v.data.frame,v.param.list) {
  # Computes clusters using the kmeans algorithm.
  #
  # Input: A dataframe and a list of parameters.
  # Output: A dataframe with one column that tells the cluster to which each data
  #         point belongs.
  # Args:
  #  v.data.frame: The data from Vertica cast as an R data frame.
  #  v.param.list: List of function parameters.
  #
  # Returns:
  #  The cluster associated with each data point.

  # Ensure k is not null.
  if(!is.null(v.param.list[['k']])) {
    number_of_clusters <- as.numeric(v.param.list[['k']])
  } else {
    stop("k cannot be NULL! Please use a valid value.")
  }

  # Run the kmeans algorithm.
  kmeans_clusters <- kmeans(v.data.frame, number_of_clusters)
  final.output <- data.frame(kmeans_clusters$cluster)
  return(final.output)
}

kmeansFactoryPoly <- function() {
  # This function tells Vertica the name of the R function,
  # and the polymorphic parameters.
  list(name=kmeansPoly, udxtype=c("transform"), intype=c("any"),
       outtype=c("int"), parametertypecallback=kmeansParameters)
}

kmeansParameters <- function() {
  # Callback function for the parameter types.
  function.parameters <- data.frame(datatype=rep(NA, 1), length=rep(NA,1),
                                    scale=rep(NA,1), name=rep(NA,1))
  function.parameters[1,1] = "int"
  function.parameters[1,4] = "k"
  return(function.parameters)
}
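Because the processing function is ordinary R, its logic can be tried outside Vertica before the library is loaded. The following is a minimal sketch under that assumption. It repeats kmeansPoly in condensed form so that the block runs on its own, and it uses R's built-in iris data set and a fixed random seed as stand-ins for the Vertica table; neither is part of the original example.

```r
# Condensed copy of kmeansPoly so that this block is self-contained.
kmeansPoly <- function(v.data.frame, v.param.list) {
  if (is.null(v.param.list[["k"]])) {
    stop("k cannot be NULL! Please use a valid value.")
  }
  number_of_clusters <- as.numeric(v.param.list[["k"]])
  kmeans_clusters <- kmeans(v.data.frame, number_of_clusters)
  data.frame(kmeans_clusters$cluster)
}

# Assumption: the four numeric columns of R's built-in iris data set stand in
# for the input columns that Vertica would pass to the function.
set.seed(42)  # kmeans picks random starting centers; fix the seed for repeatability
local.result <- kmeansPoly(iris[, 1:4], list(k = 3))

# One row per input row, one column holding a cluster id between 1 and k.
print(dim(local.result))
print(table(local.result[[1]]))

# A missing k is rejected with the error message defined in the function.
missing.k <- tryCatch(kmeansPoly(iris[, 1:4], list()),
                      error = function(e) conditionMessage(e))
print(missing.k)
```

This sketch clusters all rows in a single call. A SQL call that uses OVER(PARTITION BY ...) would instead invoke the function separately for each partition.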
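A polymorphic R function declares in its factory function that it accepts any number of arguments by specifying "any" as the argument to the intype parameter and, optionally, the outtype parameter. If you define "any" for intype or outtype, it must be the only type that the function declares for that parameter. You cannot define required arguments and then use "any" to declare the rest of the signature as optional. If your function has requirements for the arguments it accepts, your processing function must enforce them.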
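One way to enforce such requirements is to validate the input at the top of the processing function. The sketch below is illustrative only: the helper name checkKmeansInput and its specific rules (at least one column, all columns numeric, more rows than clusters) are assumptions, not part of the original example.

```r
# Hypothetical helper: validate the data frame that a polymorphic
# processing function receives before passing it to kmeans().
checkKmeansInput <- function(v.data.frame, k) {
  if (ncol(v.data.frame) < 1) {
    stop("At least one input column is required.")
  }
  numeric.columns <- vapply(v.data.frame, is.numeric, logical(1))
  if (!all(numeric.columns)) {
    stop("Non-numeric input column(s): ",
         paste(names(v.data.frame)[!numeric.columns], collapse = ", "))
  }
  if (nrow(v.data.frame) <= k) {
    stop("The number of rows must exceed the number of clusters.")
  }
  invisible(TRUE)
}

# Input that satisfies every rule passes silently.
good <- data.frame(a = c(1, 2, 3, 10, 11, 12), b = c(1, 1, 2, 9, 9, 8))
ok <- checkKmeansInput(good, k = 2)

# Input with a character column is rejected, and the message names the column.
bad <- data.frame(a = c(1, 2, 3), label = c("x", "y", "z"),
                  stringsAsFactors = FALSE)
bad.message <- tryCatch(checkKmeansInput(bad, k = 2),
                        error = function(e) conditionMessage(e))
print(bad.message)
```

A processing function such as kmeansPoly could call a helper like this before running kmeans, so that unsupported input fails with a clear message rather than an error from inside the clustering routine.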
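The outtypecallback method receives the types and number of the arguments that the function was called with, and must indicate the types and number of values that the function returns. The outtypecallback method can also be used to check for unsupported argument types, an unsupported number of arguments, or both. For example, a function might accept at most 10 integers.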
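A callback that enforces a limit of 10 integer arguments might look like the sketch below. It rests on assumptions that should be checked against the Vertica R SDK documentation: that the callback receives one data frame describing the arguments and one describing the parameters, that the argument data frame has a datatype column with one row per argument, and that the return value uses the same datatype/length/scale/name layout as kmeansParameters. The names kmeansReturnType and "cluster" are hypothetical.

```r
# Hypothetical outtypecallback: accept at most 10 integer arguments and
# declare a single integer output column.
kmeansReturnType <- function(arguments.data.frame, parameters.data.frame) {
  # Assumption: one row per argument, with its SQL type in the datatype column.
  argument.types <- as.character(arguments.data.frame$datatype)
  if (length(argument.types) > 10) {
    stop("This function accepts at most 10 arguments.")
  }
  if (!all(tolower(argument.types) == "int")) {
    stop("This function accepts only integer arguments.")
  }
  output.types <- data.frame(datatype = rep(NA, 1), length = rep(NA, 1),
                             scale = rep(NA, 1), name = rep(NA, 1))
  output.types[1, 1] <- "int"
  output.types[1, 4] <- "cluster"
  return(output.types)
}

# Simulated call with three integer arguments.
three.ints <- data.frame(datatype = rep("int", 3), stringsAsFactors = FALSE)
declared <- kmeansReturnType(three.ints, data.frame())
print(declared)

# Simulated call with 11 arguments, which the callback rejects.
too.many <- data.frame(datatype = rep("int", 11), stringsAsFactors = FALSE)
too.many.message <- tryCatch(kmeansReturnType(too.many, data.frame()),
                             error = function(e) conditionMessage(e))
print(too.many.message)
```

To take effect, a callback like this would presumably be registered in the list returned by the factory function, in the same way that kmeansFactoryPoly registers kmeansParameters through parametertypecallback.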
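You assign a SQL name to your polymorphic UDx with the same statements that you use to assign one to a non-polymorphic UDx. The following statements show how to load and call the polymorphic function from the example.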
=> CREATE LIBRARY rlib2 AS '/home/dbadmin/R_UDx/poly_kmeans.R' LANGUAGE 'R';
CREATE LIBRARY
=> CREATE TRANSFORM FUNCTION kmeansPoly AS LANGUAGE 'R' name 'kmeansFactoryPoly' LIBRARY rlib2;
CREATE FUNCTION
=> SELECT spec, kmeansPoly(sl,sw,pl,pw USING PARAMETERS k = 3)
    OVER(PARTITION BY spec) AS Clusters
    FROM iris;
      spec       | Clusters
-----------------+----------
 Iris-setosa     |        1
 Iris-setosa     |        1
 Iris-setosa     |        1
 Iris-setosa     |        1
.
.
.
(150 rows)