Perl 数据类型和 Vertica 数据类型之间的转换

Perl 是一种弱类型编程语言,这种语言不为值分配特定的数据类型。这种语言基于对值执行的操作在字符串和数值之间转换。因此,Perl 在从 Vertica 提取大部分字符串和数值数据类型时遇到的问题较小。所有时间间隔数据类型(DATE 和 TIMESTAMP 等)会转换为字符串。可以使用多个不同的 Perl 日期和时间处理模块在脚本中处理这些值。

Vertica NULL 值会转换为 Perl 的未定义的值 (undef)。从包含 NULL 值的列读取数据时,始终应在使用某个值之前测试是否已定义该值。

将数据插入到 Vertica 时,Perl DBI 模块会尝试将数据强制转换为正确格式。默认情况下,该模块假设列值为 VARCHAR,除非它能够确定这些列值为某种其他数据类型。如果提供了一个字符串值以插入到具有整数或数值数据类型的列,DBI 会尝试将该字符串的内容转换为正确的数据类型。如果整个字符串可以转换为适当数据类型的值,该模块会将值插入到列中。否则,插入数据行将失败。

当插入到 FLOAT、NUMERIC 或类似数据类型的列时,DBI 以透明方式将整数值转换为数值或浮点值。仅当不会降低精度(小数点右边的值为 0)时,该模块才会将数值或浮点值转换为整数。例如,该模块可以将值 3.0 插入到 INTEGER 列中,因为将该值转换为整数时不会降低精度。该模块无法将 3.1 插入到 INTEGER 列中,因为这样会降低精度。该模块将返回错误,而非将该值截断为 3。

以下示例演示了 DBI 模块在将数据插入到 Vertica 时执行的一些转换。

#!/usr/bin/perl
use strict;
use DBI;
# Create a hash reference that holds a hash of parameters for the
# connection.
 my $attr = {AutoCommit => 0, # Turn off autocommit
             PrintError => 0   # Turn off print error. Manually handled
             };
# Open a connection using a DSN. Supply the username and password.
my $dbh = DBI->connect("dbi:ODBC:VerticaDSN","ExampleUser","password123",
    $attr);
if (defined DBI::err) {
    # Conection failed.
    die "Failed to connect: $DBI::errstr";
}
print "Connection AutoCommit state is: " . $dbh->{AutoCommit} . "\n";
# Create table to hold inserted data
$dbh->do("DROP TABLE IF EXISTS TEST CASCADE;");
$dbh->do("CREATE TABLE TEST( \
               C_ID  INT, \
               C_FP  FLOAT,\
               C_VARCHAR VARCHAR(100),\
               C_DATE DATE, C_TIME TIME,\
               C_TS TIMESTAMP,\
               C_BOOL BOOL)");
# Populate an array of arrays with values.
my @data = (
                # Start with matching data types
                [1,1.111,'Matching datatypes','2001-01-01','01:01:01'
                    ,'2001-01-01 01:01:01','t'],
                # Force floats -> int and int -> float.
                [2.0,2,"Ints <-> floats",'2002-02-02','02:02:02'
                    ,'2002-02-02 02:02:02',1],
                # Float -> int *only* works when there is no loss of precision.
                # this row will fail to insert:
                [3.1,3,"float -> int with trunc?",'2003-03-03','03:03:03'
                    ,'2003-03-03 03:03:03',1],
                # String values are converted into numbers
                ["4","4.4","Strings -> numbers", '2004-04-04','04:04:04',
                    ,'2004-04-04 04:04:04',0],
                # String -> numbers only works if the entire string can be
                # converted into a number
                ["5 and a half","5.5","Strings -> numbers", '2005-05-05',
                    '05:05:05', ,'2005-05-05 05:05:05',0],
                # Number are converted into string values automatically,
                # assuming they fit into the column width.
                [6,6.6,3.14159, '2006-06-06','06:06:06',
                    ,'2006-06-06 06:06:06',0],
                # There are some variations in the accepted date strings
                [7,7.7,'Date/time formats', '07/07/2007','07:07:07',
                    ,'07-07-2007 07:07:07',1],
             );
# Create a prepared statement to use parameters for inserting values.
my $sth = $dbh->prepare_cached("INSERT into test values(?,?,?,?,?,?,?)");
my $rowcount = 0; # Count # of rows
# Loop through the arrays to insert values
foreach my $tuple (@data) {
    $rowcount++;
    # Insert the row
    my $retval = $sth->execute(@$tuple);

    # See if the row was successfully inserted.
    if ($retval == 1) {
        # Value of 1 means the row was inserted (1 row was affected by insert)
        print "Row $rowcount successfully inserted\n";
    } else {
        print "Inserting row $rowcount failed with error " .
                $sth->state . " " . $sth->errstr . "\n";
    }
}
# Commit the data
$dbh->commit();
# Prepare a query to get the content of the table
$sth = $dbh->prepare("SELECT * FROM TEST ORDER BY C_ID ASC");
$sth->execute() or die "Error: " . $dbh->errstr;
my @row; # Need to pre-declare to use in the format statement.
# Use Perl formats to pretty print the output.
format STDOUT_TOP =
Int  Float          VarChar        Date      Time      Timestamp     Bool
===  =====  ==================  ========== ======== ================ ====
.
format STDOUT =
@>>  @<<<<  @<<<<<<<<<<<<<<<<<  @<<<<<<<<< @<<<<<<< @<<<<<<<<<<<<<<< @<<<<
@row
.
# Loop through result rows while we have them
while (@row = $sth->fetchrow_array()) {
         write; # Format command does the work of extracting the columns from
                 # the array.
}
# Commit to stop Perl complaining about in-progress transactions.
$dbh->commit();
$dbh->disconnect();

运行该示例后,将生成以下输出:

Connection AutoCommit state is: 0
Row 1 successfully inserted
Row 2 successfully inserted
Inserting row 3 failed with error 01000 [Vertica][VerticaDSII] (20) An error
occurred during query execution: Row rejected by server; see server log for
details (SQL-01000)
Row 4 successfully inserted
Inserting row 5 failed with error 01000 [Vertica][VerticaDSII] (20) An error
occurred during query execution: Row rejected by server; see server log for
details (SQL-01000)
Row 6 successfully inserted
Row 7 successfully inserted
Int  Float          VarChar        Date      Time      Timestamp     Bool
===  =====  ==================  ========== ======== ================ ====
  1  1.111  Matching datatypes  2001-01-01 01:01:01 2001-01-01 01:01 1
  2  2      Ints <-> floats     2002-02-02 02:02:02 2002-02-02 02:02 1
  4  4.4    Strings -> numbers  2004-04-04 04:04:04 2004-04-04 04:04 0
  6  6.6    3.14159             2006-06-06 06:06:06 2006-06-06 06:06 0
  7  7.7    Date/time formats   2007-07-07 07:07:07 2007-07-07 07:07 1