Using COPY LOCAL to load data in Perl
You can use COPY LOCAL to load delimited files on your client system—for example, a file with comma-separated values—into Vertica. Rather than use Perl to read, parse, and then batch insert the file data, COPY LOCAL directly loads the file data from the local file system into Vertica. When execution completes, COPY LOCAL returns the number of rows that it successfully inserted.
Note
COPY LOCAL must be the first statement in a query,otherwise Vertica returns an error.The following example uses COPY LOCAL to load into Vertica local file data.txt
, which is located in the same directory as the Perl file.
#!/usr/bin/perl
use strict;
use DBI;
# Filesystem path handling module
use File::Spec;
# Create a hash reference that holds a hash of parameters for the
# connection.
my $attr = {AutoCommit => 0}; # Turn off AutoCommit
# Open a connection using a DSN. Supply the username and password.
my $dbh = DBI->connect("dbi:ODBC:VerticaDSN","ExampleUser","password123",
$attr) or die "Failed to connect: $DBI::errstr";
print "Connected!\n";
# Drop any existing table.
$dbh->do("DROP TABLE IF EXISTS Customers CASCADE;");
# Create a table to hold data.
$dbh->do("CREATE TABLE Customers( \
ID INT, \
FirstName VARCHAR(100),\
LastName VARCHAR(100),\
Email VARCHAR(100),\
Birthday DATE)");
# Find the absolute path to the data file located in the current working
# directory and named data.txt
my $currDir = File::Spec->rel2abs(File::Spec->curdir());
my $dataFile = File::Spec->catfile($currDir, 'data.txt');
print "Loading file $dataFile\n";
# Load local file using copy local. Return value is the # of rows affected
# which equates to the number of rows inserted.
my $rows = $dbh->do("COPY Customers FROM LOCAL '$dataFile' DIRECT")
or die $dbh->errstr;
print "Copied $rows rows into database.\n";
$dbh->commit();
# Prepare a query to get the first 15 rows of the results
my $sth = $dbh->prepare("SELECT * FROM Customers WHERE ID < 15 \
ORDER BY ID");
$sth->execute() or die "Error querying table: " . $dbh->errstr;
my @row; # Pre-declare variable to hold result row used in format statement.
# Use Perl formats to pretty print the output. Declare the heading for the
# form.
format STDOUT_TOP =
ID First Last EMail Birthday
== ===== ==== ===== ========
.
# The Perl write statement will output a formatted line with values from the
# @row array. See http://perldoc.perl.org/perlform.html for details.
format STDOUT =
@> @<<<<<<<<<<<<< @<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<
@row
.
# Loop through result rows while we have them
while (@row = $sth->fetchrow_array()) {
write; # Format command does the work of extracting the columns from
# the @row array and writing them out to STDOUT.
}
# Call commit to prevent Perl from complaining about uncommitted transactions
# when disconnecting
$dbh->commit();
$dbh->disconnect();
data.txt
is a text file with a row of data on each line. The columns are delimited by pipe (|
) characters. This is the default COPY delimiter for command accepts, which simplifies the COPY LOCAL statement.
Here is an example of the file content:
1|Georgia|Gomez|Rhiannon@magna.us|1937-10-03
2|Abdul|Alexander|Kathleen@ipsum.gov|1941-03-10
3|Nigel|Contreras|Tanner@et.com|1955-06-01
4|Gray|Holt|Thomas@Integer.us|1945-12-06
5|Candace|Bullock|Scott@vitae.gov|1932-05-27
6|Matthew|Dotson|Keith@Cras.com|1956-09-30
7|Haviva|Hopper|Morgan@porttitor.edu|1975-05-10
8|Stewart|Sweeney|Rhonda@lectus.us|2003-06-20
9|Allen|Rogers|Alexander@enim.gov|2006-06-17
10|Trevor|Dillon|Eagan@id.org|1988-11-27
11|Leroy|Ashley|Carter@turpis.edu|1958-07-25
12|Elmo|Malone|Carla@enim.edu|1978-08-29
13|Laurel|Ball|Zelenia@Integer.us|1989-09-20
14|Zeus|Phillips|Branden@blandit.gov|1996-08-08
15|Alexis|Mclean|Flavia@Suspendisse.org|2008-01-07
The example code produces the following output when run on a large sample file:
Connected!
Loading file /home/dbadmin/Perl/data.txt
Copied 1000000 rows into database.
ID First Last EMail Birthday
== ===== ==== ===== ========
1 Georgia Gomez Rhiannon@magna.us 1937-10-03
2 Abdul Alexander Kathleen@ipsum.gov 1941-03-10
3 Nigel Contreras Tanner@et.com 1955-06-01
4 Gray Holt Thomas@Integer.us 1945-12-06
5 Candace Bullock Scott@vitae.gov 1932-05-27
6 Matthew Dotson Keith@Cras.com 1956-09-30
7 Haviva Hopper Morgan@porttitor.edu 1975-05-10
8 Stewart Sweeney Rhonda@lectus.us 2003-06-20
9 Allen Rogers Alexander@enim.gov 2006-06-17
10 Trevor Dillon Eagan@id.org 1988-11-27
11 Leroy Ashley Carter@turpis.edu 1958-07-25
12 Elmo Malone Carla@enim.edu 1978-08-29
13 Laurel Ball Zelenia@Integer.us 1989-09-20
14 Zeus Phillips Branden@blandit.gov 1996-08-08