Batch inserts load data into the database in chunks. Breaking the data into batches lets you monitor the progress of the load, because you can check for rejected rows after each batch is loaded. To perform a batch load through ODBC, you typically use a prepared statement whose parameters are bound to arrays that contain the data to be loaded. For each batch, you load a new set of data into the arrays, then execute the prepared statement.
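As an illustration of the array-based approach described above, the following C sketch shows one way to lay out column-wise arrays and refill them for each batch. It deliberately contains no ODBC calls, so it compiles without a driver. The names `Customer`, `BatchArrays`, and `fill_batch`, along with the sizes, are assumptions made for this example. In a real application you would bind each array to a parameter of the prepared statement (for example with `SQLBindParameter()`) and execute the statement after each refill.

```c
#include <stddef.h>
#include <stdio.h>

#define BATCH_SIZE 4   /* rows per batch (hypothetical value) */
#define NAME_LEN   50  /* bytes per name, including the terminator */
#define PHONE_LEN  15  /* bytes per phone number, including the terminator */

/* One source record; a stand-in for wherever the application's data lives. */
typedef struct {
    long        id;
    const char *name;
    const char *phone;
} Customer;

/* Column-wise parameter arrays: one array per column, one slot per row.
 * In a real ODBC program each array would be bound once to a parameter
 * of the prepared INSERT and then refilled before every execute. */
typedef struct {
    long custIDs[BATCH_SIZE];
    char custNames[BATCH_SIZE][NAME_LEN];
    char phoneNums[BATCH_SIZE][PHONE_LEN];
} BatchArrays;

/* Copy up to BATCH_SIZE records, starting at `offset`, into the arrays.
 * Returns the number of rows placed in the batch (0 when no data is left). */
size_t fill_batch(BatchArrays *batch, const Customer *src,
                  size_t total, size_t offset)
{
    size_t n = 0;
    while (n < BATCH_SIZE && offset + n < total) {
        const Customer *c = &src[offset + n];
        batch->custIDs[n] = c->id;
        /* snprintf truncates if needed and always null-terminates. */
        snprintf(batch->custNames[n], NAME_LEN, "%s", c->name);
        snprintf(batch->phoneNums[n], PHONE_LEN, "%s", c->phone);
        n++;
    }
    return n;
}
```

The return value tells the caller how many rows of the arrays are valid, which matters for a final batch that is smaller than `BATCH_SIZE`.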
When you perform a batch load, Vertica uses a COPY statement to load the data. Each additional batch you load uses the same COPY statement. The statement remains open until you end the transaction, close the cursor for the statement, or execute a non-INSERT statement.
Using a single COPY statement for multiple batches improves batch loading efficiency by:
- reducing the overhead of inserting individual batches
- combining individual batches into larger ROS containers
Note
If the database connection has AUTOCOMMIT enabled, the transaction is automatically committed after each batch insert statement, which closes the COPY statement. Leaving AUTOCOMMIT enabled makes your batch load much less efficient, and can cause added overhead in your database as all of the smaller loads are consolidated.
Even though Vertica uses a single COPY statement to insert multiple batches within a transaction, you can locate which (if any) rows were rejected due to invalid row formats or data type issues after each batch is loaded. See Tracking load status (ODBC) for details.
Note
While you can find rejected rows during the batch load transaction, other types of errors (such as running out of disk space or a node shutdown that makes the database unsafe) are only reported when the COPY statement ends.
Since the batch loads share a COPY statement, errors in one batch can cause earlier batches in the same transaction to be rolled back.
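To make the per-batch check for rejected rows more concrete, here is a minimal C sketch. It assumes the application has asked the driver to fill in a per-row status array for each batch (in standard ODBC, this is done with the `SQL_ATTR_PARAM_STATUS_PTR` and `SQL_ATTR_PARAMS_PROCESSED_PTR` statement attributes). The `RowStatus` type, the `ROW_*` constants, and `find_rejected_rows` are hypothetical names used so that the example compiles without ODBC headers.

```c
#include <stddef.h>

/* Stand-ins for the ODBC per-row status codes. Real code would include
 * <sqlext.h> and compare against SQL_PARAM_SUCCESS and
 * SQL_PARAM_SUCCESS_WITH_INFO instead of these hypothetical names. */
typedef unsigned short RowStatus;
enum { ROW_ACCEPTED = 0, ROW_ACCEPTED_WITH_INFO = 1, ROW_REJECTED = 2 };

/* Scan the status array filled in for one batch. Writes the index of each
 * rejected row into `rejected` (capacity `max_rejected`) and returns the
 * total number of rejected rows, which can exceed `max_rejected`. */
size_t find_rejected_rows(const RowStatus *status, size_t rows_processed,
                          size_t *rejected, size_t max_rejected)
{
    size_t count = 0;
    for (size_t i = 0; i < rows_processed; i++) {
        if (status[i] == ROW_ACCEPTED || status[i] == ROW_ACCEPTED_WITH_INFO)
            continue;               /* row was loaded */
        if (count < max_rejected)
            rejected[count] = i;    /* remember position within the batch */
        count++;
    }
    return count;
}
```

The indexes are relative to the current batch, so an application that wants positions in the overall data set would add the batch's starting offset.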
Batch insert steps
To perform an ODBC batch insert, your application takes the following steps:
- Connect to the database.
- Disable autocommit for the connection.
- Create a prepared statement that inserts the data you want to load.
- Bind the parameters of the prepared statement to arrays that will contain the data you want to load.
- Populate the arrays with the data for your batches.
- Execute the prepared statement.
- Optionally, check the results of the batch load to find rejected rows.
- Repeat the previous three steps until all of the data you want to load is loaded.
- Commit the transaction.
- Optionally, check the results of the entire batch transaction.
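The control flow of the populate, execute, and commit steps above can be sketched in plain C as follows. This is only an outline of the loop, not a working ODBC client: the `BatchOps` callbacks are hypothetical stand-ins for the code that would refill the bound arrays, call `SQLExecute()`, and end the transaction with `SQLEndTran()`. The `LoadLog` recorder is likewise an assumption, included only so the loop can be exercised without a database.

```c
#include <stddef.h>

/* Hypothetical callbacks standing in for the ODBC work of a real program:
 * `execute` fills the bound arrays and runs the prepared statement for
 * rows [offset, offset + count); `commit` ends the transaction. */
typedef struct {
    int  (*execute)(void *ctx, size_t offset, size_t count); /* 0 = success */
    int  (*commit)(void *ctx);                               /* 0 = success */
    void *ctx;
} BatchOps;

/* Load `total` rows in batches of `batch_size`, committing once at the end.
 * Returns the number of batches executed, or -1 if a batch or the commit
 * fails. Nothing is committed after a failed batch. */
long load_in_batches(const BatchOps *ops, size_t total, size_t batch_size)
{
    long batches = 0;
    if (batch_size == 0)
        return -1;
    for (size_t offset = 0; offset < total; offset += batch_size) {
        size_t remaining = total - offset;
        size_t count = remaining < batch_size ? remaining : batch_size;
        if (ops->execute(ops->ctx, offset, count) != 0)
            return -1;              /* stop without committing */
        batches++;
    }
    /* One commit for the whole load, so all batches share a transaction. */
    return ops->commit(ops->ctx) == 0 ? batches : -1;
}

/* Recorder used to exercise the loop; it only counts what was requested. */
typedef struct {
    size_t rows_sent;
    int    commits;
    size_t fail_at_offset;          /* simulate a failing batch here */
} LoadLog;

int log_execute(void *ctx, size_t offset, size_t count)
{
    LoadLog *log_state = ctx;
    if (offset == log_state->fail_at_offset)
        return 1;
    log_state->rows_sent += count;
    return 0;
}

int log_commit(void *ctx)
{
    LoadLog *log_state = ctx;
    log_state->commits++;
    return 0;
}
```

Committing once after the loop, rather than after each batch, mirrors the advice to disable autocommit so that every batch shares a single COPY statement.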
A simplified example program that performs the above steps, loading five batches of four rows each, produces the following output:
Allocated an environment handle.
Set application to ODBC 3.
Allocated Database handle.
Connecting to database.
Connected to database.
Creating prepared statement
Created prepared statement.
Bound CustIDs array to prepared statement
Bound CustNames array to prepared statement
Bound phoneNums array to prepared statement
Adding Batch #1...successful.
Adding Batch #2...successful.
Adding Batch #3...successful.
Adding Batch #4...successful.
Adding Batch #5...successful.
Committing transaction
Committed transaction
Free handles.
The resulting table looks like this:
=> SELECT * FROM customers;
 CustID |  CustName   |  Phone_Number
--------+-------------+-----------------
    100 | Allen, Anna | 1-617-555-1234
    101 | Brown, Bill | 1-781-555-1212
    102 | Chu, Cindy  | 1-508-555-4321
    103 | Dodd, Don   | 1-617-555-4444
    100 | Allen, Anna | 1-617-555-1234
    101 | Brown, Bill | 1-781-555-1212
    102 | Chu, Cindy  | 1-508-555-4321
    103 | Dodd, Don   | 1-617-555-4444
    100 | Allen, Anna | 1-617-555-1234
    101 | Brown, Bill | 1-781-555-1212
    102 | Chu, Cindy  | 1-508-555-4321
    103 | Dodd, Don   | 1-617-555-4444
    100 | Allen, Anna | 1-617-555-1234
    101 | Brown, Bill | 1-781-555-1212
    102 | Chu, Cindy  | 1-508-555-4321
    103 | Dodd, Don   | 1-617-555-4444
    100 | Allen, Anna | 1-617-555-1234
    101 | Brown, Bill | 1-781-555-1212
    102 | Chu, Cindy  | 1-508-555-4321
    103 | Dodd, Don   | 1-617-555-4444
(20 rows)
Note
An input parameter bound with the SQL_C_NUMERIC data type uses the default numeric precision (37) and the default scale (0) instead of the precision and scale set by the SQL_NUMERIC_STRUCT input value. This behavior adheres to the ODBC standard. If you do not want to use the default precision and scale, use SQLSetDescField() or SQLSetDescRec() to change them in the statement's attributes.