Setting the locale and encoding for ODBC sessions
Vertica provides the following methods to set the locale and encoding for an ODBC session:
- Specify the locale for all connections made using the DSN:
  - On Linux and other UNIX-like platforms, see Creating an ODBC DSN for Linux.
  - On Windows platforms, set the locale in the Locale field on the Server Settings tab of the ODBC DSN configuration editor. See Creating an ODBC DSN for Windows clients for detailed information.
- Set the Locale connection parameter in the connection string passed to the SQLDriverConnect() function.
- Use SQLSetConnectAttr() to set the encoding and locale. In general, set the encoding with this function rather than, for example, in the DSN.
  - To set the locale, pass the SQL_ATTR_VERTICA_LOCALE constant and the ICU locale string as the attribute value.
  - To set the encoding, pass the SQL_ATTR_AP_WCHAR_TYPE constant and the encoding as the attribute value.
Notes
- Having the client system use a non-Unicode locale (such as setting LANG=C on Linux platforms) while using a Unicode locale for the connection to Vertica can result in errors such as "(10170) String data right truncation on data from data source" if data received from Vertica isn't in UTF-8 format. The driver allocates string memory based on the system's locale setting, and non-UTF-8 data can trigger an overrun. You can avoid these errors by always using a Unicode locale on the client system.
- If you specify a locale either in the connection string or in the DSN, the call to the connection function returns SQL_SUCCESS_WITH_INFO on a successful connection, with messages about the state of the locale.
- ODBC applications can be in either ANSI or Unicode mode:
  - If Unicode, the encoding used by ODBC is UCS-2.
  - If ANSI, the data must be in single-byte ASCII, which is compatible with UTF-8 on the database server.

  The ODBC driver converts UCS-2 to UTF-8 when passing data to the Vertica server, and converts data sent by the Vertica server from UTF-8 to UCS-2.
- If the end-user application is not already in UCS-2, the application is responsible for converting the input data to UCS-2; otherwise, unexpected results can occur. For example:
  - Non-UCS-2 data passed to ODBC APIs is interpreted as UCS-2, which can produce an invalid UCS-2 symbol and result in errors.
  - Alternatively, the symbol provided in the alternate encoding can happen to be a valid UCS-2 symbol, in which case incorrect data is inserted into the database.
- ODBC applications should set the correct server session locale using SQLSetConnectAttr() (if it differs from the database-wide setting) so that collation and string functions behave properly on the server.
The following example code demonstrates setting the locale using both the connection string and the SQLSetConnectAttr() function.