MAKEUTF8
Coerces a string to UTF-8 by removing or replacing non-UTF-8 characters.
MAKEUTF8 flags invalid UTF-8 characters byte by byte. For example, the byte sequence 0xE0 0x7F 0x80 is an invalid three-byte UTF-8 sequence, but the middle byte, 0x7F, is a valid one-byte UTF-8 character. In this example, 0x7F is preserved and the other two bytes are removed or replaced.
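The byte-by-byte rule above can be modeled outside the database. The following Python sketch is an illustration only, not the function's actual implementation: the name make_utf8 and its signature are hypothetical, validity is assumed to match Python's strict UTF-8 decoder, and the VARCHAR(16) limit on the replacement string is not enforced.

```python
def _valid_char_len(data: bytes, i: int) -> int:
    """Return the length (1-4) of a valid UTF-8 character starting at i, or 0."""
    for length in (1, 2, 3, 4):
        chunk = data[i:i + length]
        if len(chunk) < length:
            return 0  # ran out of bytes before finding a valid character
        try:
            chunk.decode("utf-8")  # strict decoding; raises on invalid input
        except UnicodeDecodeError:
            continue
        return length
    return 0


def make_utf8(data: bytes, replacement_string: str = "") -> str:
    """Hypothetical model: keep valid characters, drop or replace each bad byte."""
    out = []
    i = 0
    while i < len(data):
        n = _valid_char_len(data, i)
        if n:
            out.append(data[i:i + n].decode("utf-8"))
            i += n
        else:
            # Flag exactly one byte, then resume scanning at the next byte.
            out.append(replacement_string)
            i += 1
    return "".join(out)


sample = bytes([0xE0, 0x7F, 0x80])
print(repr(make_utf8(sample)))       # '\x7f'   (invalid bytes removed)
print(repr(make_utf8(sample, "^")))  # '^\x7f^' (each invalid byte replaced)
```

Because only one byte is skipped after a failure, the valid byte 0x7F in the middle of the broken sequence survives, which matches the example above.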
Syntax
MAKEUTF8( string-expression [USING PARAMETERS param=value] );
Arguments
string-expression
- The string expression to evaluate for non-UTF-8 characters
Parameters
replacement_string
- Specifies the VARCHAR(16) string that MAKEUTF8 uses to replace each non-UTF-8 character that it finds in string-expression. If this parameter is omitted, non-UTF-8 characters are removed. For example, the following SQL replaces all non-UTF-8 characters in the name column with the string ^:
=> SELECT MAKEUTF8(name USING PARAMETERS replacement_string='^') FROM people;