ISUTF8
Tests whether a string is a valid UTF-8 string. Returns true if the string conforms to the UTF-8 standard, and false otherwise. This function is useful for testing strings for UTF-8 compliance before passing them to one of the regular expression functions, such as REGEXP_LIKE, which expect UTF-8 characters by default.
ISUTF8 checks for invalid UTF-8 byte sequences, according to UTF-8 rules:
- invalid bytes
- an unexpected continuation byte
- a start byte not followed by enough continuation bytes
- an overlong encoding
The presence of an invalid UTF-8 byte sequence results in a return value of false.
To coerce a string to UTF-8, use MAKEUTF8.
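As a rough sketch of how the check might sit in front of a regular expression function, the queries below first list non-compliant rows, then guard REGEXP_LIKE with ISUTF8, and finally coerce values with MAKEUTF8 instead. The table `customer_notes`, its columns `note_id` and `note_text`, and the pattern are hypothetical names chosen for illustration; they do not come from this page.

```sql
-- Hypothetical table and column names, for illustration only.

-- Find rows that would fail UTF-8 validation.
SELECT note_id
FROM customer_notes
WHERE NOT ISUTF8(note_text);

-- Apply the regular expression only to valid UTF-8 strings.
-- CASE is used because SQL does not promise an evaluation order
-- for predicates joined with AND.
SELECT note_id
FROM customer_notes
WHERE CASE
        WHEN ISUTF8(note_text) THEN REGEXP_LIKE(note_text, '^ab+c')
        ELSE false
      END;

-- Alternatively, coerce every value to UTF-8 before matching.
SELECT note_id
FROM customer_notes
WHERE REGEXP_LIKE(MAKEUTF8(note_text), '^ab+c');
```

The guarded form skips invalid rows entirely, while the MAKEUTF8 form matches against the coerced text, so the two queries can return different rows.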
Syntax
ISUTF8( string );
Arguments
string
- The string to test for UTF-8 compliance.
Examples
=> SELECT ISUTF8(E'\xC2\xBF'); -- UTF-8 INVERTED QUESTION MARK
ISUTF8
--------
t
(1 row)
=> SELECT ISUTF8(E'\xC2\xC0'); -- UNDEFINED UTF-8 CHARACTER
ISUTF8
--------
f
(1 row)
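The four kinds of invalid sequence that ISUTF8 checks for can each be probed with a short byte string. The byte values below are illustrative choices, not taken from this page. By the stated rules each call is expected to return f, though these statements have not been run against a live database.

```sql
-- Each literal violates one UTF-8 rule; the expected result is f in every case.
SELECT ISUTF8(E'\xFF');          -- invalid byte: 0xFF never occurs in UTF-8
SELECT ISUTF8(E'\x80');          -- continuation byte with no preceding start byte
SELECT ISUTF8(E'\xE2\x82');      -- 3-byte start byte followed by only one continuation byte
SELECT ISUTF8(E'\xE0\x80\xAF');  -- overlong 3-byte encoding of U+002F ('/')
```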