Functions | |
size_t | charCount (const char *utf8str) |
Count the number of UTF-8 chars in the string. | |
UInt16 | UTF8toUCS2 (const char *utf8char) |
Convert one UTF-8 char (possibly multiple bytes) into a UCS2 16-bit char. | |
String | UCS2toUTF8 (UInt16 ucs2char) |
Convert one UCS2 16-bit char into a UTF-8 char (possibly multiple bytes). | |
UInt32 | UTF8toUCS4 (const char *utf8char) |
Convert one UTF-8 char (possibly multiple bytes) into a UCS4 32-bit char. | |
String | UCS4toUTF8 (UInt32 ucs4char) |
Convert one UCS4 32-bit char into a UTF-8 char (possibly multiple bytes). | |
void | UCS4toUTF8 (UInt32 ucs4char, StringBuffer &sb) |
Convert one UCS4 32-bit char into a UTF-8 char (possibly multiple bytes). This version is faster to use in a loop than the version which returns a String. | |
void | UCS4toUTF8 (UInt32 ucs4char, char *p) |
Array< UInt16 > | StringToUCS2Common (const String &input, bool throwException) |
Array< UInt16 > | StringToUCS2ReplaceInvalid (const String &input) |
Convert a UTF-8 (or ASCII) string into a UCS2 string. | |
Array< UInt16 > | StringToUCS2 (const String &input) |
Convert a UTF-8 (or ASCII) string into a UCS2 string. | |
String | UCS2ToString (const void *input, size_t inputLength) |
Convert a UCS2 string into a UTF-8 (or ASCII) string. | |
String | UCS2ToString (const Array< UInt16 > &input) |
Convert a UCS2 string into a UTF-8 (or ASCII) string. | |
String | UCS2ToString (const Array< char > &input) |
Convert a UCS2 string into a UTF-8 (or ASCII) string. | |
int | UTF8CharLen (UInt32 ucs4char) |
template<typename TransformT> | |
bool | transformInPlace (char *input, TransformT transformer) |
template<typename TransformT> | |
String | transform (const char *input, TransformT transformer) |
bool | toUpperCaseInPlace (char *input) |
Convert the UTF-8 string to upper case. | |
String | toUpperCase (const char *input) |
Convert the UTF-8 string to upper case and return the result. | |
bool | toLowerCaseInPlace (char *input) |
Convert the UTF-8 string to lower case. | |
String | toLowerCase (const char *input) |
Convert the UTF-8 string to lower case and return the result. | |
int | compareToIgnoreCase (const char *str1, const char *str2) |
Compare two UTF-8 strings, ignoring any case differences as defined by the Unicode CaseFolding.txt file. | |
Variables | |
UInt8 | SequenceLengthTable [256] |
const CaseMapping | lowerMappings [] |
const CaseMapping | upperMappings [] |
const CaseMapping *const | lowerMappingsEnd |
const CaseMapping *const | upperMappingsEnd |
|
Count the number of UTF-8 chars in the string. This may differ from the number of bytes (as would be returned by strlen()). If utf8str is not a valid UTF-8 string, the result is undefined.
Definition at line 102 of file UTF8Utils.cpp. References BLOCXX_ASSERT. Referenced by BLOCXX_NAMESPACE::String::UTF8Length(). |
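The difference between the character count and the byte count can be illustrated with a minimal sketch. `countUtf8CodePoints` is a hypothetical stand-in, not the BloCxx implementation of charCount(); it assumes a valid, NUL-terminated UTF-8 string and counts every byte that is not a continuation byte (bit pattern `10xxxxxx`).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration only; not the library's charCount().
// Assumes utf8str is valid, NUL-terminated UTF-8.
std::size_t countUtf8CodePoints(const char* utf8str)
{
    assert(utf8str != nullptr);
    std::size_t count = 0;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(utf8str);
    for (; *p != 0; ++p)
    {
        // Continuation bytes look like 10xxxxxx; every other byte starts
        // a new character.
        if ((*p & 0xC0) != 0x80)
        {
            ++count;
        }
    }
    return count;
}
```

For the six-byte string "héllo" (where "é" is encoded as 0xC3 0xA9), this sketch reports 5 characters while strlen() reports 6.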
|
Compare two UTF-8 strings, ignoring any case differences as defined by the Unicode CaseFolding.txt file.
Definition at line 43 of file UTF8UtilscompareToIgnoreCase.cpp. Referenced by BLOCXX_NAMESPACE::String::compareToIgnoreCase(), and BLOCXX_NAMESPACE::String::endsWith(). |
|
Convert a UTF-8 (or ASCII) string into a UCS2 string.
Definition at line 371 of file UTF8Utils.cpp. References StringToUCS2Common(). |
|
Definition at line 263 of file UTF8Utils.cpp. References BLOCXX_ASSERT, BLOCXX_THROW, BLOCXX_NAMESPACE::String::c_str(), BLOCXX_NAMESPACE::String::length(), BLOCXX_NAMESPACE::Array< T >::push_back(), and SequenceLengthTable. Referenced by StringToUCS2(), and StringToUCS2ReplaceInvalid(). |
|
Convert a UTF-8 (or ASCII) string into a UCS2 string. Invalid characters will be changed to U+FFFD (the Unicode Replacement character).
Definition at line 365 of file UTF8Utils.cpp. References StringToUCS2Common(). |
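The replacement behaviour can be sketched as follows. `toUcs2ReplaceInvalid` is a hypothetical stand-in built on `std::string` and `std::vector`, not the BloCxx implementation. How the library resynchronises after a bad byte is not documented here, so this sketch makes its own assumptions: a malformed or truncated sequence yields one U+FFFD and advances by one byte, while a well-formed sequence that cannot be stored in 16 bits (or is overlong or a surrogate) yields one U+FFFD for the whole sequence.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration; not the BloCxx implementation.
std::vector<std::uint16_t> toUcs2ReplaceInvalid(const std::string& input)
{
    const std::uint16_t kReplacement = 0xFFFD; // U+FFFD REPLACEMENT CHARACTER
    std::vector<std::uint16_t> out;
    std::size_t i = 0;
    while (i < input.size())
    {
        const unsigned char lead = static_cast<unsigned char>(input[i]);
        std::size_t len = 0; // 0 means "not a valid lead byte"
        std::uint32_t cp = 0;
        if (lead < 0x80)                       { len = 1; cp = lead; }
        else if (lead >= 0xC2 && lead <= 0xDF) { len = 2; cp = lead & 0x1Fu; }
        else if (lead >= 0xE0 && lead <= 0xEF) { len = 3; cp = lead & 0x0Fu; }
        else if (lead >= 0xF0 && lead <= 0xF4) { len = 4; cp = lead & 0x07u; }

        // Well formed: the sequence is complete and every trailing byte is
        // a continuation byte (10xxxxxx).
        bool wellFormed = (len != 0) && (i + len <= input.size());
        for (std::size_t k = 1; wellFormed && k < len; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(input[i + k]);
            if ((c & 0xC0) != 0x80) { wellFormed = false; }
            else                    { cp = (cp << 6) | (c & 0x3Fu); }
        }

        if (!wellFormed)
        {
            // Assumption of this sketch: skip one byte and try again.
            out.push_back(kReplacement);
            i += 1;
            continue;
        }

        const bool overlong  = (len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000);
        const bool surrogate = (cp >= 0xD800 && cp <= 0xDFFF);
        const bool fitsUcs2  = (cp <= 0xFFFF);
        out.push_back((overlong || surrogate || !fitsUcs2)
                          ? kReplacement
                          : static_cast<std::uint16_t>(cp));
        i += len;
    }
    return out;
}
```

A throwing variant, as suggested by the `throwException` parameter of StringToUCS2Common(), would raise an error at the points where this sketch pushes the replacement character.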
|
Convert the UTF-8 string to lower case and return the result.
Definition at line 2082 of file UTF8Utils.cpp. References lowerMappings, lowerMappingsEnd, and transform(). Referenced by BLOCXX_NAMESPACE::String::toLowerCase(). |
|
Convert the UTF-8 string to lower case. The string is modified in place. If a character is encountered whose replacement occupies a greater number of bytes than the original, processing will cease and false will be returned. The current implementation does not handle any of the special cases as defined in the Unicode SpecialCasing.txt file, and thus characters will not grow, so currently false will never be returned.
Definition at line 2076 of file UTF8Utils.cpp. References lowerMappings, lowerMappingsEnd, and transformInPlace(). Referenced by BLOCXX_NAMESPACE::String::toLowerCase(). |
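The "stop when a replacement would grow" rule can be sketched as below. Every name here (`encodedLength`, `decodeAt`, `encodeAt`, `mapInPlace`, `lowerLatin1`) is hypothetical, and the mapping covers only ASCII and Latin-1 capitals as a stand-in for the library's lowerMappings table; this is an illustration of the documented contract, not the BloCxx transformInPlace(). The input is assumed to be valid UTF-8.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Number of bytes needed to encode cp in UTF-8.
int encodedLength(std::uint32_t cp)
{
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Decode the sequence at p (assumed valid UTF-8) and report its byte length.
std::uint32_t decodeAt(const unsigned char* p, int& len)
{
    if (p[0] < 0x80) { len = 1; return p[0]; }
    if (p[0] < 0xE0) { len = 2; return ((p[0] & 0x1Fu) << 6) | (p[1] & 0x3Fu); }
    if (p[0] < 0xF0) { len = 3; return ((p[0] & 0x0Fu) << 12) | ((p[1] & 0x3Fu) << 6) | (p[2] & 0x3Fu); }
    len = 4;
    return ((p[0] & 0x07u) << 18) | ((p[1] & 0x3Fu) << 12) | ((p[2] & 0x3Fu) << 6) | (p[3] & 0x3Fu);
}

// Encode cp at p; the caller guarantees encodedLength(cp) bytes of room.
void encodeAt(std::uint32_t cp, unsigned char* p)
{
    static const unsigned char leadMark[] = { 0x00, 0x00, 0xC0, 0xE0, 0xF0 };
    const int len = encodedLength(cp);
    for (int k = len - 1; k > 0; --k)
    {
        p[k] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        cp >>= 6;
    }
    p[0] = static_cast<unsigned char>(leadMark[len] | cp);
}

// Apply mapCodePoint to each character in place. Returns false and stops as
// soon as a replacement would need more bytes than the original character.
template <typename MapT>
bool mapInPlace(char* input, MapT mapCodePoint)
{
    assert(input != nullptr);
    unsigned char* read = reinterpret_cast<unsigned char*>(input);
    unsigned char* write = read; // never runs ahead of read
    while (*read != 0)
    {
        int oldLen = 0;
        const std::uint32_t mapped = mapCodePoint(decodeAt(read, oldLen));
        const int newLen = encodedLength(mapped);
        if (newLen > oldLen)
        {
            // Close any gap left by earlier, shorter replacements, then stop.
            std::memmove(write, read, std::strlen(reinterpret_cast<char*>(read)) + 1);
            return false;
        }
        encodeAt(mapped, write);
        write += newLen;
        read += oldLen;
    }
    *write = 0;
    return true;
}

// Stand-in mapping: ASCII and Latin-1 capitals only.
std::uint32_t lowerLatin1(std::uint32_t cp)
{
    if (cp >= 0x41 && cp <= 0x5A) { return cp + 32; }
    if (cp >= 0xC0 && cp <= 0xDE && cp != 0xD7) { return cp + 32; }
    return cp;
}
```

With this stand-in mapping every replacement has the same encoded length as the original, which mirrors the situation the description calls out: as long as no mapping grows a character, false is never returned.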
|
Convert the UTF-8 string to upper case and return the result.
Definition at line 2070 of file UTF8Utils.cpp. References transform(), upperMappings, and upperMappingsEnd. Referenced by BLOCXX_NAMESPACE::String::toUpperCase(). |
|
Convert the UTF-8 string to upper case. The string is modified in place. If a character is encountered whose replacement occupies a greater number of bytes than the original, processing will cease and false will be returned. The current implementation does not handle any of the special cases as defined in the Unicode SpecialCasing.txt file, and thus characters will not grow, so currently false will never be returned.
Definition at line 2064 of file UTF8Utils.cpp. References transformInPlace(), upperMappings, and upperMappingsEnd. Referenced by BLOCXX_NAMESPACE::String::toUpperCase(). |
|
Definition at line 467 of file UTF8Utils.cpp. References BLOCXX_NAMESPACE::StringBuffer::releaseString(), SequenceLengthTable, UCS4toUTF8(), and UTF8toUCS4(). Referenced by toLowerCase(), and toUpperCase(). |
|
Definition at line 427 of file UTF8Utils.cpp. References SequenceLengthTable, UCS4toUTF8(), UTF8CharLen(), and UTF8toUCS4(). Referenced by toLowerCaseInPlace(), and toUpperCaseInPlace(). |
|
Convert a UCS2 string into a UTF-8 (or ASCII) string.
Definition at line 397 of file UTF8Utils.cpp. References BLOCXX_NAMESPACE::Array< T >::size(), and UCS2ToString(). |
|
Convert a UCS2 string into a UTF-8 (or ASCII) string.
Definition at line 392 of file UTF8Utils.cpp. References BLOCXX_NAMESPACE::Array< T >::size(), and UCS2ToString(). |
|
Convert a UCS2 string into a UTF-8 (or ASCII) string.
Definition at line 377 of file UTF8Utils.cpp. References i, and UCS4toUTF8(). Referenced by UCS2ToString(). |
|
Convert one UCS2 16-bit char into a UTF-8 char (possibly multiple bytes).
Definition at line 133 of file UTF8Utils.cpp. References UCS4toUTF8(). Referenced by BLOCXX_NAMESPACE::Char16::toString(). |
|
Definition at line 235 of file UTF8Utils.cpp. |
|
Convert one UCS4 32-bit char into a UTF-8 char (possibly multiple bytes). This version is faster to use in a loop than the version which returns a String.
Definition at line 205 of file UTF8Utils.cpp. |
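Why an appending overload suits loops can be shown with a small sketch that uses `std::string` in place of StringBuffer. `appendUtf8` and `ucs2ToUtf8` are hypothetical names, and the code assumes `cp` is a valid code point no greater than U+10FFFF; it is not the library's implementation. Appending into one buffer avoids constructing a temporary string per character.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: append the UTF-8 encoding of cp to out.
// Assumes cp is a valid code point (at most U+10FFFF).
void appendUtf8(std::uint32_t cp, std::string& out)
{
    if (cp < 0x80)
    {
        out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Loop usage: convert a sequence of 16-bit units into one UTF-8 string,
// reusing a single output buffer for every character.
std::string ucs2ToUtf8(const std::vector<std::uint16_t>& input)
{
    std::string out;
    out.reserve(input.size());
    for (std::uint16_t unit : input)
    {
        appendUtf8(unit, out);
    }
    return out;
}
```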
|
Convert one UCS4 32-bit char into a UTF-8 char (possibly multiple bytes).
Definition at line 197 of file UTF8Utils.cpp. References BLOCXX_NAMESPACE::StringBuffer::releaseString(). Referenced by transform(), transformInPlace(), UCS2ToString(), and UCS2toUTF8(). |
|
Definition at line 406 of file UTF8Utils.cpp. Referenced by transformInPlace(). |
|
Convert one UTF-8 char (possibly multiple bytes) into a UCS2 16-bit char.
Definition at line 120 of file UTF8Utils.cpp. References UTF8toUCS4(). Referenced by BLOCXX_NAMESPACE::Char16::Char16(). |
|
Convert one UTF-8 char (possibly multiple bytes) into a UCS4 32-bit char.
Definition at line 139 of file UTF8Utils.cpp. References BLOCXX_ASSERT, and SequenceLengthTable. Referenced by transform(), transformInPlace(), and UTF8toUCS2(). |
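Decoding a single character can be sketched as follows. `decodeUtf8Char` and the `kInvalid` sentinel are hypothetical: what UTF8toUCS4() returns for malformed input is not stated here, so the sentinel is purely a choice made for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Sentinel chosen for this sketch only; not a documented library value.
const std::uint32_t kInvalid = 0xFFFFFFFFu;

// Hypothetical helper: decode the first character of a NUL-terminated
// UTF-8 string into a 32-bit code point.
std::uint32_t decodeUtf8Char(const char* utf8char)
{
    assert(utf8char != nullptr);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(utf8char);
    int len = 0;
    std::uint32_t cp = 0;
    if (p[0] < 0x80)                       { return p[0]; }
    else if (p[0] >= 0xC2 && p[0] <= 0xDF) { len = 2; cp = p[0] & 0x1Fu; }
    else if (p[0] >= 0xE0 && p[0] <= 0xEF) { len = 3; cp = p[0] & 0x0Fu; }
    else if (p[0] >= 0xF0 && p[0] <= 0xF4) { len = 4; cp = p[0] & 0x07u; }
    else                                   { return kInvalid; }

    for (int k = 1; k < len; ++k)
    {
        // A NUL terminator is not a continuation byte, so this check also
        // stops the loop before it can read past the end of the string.
        if ((p[k] & 0xC0) != 0x80)
        {
            return kInvalid;
        }
        cp = (cp << 6) | (p[k] & 0x3Fu);
    }
    return cp;
}
```

A 16-bit variant in the spirit of UTF8toUCS2() could call such a decoder and then narrow the result, with the caller deciding what to do about code points above U+FFFF.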
|
Definition at line 497 of file UTF8Utils.cpp. Referenced by toLowerCase(), and toLowerCaseInPlace(). |
|
Initial value: lowerMappings + (sizeof(lowerMappings)/sizeof(lowerMappings[0])) Definition at line 2022 of file UTF8Utils.cpp. Referenced by toLowerCase(), and toLowerCaseInPlace(). |
|
Initial value: { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } Definition at line 82 of file UTF8Utils.cpp. Referenced by StringToUCS2Common(), transform(), transformInPlace(), and UTF8toUCS4(). |
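The table is indexed by the first byte of a sequence and yields the sequence length in bytes, with 0 marking bytes that cannot start a character. As a reading aid, a table of the same shape can be generated from the conventional UTF-8 lead-byte ranges. `buildSequenceLengthTable` is a hypothetical helper, and the ranges below are an assumption that should be checked against the initial value listed above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: build a lead-byte lookup table from the conventional
// UTF-8 ranges. Entry value = sequence length in bytes, 0 = not a lead byte.
std::array<std::uint8_t, 256> buildSequenceLengthTable()
{
    std::array<std::uint8_t, 256> table{}; // all entries start at 0
    for (std::size_t b = 0x00; b <= 0x7F; ++b) { table[b] = 1; } // ASCII
    // 0x80-0xBF are continuation bytes; 0xC0-0xC1 would only start
    // overlong encodings, so they stay 0.
    for (std::size_t b = 0xC2; b <= 0xDF; ++b) { table[b] = 2; }
    for (std::size_t b = 0xE0; b <= 0xEF; ++b) { table[b] = 3; }
    for (std::size_t b = 0xF0; b <= 0xF4; ++b) { table[b] = 4; }
    // 0xF5-0xFF would encode values above U+10FFFF, so they stay 0.
    return table;
}
```

A decoder can then read `table[leadByte]` once to learn how many bytes to consume, and treat a result of 0 as an invalid starting byte.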
|
Definition at line 1255 of file UTF8Utils.cpp. Referenced by toUpperCase(), and toUpperCaseInPlace(). |
|
Initial value: upperMappings + (sizeof(upperMappings)/sizeof(upperMappings[0])) Definition at line 2025 of file UTF8Utils.cpp. Referenced by toUpperCase(), and toUpperCaseInPlace(). |