Hello, Jucers.
I have an interesting issue regarding UTF-8 where I feel that the behavior of Juce has changed, and is now incorrect.
It used to be that if CharPointer_UTF8::isValidString(p, size) returned false, you were guaranteed that String::fromUTF8(p, size) would fail. So whenever I convert from strings (e.g. from files), I always check CharPointer_UTF8::isValidString() first.
I added a lot of translated text to my program, and a lot of it seems to fail CharPointer_UTF8::isValidString(), yet each string looks like perfectly good UTF-8. And now, if I ignore the result of CharPointer_UTF8::isValidString(), I get a correct string!
What’s going on here?
inline String str(const string& s) {
    const char* p = s.c_str();
    size_t size = s.size();
    bool valid = CharPointer_UTF8::isValidString(p, size);
    if (!valid) {
        LOG(ERROR) << "Badly encoded string |" << s << "| " << s.size();
        LOG(ERROR) << s[0] << ", " << s[1];
        valid = true;  // HACK - IGNORE THE FACT THAT THIS STRING IS BAD!
    }
    return valid ? String::fromUTF8(p, size) : "(badly encoded string)";
}
With the result:
I0819 23:36:21.582591 2956513280 Instance.cpp:267] registered
E0819 23:36:21.731360 2696910144 Juce.h:101] Badly encoded string |Öffnen Sie den letzten| 23
E0819 23:36:21.731408 2696910144 Juce.h:102] \303, \226
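As a sanity check that does not depend on Juce at all, here is a minimal sketch of a stand-alone well-formedness test. Note that isWellFormedUtf8 is a hypothetical helper written just for this post, not a Juce function, and it only follows the standard UTF-8 byte layout as I understand it (lead byte, continuation bytes, no overlong forms, no surrogates). The two bytes logged above, \303 \226 (0xC3 0x96), are a two-byte lead followed by a continuation byte, which decodes to U+00D6 (Ö), so I would expect the whole 23-byte string to pass.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (NOT part of Juce): returns true if s is well-formed UTF-8.
inline bool isWellFormedUtf8(const std::string& s) {
    const auto* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const unsigned long minForLength[] = {0x0, 0x80, 0x800, 0x10000};

    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = p[i];
        std::size_t extra = 0;   // number of continuation bytes expected
        unsigned long cp = 0;    // code point being decoded

        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false;       // stray continuation byte or invalid lead

        if (n - i - 1 < extra) return false;  // sequence is truncated

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = p[i + k];
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < minForLength[extra]) return false;       // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                  // beyond Unicode range

        i += extra + 1;
    }
    return true;
}
```

Calling isWellFormedUtf8("\303\226ffnen Sie den letzten") with the same bytes as in the log accepts the string, which makes me think the data itself is fine and the difference lies in how isValidString() is treating it.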