CharacterFunctions::isLetterOrDigit loses Unicode characters

VicFF · February 25, 2007, 9:58am

Hi!
Sorry for my English.
I tried to use the Jucer with Russian text in controls and ran into some problems.
At first, the saved files were incomplete.
I added this line:
setlocale(LC_ALL, "Russian_Russia.1251");
to the initialisation of JucerApplication.

After that, saving completed, but all the Russian text in the XML section was encoded like this:

labelText=&#207;&#240;&#232;&#226;&#229;&#242;

I solved that by changing the function like this:
bool CharacterFunctions::isLetterOrDigit (const char character)
{
    return isalnum ((unsigned char) character) != 0;
    /*
    return (character >= 'a' && character <= 'z')
        || (character >= 'A' && character <= 'Z')
        || (character >= '0' && character <= '9');
    */
}

Is this the right way to do it?
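
For comparison, here is a minimal standalone sketch of the two variants side by side. The free-function names are hypothetical and are not part of JUCE. The cast to unsigned char matters because passing a negative char value to isalnum is undefined behaviour, and the result for bytes above 0x7F depends on whichever C locale is currently active (for example, one set with setlocale).

```cpp
#include <cctype>

// ASCII-only test, equivalent to the commented-out original body.
// Any byte above 0x7F (e.g. a Windows-1251 Cyrillic letter) is rejected.
bool isAsciiLetterOrDigit (const char character)
{
    return (character >= 'a' && character <= 'z')
        || (character >= 'A' && character <= 'Z')
        || (character >= '0' && character <= '9');
}

// Locale-dependent test, as in the proposed change.
// The cast keeps the argument in the 0..255 range that isalnum requires.
// Whether a byte such as 0xCF counts as a letter depends on the active C locale.
bool isLocaleLetterOrDigit (const char character)
{
    return std::isalnum (static_cast<unsigned char> (character)) != 0;
}
```

Both functions agree on plain ASCII input; they can only differ for bytes above 0x7F, and only when a locale other than the default "C" locale has been selected.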

jules · February 25, 2007, 12:28pm

Yes, I’ll look into the locales stuff soon, and see if I can do a better job. The thing with XML is that it’s saved as 8-bit, so it puts the escape sequences in there just to make sure it always loads correctly.
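
To make the escape sequences concrete, here is a rough sketch that parses decimal references such as &#207; and interprets each value as a Windows-1251 byte. That interpretation is an assumption based on the "Russian_Russia.1251" locale mentioned above; under XML's own rules &#207; denotes the code point U+00CF, not the Cyrillic letter. The helper names are hypothetical and only ASCII plus the contiguous range 0xC0..0xFF is mapped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collects the numeric values of decimal references like "&#207;" found in text.
std::vector<unsigned> parseDecimalReferences (const std::string& text)
{
    std::vector<unsigned> values;
    std::size_t pos = 0;

    while ((pos = text.find ("&#", pos)) != std::string::npos)
    {
        pos += 2; // skip "&#"
        unsigned value = 0;
        bool sawDigit = false;

        while (pos < text.size() && text[pos] >= '0' && text[pos] <= '9')
        {
            value = value * 10 + static_cast<unsigned> (text[pos] - '0');
            sawDigit = true;
            ++pos;
        }

        // Accept the reference only if it is properly terminated.
        if (sawDigit && pos < text.size() && text[pos] == ';')
            values.push_back (value);
    }

    return values;
}

// Maps a Windows-1251 byte to a Unicode code point.
// Partial table: ASCII maps to itself, 0xC0..0xFF maps to U+0410..U+044F,
// and everything else is reported as U+FFFD (replacement character).
char32_t cp1251ToCodePoint (unsigned char byte)
{
    if (byte < 0x80)
        return byte;

    if (byte >= 0xC0)
        return static_cast<char32_t> (0x0410 + (byte - 0xC0));

    return 0xFFFD;
}
```

Running the sequence &#207;&#240;&#232;&#226;&#229;&#242; through these helpers gives U+041F U+0440 U+0438 U+0432 U+0435 U+0442, which spells "Привет". A reader that follows the XML rules literally would instead see Latin-1 range characters, so the file only round-trips if both sides apply the same byte interpretation.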

VicFF · February 25, 2007, 3:40pm

Actually, you need to support encodings in XML, especially Unicode encodings such as "utf-8".

<?xml version="1.0" encoding="utf-8" ?> Hello! Привет!
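
As an illustration of what writing such a file involves, here is a minimal sketch of UTF-8 encoding for Unicode code points. The function names are hypothetical and this is not how JUCE implements it; surrogate values and out-of-range code points are not validated.

```cpp
#include <string>

// Encodes one Unicode code point as 1 to 4 UTF-8 bytes.
// No validation: surrogates and values above U+10FFFF are not rejected.
std::string encodeUtf8 (char32_t cp)
{
    std::string out;

    if (cp < 0x80)
    {
        out += static_cast<char> (cp);
    }
    else if (cp < 0x800)
    {
        out += static_cast<char> (0xC0 | (cp >> 6));
        out += static_cast<char> (0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
        out += static_cast<char> (0xE0 | (cp >> 12));
        out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char> (0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char> (0xF0 | (cp >> 18));
        out += static_cast<char> (0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char> (0x80 | (cp & 0x3F));
    }

    return out;
}

// Encodes a whole UTF-32 string by concatenating the per-code-point encodings.
std::string encodeUtf8 (const std::u32string& text)
{
    std::string out;

    for (char32_t cp : text)
        out += encodeUtf8 (cp);

    return out;
}
```

With this encoding, each Cyrillic letter in "Привет" takes two bytes (П, U+041F, becomes 0xD0 0x9F), and the text stays readable in any editor that opens the file as UTF-8.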

jules · February 25, 2007, 4:05pm

Yeah, I know. It’s been on the to-do list for a while. The way it works at the moment means that the XML will always get correctly saved and re-loaded, but it’s not ideal because it’s less readable.

ptomaine · February 25, 2007, 4:14pm

Hi VicFF.
Actually, JUCE has many more problems with internationalization. I have been telling Julian about them for a long time. For example, you can't rely on its string handling, because JUCE strings handle letter case incorrectly (there are especially many such problems on Linux, where you can't type Russian letters into a TextComponent at all). I dealt with XML in a different way: I use libxml2, a very powerful C library, and wrote my own small C++ class for working with it. What bothered me most was that JUCE's XML doesn't support XPath (which I use heavily, it's very convenient), whereas libxml2 supports it at a level of quality comparable to Microsoft's. So if you want, I can send you my libxml2 wrapper classes. As for internationalization, Julian is already aware of the problems and I hope he is already fixing them.

ptomaine · February 25, 2007, 4:28pm

That’s not really true. When I was working on my TreeView component I ran into a big problem: my TreeView XmlElement items had Cyrillic tags. JUCE encoded them, but they were decoded incorrectly when they were retrieved later (memory → memory).
