juce::parseXML takes a juce::String, but here you’re passing it a pointer to some raw binary data (hence the error message about creating a string from 8-bit data). You probably want to use juce::String::createStringFromData().
As for decoding the Unicode characters, you’ll possibly need to use juce::CharPointer_UTF8 (or one of the other char pointer classes) to parse them correctly.
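For anyone wondering why the assertion fires at all: as far as I understand it, building a juce::String from a plain `const char*` is only safe when every byte is 7-bit ASCII, because anything above 127 is ambiguous without a known encoding. Here is a minimal sketch of that kind of check in standard C++ only. `isSevenBitAscii` and `checkAsciiExamples` are hypothetical names for illustration, not JUCE functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of JUCE): true if every byte is 7-bit ASCII.
bool isSevenBitAscii (const std::string& bytes)
{
    for (unsigned char c : bytes)
        if (c > 0x7f)
            return false;

    return true;
}

// Small usage check.
void checkAsciiExamples()
{
    assert (isSevenBitAscii ("<root/>"));
    assert (! isSevenBitAscii ("\xD0\x9F")); // UTF-8 bytes of Cyrillic 'П' (U+041F)
}
```

A localisation file with Cyrillic text fails this kind of check, so the bytes have to be handed over together with an explicit encoding.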
Well, createStringFromData removes the assertion, but the Unicode characters in the XML still come out garbled.
And whatever I try, the text is not decoded properly, and I get the same garbled result:
How are you actually printing/viewing these string values? Are you using DBG, or std::cout, or drawing the strings into a Component, or something else?
I just tried encoding the xml file into binary data and then parsing it:
const auto xml = juce::XmlDocument::parse (juce::CharPointer_UTF8 (BinaryData::thefile_txt));
Then, I’m able to loop through the elements and print their strings with no issues:
for (const auto* element : xml->getChildIterator())
    DBG (element->getAllSubText());
I tried this on mac and windows, and it seemed to work correctly in both cases. Perhaps the method you are using to print the strings is not using the correct encoding.
I don’t think that’s true. The BinaryData generator just reads the bytes it is given and converts them to a C++ source file representing arrays of bytes. It’s up to the program reading bytes from those arrays to interpret and display the bytes using the correct encoding.
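To illustrate the point about bytes versus interpretation, here is a rough sketch in standard C++ only (no JUCE) that reads the same bytes two ways. The function names are made up for this example, and the UTF-8 decoder assumes well-formed input and does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Reads each byte as one Latin-1 character: the "wrong" reading for UTF-8 data.
std::u32string decodeAsLatin1 (const std::string& bytes)
{
    std::u32string out;

    for (unsigned char c : bytes)
        out.push_back (static_cast<char32_t> (c));

    return out;
}

// Minimal UTF-8 decoder. Assumes well-formed input; no validation is done.
std::u32string decodeAsUtf8 (const std::string& bytes)
{
    std::u32string out;
    std::size_t i = 0;

    while (i < bytes.size())
    {
        const auto lead = static_cast<unsigned char> (bytes[i]);
        int extra = 0;          // number of continuation bytes to read
        char32_t cp = lead;     // code point being assembled

        if      ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        // Otherwise: plain ASCII (or a stray byte), kept as a single unit.

        ++i;

        for (int k = 0; k < extra && i < bytes.size(); ++k, ++i)
            cp = (cp << 6) | (static_cast<unsigned char> (bytes[i]) & 0x3F);

        out.push_back (cp);
    }

    return out;
}
```

Given the four bytes D0 9F D1 80, the UTF-8 reading yields two code points (U+041F, U+0440), while the byte-per-character reading yields four unrelated characters. The embedded array is identical in both cases.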
I was going to point out that the binary data generator doesn’t read the XML file’s encoding property so just changing that property won’t affect how the embedded data is generated.
Looked to me like the file you provided was already UTF-8 encoded, and I also had no trouble getting it to display properly in a label.
@ImJimmi @reuk Thank you guys. This works:
xml = juce::parseXML(juce::CharPointer_UTF8(BinaryData::localization_ru_xml));
I think the key point was to use CharPointer_UTF8 when parsing. Before, I was only applying it after the XML had been loaded.