Issues parsing an XML file with gb2312 encoding


#1

Hi,

When I use the XmlDocument class to parse an XML file that contains Chinese text and uses the gb2312 encoding, it always produces the error message “unmatched tags”.

The Chinese text also comes out unrecognizable, which seems very strange to me.

The relevant part of the XML file is:

<?xml version="1.0" encoding="gb2312"?>

窗口(window)

Cheers
Warren


#2


I tested parsing the same file with Markup, and it works, so I don’t think the file itself is at fault.


#3

Yes, the parser ignores the encoding declaration, because the only file encoding it can read is UTF-8.

Of course, if you can use some other mechanism to read the file into a (unicode) String, it’ll happily parse that.

TBH it should probably have asserted to let you know that it didn’t support it, I’ll add that to avoid confusion…
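Until something like that is in the parser, a check along these lines can be done before handing the text over. This is only a minimal sketch in plain C++: `declaredXmlEncoding` and `isSafeForUtf8OnlyParser` are hypothetical helper names, not part of JUCE, and it assumes that a missing encoding declaration means the text is already UTF-8.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of JUCE: returns the encoding named in the
// XML declaration, lower-cased, or an empty string if none is declared.
std::string declaredXmlEncoding(const std::string& xml)
{
    // Only look inside the declaration itself, i.e. "<?xml ... ?>".
    if (xml.compare(0, 5, "<?xml") != 0)
        return {};

    const auto declEnd = xml.find("?>");
    if (declEnd == std::string::npos)
        return {};

    const std::string decl = xml.substr(0, declEnd);

    const auto key = decl.find("encoding");
    if (key == std::string::npos)
        return {};

    // The value may be wrapped in single or double quotes.
    const auto open = decl.find_first_of("\"'", key);
    if (open == std::string::npos)
        return {};

    const auto close = decl.find(decl[open], open + 1);
    if (close == std::string::npos)
        return {};

    std::string name = decl.substr(open + 1, close - open - 1);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return name;
}

// Assumption: no declared encoding is treated as UTF-8.
// Returns true if the text can go to a UTF-8-only parser without conversion.
bool isSafeForUtf8OnlyParser(const std::string& xml)
{
    const std::string enc = declaredXmlEncoding(xml);
    return enc.empty() || enc == "utf-8" || enc == "utf8" || enc == "us-ascii";
}
```

For the file in the first post, `declaredXmlEncoding` would report `gb2312`, so the caller would know to convert the bytes to unicode by some other means before parsing.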


#4

Great, thanks jules.

It would be nice for JUCE to support other encodings. I hope it gets that feature in the future. :slight_smile:


#5

No, I probably won’t add support for encodings. To do so would require at least a few days to code and test it, and would probably involve more code than the entire XML parser! Since nobody else has ever asked for it before, I really can’t justify the effort!