Using unicode in Juce


#1

When the flag “JUCE_STRINGS_ARE_UNICODE” is set, does it mean all string variables in the app are now Unicode?

I’ve run into some problems here. I’m reading file info (artist, title, album…) from MP3 files, but can’t handle the “Far East characters” correctly.

String a;
File file(T("d:/testHaro/夏夕空.mp3"));  // an MP3 with Far East chars
if (file.existsAsFile())
{
    FileInputStream stream(file);
    int i = 0;  // tag and title

    stream.setPosition(stream.getTotalLength() - 128 + i);
    a += stream.readEntireStreamAsString();

    i = 33;  // artist start point
    stream.setPosition(stream.getTotalLength() - 128 + i);
    a += stream.readString() + " ";

    i = 63;  // album start point
    stream.setPosition(stream.getTotalLength() - 128 + i);
    a += stream.readEntireStreamAsString() + " ";

    i = 93;  // year start point
    stream.setPosition(stream.getTotalLength() - 128 + i);
    a += stream.readEntireStreamAsString() + " ";

    i = 97;  // comment start point
    stream.setPosition(stream.getTotalLength() - 128 + i);
    a += stream.readEntireStreamAsString() + " ";
}

The result is "TAGϦ?Т= =O#/Ϧ? 2008 "

readEntireStreamAsString() is defined as:

This will read from the stream’s current position until the end-of-stream, and will try to make an educated guess about whether it’s unicode or an 8-bit encoding.

But I fail to get the correct characters.
Can someone help me?

Thanks in advance.


#2

I had a similar problem with encoding from and to CP1250. JUCE will not handle this; you need to convert it manually, using iconv or some sort of conversion table. Since I only had to deal with CP1250, I wrote a simple conversion table and it worked. However, if you’re dealing with different encodings you’ll probably need iconv.
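
To give an idea of what such a conversion table can look like, here is a minimal sketch in plain C++ (not JUCE code). The names appendUtf8, sampleCp1250Table and cp1250ToUtf8 are made up for this illustration, and the table is deliberately incomplete: it lists only four CP1250 bytes, and any other byte above 0x7F becomes U+FFFD. A real converter would need all 128 high-half entries.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Append one code point (BMP only, which is all CP1250 needs) as UTF-8.
void appendUtf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80)
    {
        out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical, deliberately incomplete sample of the CP1250 high half.
const std::unordered_map<unsigned char, std::uint32_t>& sampleCp1250Table()
{
    static const std::unordered_map<unsigned char, std::uint32_t> table = {
        { 0x8A, 0x0160 },  // S with caron
        { 0x9A, 0x0161 },  // s with caron
        { 0xE8, 0x010D },  // c with caron
        { 0xF8, 0x0159 },  // r with caron
    };
    return table;
}

// Convert CP1250 bytes to UTF-8. Bytes below 0x80 are plain ASCII;
// bytes missing from the sample table become U+FFFD.
std::string cp1250ToUtf8(const std::string& bytes)
{
    const auto& table = sampleCp1250Table();
    std::string out;

    for (unsigned char b : bytes)
    {
        if (b < 0x80)
        {
            out += static_cast<char>(b);
            continue;
        }

        const auto it = table.find(b);
        appendUtf8(out, it != table.end() ? it->second : 0xFFFD);
    }

    return out;
}
```

The UTF-8 result could then be handed to whatever string class the app uses. For multi-byte encodings such as the CJK code pages a per-byte table like this is not enough, which is where iconv comes in.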


#3

Yes, they’ll all work fine internally. But what’s actually in the file you’re trying to read? Is it unicode, utf8 or some other code-page…? readEntireStreamAsString() isn’t very smart - if it doesn’t find unicode header byte markers, it’ll just pull it in as utf8.
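
One way to find out what is really in there is to cut the tag into its fields as raw bytes first and decide on the encoding afterwards. The sketch below is plain C++, not JUCE code, and it assumes the file ends with an ID3v1 tag, which is what the offsets in the first post suggest (3-byte "TAG" marker, then title, artist and album at 30 bytes each, a 4-byte year and a 30-byte comment). Id3v1Fields, sliceField and parseId3v1 are names invented for this example.

```cpp
#include <cstddef>
#include <string>

// Raw, still undecoded bytes of each fixed-width field.
struct Id3v1Fields
{
    std::string title, artist, album, year, comment;
};

// Copy one fixed-width field and strip its NUL / trailing-space padding.
std::string sliceField(const std::string& tag, std::size_t offset, std::size_t length)
{
    std::string field = tag.substr(offset, length);

    const std::size_t nul = field.find('\0');
    if (nul != std::string::npos)
        field.erase(nul);

    while (!field.empty() && field.back() == ' ')
        field.pop_back();

    return field;
}

// Expects exactly the last 128 bytes of the file.
// Returns false if they do not start with the "TAG" marker.
bool parseId3v1(const std::string& last128Bytes, Id3v1Fields& out)
{
    if (last128Bytes.size() != 128 || last128Bytes.compare(0, 3, "TAG") != 0)
        return false;

    out.title   = sliceField(last128Bytes, 3, 30);
    out.artist  = sliceField(last128Bytes, 33, 30);
    out.album   = sliceField(last128Bytes, 63, 30);
    out.year    = sliceField(last128Bytes, 93, 4);
    out.comment = sliceField(last128Bytes, 97, 30);
    return true;
}
```

Dumping the bytes of each field as hex should then show whether they are UTF-8 or some code page, and each field can be converted on its own instead of reading everything up to the end of the stream.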


#4

I want to read the artist, the album name etc…

Let’s say another case:

I want to get the file’s path from the commandLine when the app is initialized,

void initialise (const String& commandLine)
{
    helloWorldWindow = new HelloWorldWindow();
    helloWorldWindow->cpt->d = commandLine;
}

where d is another String.
Paths with CJK (Chinese, Japanese, Korean) characters can’t be loaded.
What conversion should I do?


#5

Ok, but that’s a completely different matter - if it’s not correctly reading the command line args, I’d need to look at that, but which platform is it?


#6

Well, Win32 on XP.


#7

The code deliberately gets the command-line params as unicode - you can see this if you look in PlatformUtilities::getCurrentCommandLineParams().

I can’t really see how anything so simple could go wrong, but you could try stepping through that and see what happens in there. Perhaps see if GetCommandLineW() is actually returning the characters you expect?
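
If the debugger or console can’t display the CJK characters, a small helper that prints each code unit as hex makes the comparison easier. This is only a sketch in plain C++; describeCodeUnits is a made-up name, not part of JUCE or the Win32 API.

```cpp
#include <cstdio>
#include <string>

// Format every wchar_t in the text as "U+XXXX", separated by spaces.
// On Windows wchar_t is a UTF-16 code unit, so characters outside the
// BMP would show up as two surrogate values.
std::string describeCodeUnits(const std::wstring& text)
{
    std::string out;

    for (wchar_t unit : text)
    {
        char buffer[16];
        std::snprintf(buffer, sizeof buffer, "U+%04lX ",
                      static_cast<unsigned long>(unit));
        out += buffer;
    }

    if (!out.empty())
        out.pop_back();  // drop the trailing space

    return out;
}

// Possible use on Windows (needs <windows.h>):
//   OutputDebugStringA(describeCodeUnits(GetCommandLineW()).c_str());
```

For the file name in the first post you would expect to see U+590F U+5915 U+7A7A somewhere in the output. If question marks (U+003F) appear there instead, the characters were already lost before the app received them.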