Character encoding bug?

Hello Jules,
I am using a VC2008 Express built version of the old Jucer, since I have just updated my JUCE checkout and discovered that the binaries are no longer distributed.
I have simply opened and then saved a number of .cpp files to have the Jucer automatically convert addButtonListener() calls to addListener(). But some files now contain incorrect characters in C++ comments.

Before: // la correlazione accordo->scale è incompatibile con quella scala->accordi
After: // la correlazione accordo->scale incompatibile con quella scala->accordi

It looks like an encoding problem…

I’m also unable to open a single .cpp file, previously generated with the Jucer, which it now refuses with a “This wasn’t a valid Jucer .cpp file…” message. Stepping through the Jucer code with the debugger, it seems that the second parameter in the call
int startLine = indexOfLineStartingWith (lines, T("BEGIN_JUCER_METADATA"), 0);
at line #470 of jucer_JucerDocument.cpp gets translated to an empty string. So startLine becomes -1 and the entire loadDocumentFromFile() procedure fails.

The jucer expects the files to be in utf-8 - and since I’ve made the string classes more robust, I guess that the new version is rejecting any illegal characters that it finds, to make its input valid utf-8. Are you using the latest tip? (if not, I think the utf-8 parser in the 1.52 release would stop when it encountered an illegal character, but in the most recent check-ins, it ignores it and carries on)
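To illustrate the two behaviours described above, here is a minimal sketch in plain C++. It is not the JUCE parser: the function names are hypothetical, and the validity check is simplified (it does not reject overlong forms or surrogates). It assumes the original file was saved in a single-byte encoding such as Latin-1, where "è" is the lone byte 0xE8, which is not a valid UTF-8 sequence on its own.

```cpp
#include <cstddef>
#include <string>

// Returns the length (1-4) of the well-formed UTF-8 sequence starting at s[i],
// or 0 if the bytes there do not form one. Simplified check: overlong forms
// and surrogates are not rejected.
static std::size_t utf8SequenceLength (const std::string& s, std::size_t i)
{
    const unsigned char lead = static_cast<unsigned char> (s[i]);
    std::size_t len = 0;

    if (lead < 0x80)                len = 1;   // plain ASCII
    else if ((lead & 0xE0) == 0xC0) len = 2;
    else if ((lead & 0xF0) == 0xE0) len = 3;
    else if ((lead & 0xF8) == 0xF0) len = 4;
    else return 0;                             // stray continuation or invalid lead byte

    if (i + len > s.size())
        return 0;                              // sequence runs past the end of the input

    for (std::size_t k = 1; k < len; ++k)
        if ((static_cast<unsigned char> (s[i + k]) & 0xC0) != 0x80)
            return 0;                          // expected a continuation byte (10xxxxxx)

    return len;
}

// Lenient behaviour (hypothetical name): skip each invalid byte and carry on.
std::string dropInvalidUtf8 (const std::string& in)
{
    std::string out;

    for (std::size_t i = 0; i < in.size();)
    {
        const std::size_t len = utf8SequenceLength (in, i);

        if (len == 0) { ++i; continue; }

        out.append (in, i, len);
        i += len;
    }

    return out;
}

// Strict behaviour (hypothetical name): stop at the first invalid byte.
std::string truncateAtInvalidUtf8 (const std::string& in)
{
    std::string out;

    for (std::size_t i = 0; i < in.size();)
    {
        const std::size_t len = utf8SequenceLength (in, i);

        if (len == 0)
            break;

        out.append (in, i, len);
        i += len;
    }

    return out;
}
```

With a Latin-1 input, the lenient version silently loses the accented letter, which matches the Before/After comment in the first post; a file that already holds "è" as the two UTF-8 bytes 0xC3 0xA8 passes through unchanged.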

OK, now I understand, I didn’t know about the UTF-8 requirement, I believe this is a recent change, isn’t it?
I’m using the latest tip.
It looks like the Jucer saves its files without a BOM. Why? My WinMerge app is having trouble with this…

Because BOMs aren’t portable - some compilers can’t build files that use them. It’s annoying, I know, but in C++ the only truly portable source files contain nothing but ASCII characters (values 0 to 127), and encode unicode string literals as utf-8 using escape sequences.
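As a rough sketch of what that means in practice (plain C++, not JUCE code; all names here are hypothetical), the snippet below checks a buffer for the UTF-8 BOM, checks whether it is pure ASCII, and shows the "è" from the comment above written as escaped UTF-8 bytes so that the source file itself stays ASCII-only.

```cpp
#include <cstddef>
#include <string>

// The UTF-8 byte order mark is the three bytes EF BB BF at the start of a file.
bool startsWithUtf8Bom (const std::string& bytes)
{
    return bytes.size() >= 3
        && static_cast<unsigned char> (bytes[0]) == 0xEF
        && static_cast<unsigned char> (bytes[1]) == 0xBB
        && static_cast<unsigned char> (bytes[2]) == 0xBF;
}

// True if every byte is in the ASCII range 0..127.
bool isPureAscii (const std::string& bytes)
{
    for (std::size_t i = 0; i < bytes.size(); ++i)
        if (static_cast<unsigned char> (bytes[i]) > 127)
            return false;

    return true;
}

// "è" (U+00E8) is encoded in UTF-8 as the two bytes C3 A8. Writing them as
// hex escapes keeps this source file free of non-ASCII characters.
const char* const eGraveUtf8 = "\xc3\xa8";
```

A check like isPureAscii() could be run over source files before committing them, to spot comments or literals that would depend on the file's encoding.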