ZipFile::Builder and Unicode filenames on Windows


#1

If ZipFile::Builder is used to zip up files with non-ASCII filenames, the resulting zip extracts without problems on OS X (with all filenames intact) but produces corrupt filenames when extracted on Windows (I'm using WinRAR, but I suspect other extractors will exhibit the same problem).

It seems that OS X is happy to interpret the filenames as UTF-8 (since that's its native format), but Windows wants to treat them as ASCII or some legacy code page, so the non-ASCII characters come out wrong.

The good news is that I've been able to fix this easily. The PKWARE zip format specification says that bit 11 of the general purpose bit flag is the 'Language encoding flag (EFS). If this bit is set, the filename and comment fields for this file MUST be encoded using UTF-8'. ZipFile::Builder already writes the names as UTF-8, so it only needs to set that bit.

So without further ado:

 

void writeFlagsAndSizes (OutputStream& target) const
{
    target.writeShort (10);       // version needed to extract
    target.writeShort (1 << 11);  //< THIS IS THE CHANGED LINE (general purpose flags: bit 11 = UTF-8 names)
    target.writeShort (compressionLevel > 0 ? (short) 8 : (short) 0);  // compression method: 8 = deflate, 0 = stored
    writeTimeAndDate (target);
    target.writeInt ((int) checksum);
    target.writeInt (compressedSize);
    target.writeInt ((int) file.getSize());
    target.writeShort ((short) (storedPathname.toUTF8().sizeInBytes() - 1));  // filename length in bytes, without the null terminator
    target.writeShort (0);        // extra field length
}
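In case it's useful, here is a small standalone sketch (plain C++, no JUCE) that gives the magic number a name and shows one way to set the bit only for names that actually contain non-ASCII bytes. The names zipFlagUtf8Names, containsNonAscii and generalPurposeFlagsFor are made up for illustration and are not part of ZipFile::Builder. Setting the bit unconditionally, as in the change above, is simpler and should be just as valid, since plain ASCII is also valid UTF-8.

```cpp
#include <cstdint>

// Bit 11 of the zip general purpose bit flag: Language encoding flag (EFS).
// (Illustrative name, not a JUCE identifier.)
constexpr std::uint16_t zipFlagUtf8Names = 1u << 11;   // 0x0800

// Hypothetical helper: returns true if the null-terminated UTF-8 string
// contains any byte outside the 7-bit ASCII range.
constexpr bool containsNonAscii (const char* utf8Name)
{
    for (; *utf8Name != 0; ++utf8Name)
        if (static_cast<unsigned char> (*utf8Name) > 0x7f)
            return true;

    return false;
}

// Hypothetical helper: the flag word to write for a given stored pathname.
// Only sets the UTF-8 bit when the name really needs it.
constexpr std::uint16_t generalPurposeFlagsFor (const char* utf8Name)
{
    return containsNonAscii (utf8Name) ? zipFlagUtf8Names
                                       : static_cast<std::uint16_t> (0);
}
```

With something like this, the changed line would become target.writeShort ((short) generalPurposeFlagsFor (storedPathname.toRawUTF8())), assuming storedPathname is a juce::String.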

 

Note : Not having much luck with the code formatter - if I paste code from Visual Studio then it ends up with double/triple line feeds that I can't delete.

 


#2

Ah, excellent stuff, thanks!