I tried writing an xmlDocument into a zip file, but it crashes, any ideas?


#1

Hi!

writing an xmlDocument uncompressed works. I want to create a zip file with the document though, but that doesn’t work for me. Does the below code make sense or have I misunderstood something?

ScopedPointer<XmlElement> root = new XmlElement("Model");
(...)

// This works:
//bool success = root->writeToFile(*destination, String::empty);
//DBG("Model XML: " + String(success));

juce::String doc = root->createDocument(String::empty);
MemoryBlock mb(doc.toUTF8(), doc.length());
MemoryInputStream memstream(mb, false);
juce::ZipFile::Builder zipFile;
zipFile.addEntry(&memstream, 7, destination->getFileName(), Time::getCurrentTime());
juce::FileOutputStream stream(*destination);
        
// This crashes, something about: Invalid address specified to RtlValidateHeap( 025B0000, 00E0F320 )
zipFile.writeToStream(stream, nullptr);

// This works too...
//stream.write(mb.getData(), mb.getSize());

Thanks!


#2

I don’t know, if that’s the only problem, but UTF8 length is not the same as it’s byte count, because some characters are encoded as two chars:

Use String::getNumBytesAsUTF8() instead.


#3

Read about streamToRead!

    /** Adds a stream to the list of items which will be added to the archive.

        @param streamToRead this stream isn't read immediately - a pointer to the stream is
                            stored, then used later when the writeToStream() method is called, and
                            deleted by the Builder object when no longer needed, so be very careful
                            about its lifetime and the lifetime of any objects on which it depends!
                            This must not be null.
        @param compressionLevel     this can be between 0 (no compression), and 9 (maximum compression).
        @param storedPathName       the partial pathname that will be stored for this file
        @param fileModificationTime the timestamp that will be stored as the last modification time
                                    of this entry
    */
    void addEntry (InputStream* streamToRead, int compressionLevel,
                   const String& storedPathName, Time fileModificationTime);

It also looks like you’re using a pointer to a File. That’s not your problem in this case, but you should never heap-allocate objects like String, File, etc - doing so is worse in every way than using them as by-value objects!


#4

Jules, indeed, a bit of RTFM on addEntry is all that was needed!

Allocating MemoryInputStream on the heap was all it tool.

Thank you!


#5

…And Daniel, thank you for that catch! While it was working with my current data, there’s nothing to say that a user wouldn’t end characters that might have broken my code…


#6

I’m trying to add some values to an existing zip file. From what I’ve gathered, the best way to do this is to read the existing zip file, extract the elements inside into memory, then repack all of the elements + the element I want to add, and then replace the existing zipfile on disk.

I’m not sure if I’m doing anything wrong, but I’m getting a EXC_BAD_ACCESS inside ZipFile::Builder::Item::writeSource( OutputStream& target ), which is called when I finally try to write the file to disk:

juce::File z = folder.getChildFile(userEmail + ".db");
if( !z.exists() || z.getSize() == 0 ) {
    z.create();
}
ScopedPointer <FileInputStream> fz(z.createInputStream());
if( fz ) {
    ZipFile zip(fz, false);

    //unzip each of the existing midi files into a temp vector.  we'll add them back later
    struct UnzippedMidiFile {
        juce::String fileName;
        juce::MemoryBlock mb;
    };
    std::vector<UnzippedMidiFile> existingMidiFiles = {};
    for( int i = 0; i < zip.getNumEntries(); ++i ) {
        UnzippedMidiFile umf;
        umf.fileName = zip.getEntry(i)->filename; //use this with ZipFile::Builder::addEntry( InputStream, compressionLevel, storedPathName, fileModificationTime );
        ScopedPointer<InputStream> is = zip.createStreamForEntry(i);
        MemoryOutputStream mos(umf.mb, false);
        mos.writeFromInputStream(*is, zip.getEntry(i)->uncompressedSize);
        //while i'm here, I should probably make sure the umf.mos is still a valid midi file
        {
            MidiFile mf;
            mf.readFrom(*is);
            jassert(mf.getNumTracks()==1);
        }
        existingMidiFiles.push_back(umf);
    }

so far so good

    //create a new midi file from the data on the server
    MidiFile mf;
    {
        MemoryBlock mb;
        mb.loadFromHexString(payload.payload.midiData);
        MemoryInputStream mis(mb, false);
        mf.readFrom(mis);
        jassert(mf.getNumTracks()==1);
    }
    //turn it into an UnzippedMidiFile object
    UnzippedMidiFile curFile;
    curFile.fileName = sessionID;
    MemoryOutputStream mos(curFile.mb, false);
    mf.writeTo(mos);

    //store it with the existing midi files
    existingMidiFiles.push_back(curFile);

so far so good…

    //dump all of the existingMidiFiles back into a zip file
    ZipFile::Builder zipBuilder;
    for( int i = 0; i < existingMidiFiles.size(); ++i ) {
        ScopedPointer<MemoryInputStream> mis = new MemoryInputStream(existingMidiFiles[i].mb, false);
        zipBuilder.addEntry(mis, 9, existingMidiFiles[i].fileName, Time());
    }
    //write the zipfile to disk, replacing the old zipFile
    z.deleteFile();
    z.create();
    
    ScopedPointer<FileOutputStream> fos( z.createOutputStream() );
    double progress;
    zipBuilder.writeToStream(*fos, &progress); //this crashes, with a EXC_BAD_ACCESS

any ideas??? I’m stumped!!


#7
/** Adds a stream to the list of items which will be added to the archive.

    @param streamToRead this stream isn't read immediately - a pointer to the stream is
                        stored, then used later when the writeToStream() method is called, and
                        deleted by the Builder object when no longer needed, so be very careful
                        about its lifetime and the lifetime of any objects on which it depends!
                        This must not be null.
    @param compressionLevel     this can be between 0 (no compression), and 9 (maximum compression).
    @param storedPathName       the partial pathname that will be stored for this file
    @param fileModificationTime the timestamp that will be stored as the last modification time
                                of this entry
*/
void addEntry (InputStream* streamToRead, int compressionLevel,
               const String& storedPathName, Time fileModificationTime);

Note the lifetime of the stream you give it.


#8

I took care of that by making this change:

ZipFile::Builder zipBuilder;
for( int i = 0; i < existingMidiFiles.size(); ++i ) {
    MemoryInputStream* mis = new MemoryInputStream(existingMidiFiles[i].mb, false);
    zipBuilder.addEntry(mis, 9, existingMidiFiles[i].fileName, Time());
}

Sidebar, that whole MemoryInputStream* mis = new MemoryInputStream(existingMidiFiles[i].mb, false); zipBuilder.addEntry(mis, 9, existingMidiFiles[i].fileName, Time()); is really counter-intuitive to the whole JUCE mantra of avoiding the new keyword unless you’re using a ScopedPointer to maintain good RAII practices.

I no longer get the crash, but now my saved file doesn’t change size after the writeToStream call:

ScopedPointer<FileOutputStream> fos( z.createOutputStream() );
double progress;
zipBuilder.writeToStream(*fos, &progress);

When i stepped thru that writeToStream() with the debugger, the zipBuilder object had entries with valid sizes in it, so I know everything up until the writeToStream() is correct now. So, what am I doing wrong when I writeToStream()??


#9

Now that we’re finally all on C++11 and can assume everyone has move semantics, it’ll be possible for us to use ScopedPointers as function arguments to make it clear what’s going on, and which would have avoided you falling into this pitfall. But couldn’t have done that before, hence the need to RTFM for functions that take pointers.

God knows what else is wrong, but are you sure you’re not just appending onto the old file, as you’re not deleting it first?


#10

I did delete it first:

juce::File z = folder.getChildFile("userFile.zip");
if( !z.exists() || z.getSize() == 0 ) {
    z.create();
}
// a bunch of unzipping/copying the zip file contents into a ZipBuilder object goes here:
....

//write the zipfile to disk, replacing the old zipFile
z.deleteFile();
z.create();
ScopedPointer<FileOutputStream> fos( z.createOutputStream() );
double progress;
zipBuilder.writeToStream(*fos, &progress);

#11

ok, this worked:

MemoryOutputStream zippedMos;
zipBuilder.writeToStream(zippedMos, nullptr);
z.replaceWithData(zippedMos.getData(), zippedMos.getDataSize() );

instead of using zipBuilder.writeToStream() to try to write directly to the file, I wrote to a MemoryOutputStream object first, and then copied the MemoryOutputStream to the file 2nd.


#12

ok, here is the solution for appending to an existing Zip File, in 5 easy(lol) steps. Hopefully someone else finds this useful:

//1. open the zipFile:
File z = folder.getChildFile("someZipFile.extension");
ScopedPointer <FileInputStream> fz(z.createInputStream());
if( fz ) {
    ZipFile zip(fz, false);
    struct UnzippedFile {
        String fileName;
        MemoryBlock mb;
        juce::Time fileTime;
    };
    std::vector<UnzippedFile> existingFiles = {};
    //2. extract the zipped files into memoryBlocks.  they'll be recompressed and readded later
    for( int i = 0; i < zip.getNumEntries(); ++i ) {
         UnzippedFile umf;
         {
              umf.fileName = zip.getEntry(i)->filename; //use this with ZipFile::Builder::addEntry( InputStream, compressionLevel, storedPathName, fileModificationTime );
              umf.fileTime = zip.getEntry(i)->fileTime;
              ScopedPointer<InputStream> is = zip.createStreamForEntry(i);
              MemoryOutputStream mos(umf.mb, false);
              mos.writeFromInputStream(*is, zip.getEntry(i)->uncompressedSize);
         }
         existingFiles.push_back(umf);
    }
    //3. now add the new file
    UnzippedFile curFile;
    {
        curFile.fileName = theFileYoureAddingToTheArchive.getFileName();
        curFile.fileTime = Time::getCurrentTime();
        MemoryOutputStream mos(curFile.mb, false);
        theFileYoureAddingToTheArchive.writeTo(mos);
        existingFiles.push_back(curFile);
    }
    //4. dump all of the existingMidiFiles back into a zip file
    ZipFile::Builder zipBuilder;
    for( int i = 0; i < existingFiles.size(); ++i ) {
        zipBuilder.addEntry(new MemoryInputStream(existingFiles[i].mb, false),
                            9,
                            existingFiles[i].fileName,
                            existingFiles[i].fileTime);
    }
    //5. write the zipfile to disk, replacing the old zipFile
    z.deleteFile();
    z.create();
    MemoryOutputStream zippedMos;
    zipBuilder.writeToStream(zippedMos, nullptr);
    z.replaceWithData(zippedMos.getData(), zippedMos.getDataSize() );
}

#13

Why have the vector of intermediate objects? You could just add all the items directly to the zip builder and lose half this code, couldn’t you?


#14

er, Over on that Google style guide (https://google.github.io/styleguide/cppguide.html) they say to

Optimize for the reader, not the writer

I’m not as smart as you, so this is what I came up with that worked and is pretty easy to understand for laymen like me.

But by all means, use this as a teaching moment and show the more optimized solution!!


#15

I completely agree about optimising for the reader, but in general the same things that make code “good” also make it more readable - i.e. decomposing it into the right functions, giving things the right names, avoiding excessive clutter or repetition.

ok… here’s how I’d approach it. I’ve not tried it, but it looks pretty simple:

void addZipEntry (const File& zipFile, const File& fileToAdd)
{
    TemporaryFile temp (zipFile);

    {
        ZipFile zip (zipFile);
        ZipFile::Builder builder;

        for (int i = 0; i < zip.getNumEntries(); ++i)
        {
            auto& entry = *zip.getEntry (i);
            builder.addEntry (zip.createStreamForEntry (entry), 9, entry.filename, entry.fileTime);
        }

        builder.addFile (fileToAdd, 9);

        {
            FileOutputStream fo (temp.getFile());
            builder.writeToStream (fo, nullptr);
        }
    }

    temp.overwriteTargetFileWithTemporary();
}

#16

…BTW the problems you were having were probably to do with the ZipFile keeping an input stream open on the original file, which would have made it impossible to overwrite. If you pay attention to minimising your variable scope then that kind of thing becomes more obvious and avoidable.


#17

This would be a nice addition to the ZipFile class


#18

No, I disagree - I think this is far too specific a use-case to justify being in the class itself. And it doesn’t need to be in the class, you can do it with a free function perfectly well like this.


#19

The ZipFile class seems incomplete without the ability to append to existing zipFiles. The File class has methods for appending data to an existing file. This is simply an extension of that concept applied to Zip Files…