Calling ZipFile::createStreamForEntry() multiple times


#1

I’m using a ZipFile with a MemoryInputStream providing the ZIP data. The first time I call createStreamForEntry() for a given index, it works correctly. When I call it a second time for the same index, it doesn’t work.

ZipInputStream re-uses the ZipFile's inputStream (which is documented in the API for createStreamForEntry()).

But the stream isn’t reset when calling createStreamForEntry(), so you can’t call it multiple times for one entry. Would it be possible to have it reset?
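To make the request concrete, here is a minimal standard-library sketch of a sub-stream that repositions a shared source before each read. It is not JUCE's implementation and says nothing about what ZipInputStream does internally: `EntryView`, its members and `readAll()` are hypothetical names, and a `std::istringstream` would stand in for the MemoryInputStream.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical model of an entry stream that borrows a shared source stream.
struct EntryView
{
    std::istream& source;  // shared with every other view
    std::streamoff offset; // where this entry starts in the source
    std::size_t length;    // how many bytes belong to this entry

    // Reads the whole entry, seeking first so that the position left
    // behind by other readers of the shared stream doesn't matter.
    std::string readAll() const
    {
        source.clear();                       // drop eof/fail flags from earlier reads
        source.seekg (offset, std::ios::beg); // reposition to this entry's start
        std::string data (length, '\0');
        source.read (&data[0], static_cast<std::streamsize> (length));
        data.resize (static_cast<std::size_t> (source.gcount()));
        return data;
    }
};
```

With two views over the same source, for example `EntryView first { src, 0, 5 }` and `EntryView second { src, 5, 5 }`, each view can be read repeatedly and in any order, because every read starts by seeking to the view's own offset.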


#2

What do you mean by “doesn’t work” - is createStreamForEntry() returning nullptr? Can you reproduce it with a simple test app?


#3

Oh, I should have been clearer. createStreamForEntry(int) returns a valid InputStream instance, but when I call readEntireStreamAsString() on it, I get an empty string. That only happens on the second and later calls to createStreamForEntry(int) for a given entry index; the first call for that index works.

When I call setPosition(0) on the original MemoryInputStream (that I passed to the ZipFile constructor) every time before calling createStreamForEntry(int), it works.

I’ve tried to create an example, but I’m afraid I haven’t yet found a way to reproduce it with different data… It seems to depend on the contents of the ZIP file.


#4

I might have misunderstood, but are you talking about calling readEntireStreamAsString() twice on the same stream, like this:

ZipFile zip (file);
std::unique_ptr<InputStream> stream (zip.createStreamForEntry (0));
auto content = stream->readEntireStreamAsString();
auto contentCopy = stream->readEntireStreamAsString();

In that case it would be expected behaviour. The InputStream has a read position, which stays at the end after the first read. You have to rewind manually, since the API cannot know whether you want to read from the start or continue after an earlier partial read…
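The same read-position behaviour can be seen with a plain standard-library stream. This is only an analogy, not JUCE code: `readToEnd` and `rewindToStart` are hypothetical helpers, with `std::istream` standing in for InputStream and `seekg (0)` playing the role of setPosition (0).

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: reads from the current position to the end of the stream.
std::string readToEnd (std::istream& in)
{
    return std::string (std::istreambuf_iterator<char> (in),
                        std::istreambuf_iterator<char>());
}

// Hypothetical helper: moves the read position back to the start.
void rewindToStart (std::istream& in)
{
    in.clear();   // clear any eof/fail flags set by earlier reads
    in.seekg (0); // seek back to the first byte
}
```

Calling `readToEnd` twice in a row on a `std::istringstream` returns the full contents and then an empty string; calling `rewindToStart` in between makes the second read return the full contents again.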

But I have the feeling that’s not what you meant…


#5

No, I’m calling readEntireStreamAsString only once after every call to createStreamForEntry (see code below).

The stream I’m referring to is the one I used to initialize a ZipFile object. I’m using the ZipFile constructor that takes an InputStream*. If I don’t reset that InputStream by hand in-between calls to createStreamForEntry, I get the unexpected behaviour (an empty string from readEntireStreamAsString).

I call it this way:

MemoryBlock block(BinaryData::Foo_zip, BinaryData::Foo_zipSize);
auto zipFile = make_unique<ZipFile>(new MemoryInputStream(block, false), true);


// And then, in a different scope, this all succeeds:
for (int i = 0; i < zipFile->getNumEntries(); ++i) {
    // Create stream
    const unique_ptr<InputStream> xmlStream(zipFile->createStreamForEntry(i));
    jassert(xmlStream);

    // Parse XML
    const unique_ptr<XmlElement> xml(XmlDocument::parse(xmlStream->readEntireStreamAsString()));
    jassert(xml);

    jassert(checkSomething(zipFile->getEntry(i)->filename));
}


// Later on, different scope:
for (int i = 0; i < zipFile->getNumEntries(); ++i) {
    if (checkSomething(zipFile->getEntry(i)->filename)) {
        // Create stream
        const unique_ptr<InputStream> xmlStream(zipFile->createStreamForEntry(i));

        // This succeeds:
        jassert(xmlStream);

        const auto xmlString = xmlStream->readEntireStreamAsString();

        // This fails:
        jassert(xmlString.isNotEmpty());
    }
}

As mentioned earlier, whether it fails seems to depend on the ZIP file’s contents. Thank you for taking the time to look into this. I appreciate it!


#6

So I’ve got this:

ZipFile zip (new MemoryInputStream (BinaryData::ZIPTests_zip, BinaryData::ZIPTests_zipSize, false), true);

for (int j = 0; j < 100; ++j)
{
    for (int i = 0; i < zip.getNumEntries(); ++i)
    {
        std::unique_ptr<InputStream> stream (zip.createStreamForEntry (i));
        
        if (stream.get() != nullptr)
            DBG (stream->readEntireStreamAsString());
    }
}

Where ZIPTests.zip is just 5 .txt files:

ZIPTests.zip (2.6 KB)

This works as expected and the output is consistent. readEntireStreamAsString() does print some empty strings, but those come from hidden files added by macOS.

It’s a pretty trivial example though and perhaps your issue depends on the complexity of the .zip. Are you able to send me over a .zip file which reliably reproduces your issue? Feel free to PM me if you don’t want to post it publicly.


#7

Thanks again for looking into this… but unfortunately I can’t seem to get a ZIP file that reproduces this any more. It was occurring randomly, and only for some entries in the ZipFile.