Gzip strings/valuetree issue


#1

I’m trying to zip and de-zip a string/valuetree.

// zipping ..
MemoryOutputStream mos;
GZIPCompressorOutputStream gcos (mos);
treeOriginal.writeToStream (gcos); // valuetree
gcos.flush();

// zipped string
auto resultZipped = String::createStringFromData (mos.getData(), mos.getDataSize());

// de-zipping 
auto treeDezipped = ValueTree::readFromGZIPData (resultZipped.toRawUTF8(), resultZipped.getNumBytesAsUTF8());

// this should not be triggered if the zipping and de-zipping were done correctly.
jassert (treeDezipped.toXmlString() == treeOriginal.toXmlString());

The assert is triggered, but I don’t know what I am doing wrong.

Many thanks to anyone who can help me!


#2

Search the codebase for “GZIPTests” - we have a unit test that does this kind of thing which you could copy


#3

Thank you Jules, I just checked it out.
I think the problem with my code is the conversion from raw memory to String back to raw memory. The GZIP test is about zipping and dezipping raw memory with no Strings involved.


#4

I figured out a solution.
GetState: I created a MemoryBlock to which the MemoryOutputStream writes, then I use the MemoryBlock::toBase64Encoding() method to get a valid string from it.

SetState: I create a MemoryBlock. I fill the MemoryBlock using MemoryBlock::fromBase64Encoding() method and then I use ValueTree::readFromGZIPData to read from the MemoryBlock and set the state.


#5

It seems kind of silly to use a wasteful encoding like base64 if what you’re trying to do is to reduce the size of the data! Why not just put UTF8 into the memory block?


#6

Ok, so: mos.toUTF8() // MemoryOutputStream
But reading it back does not work:
juce::ValueTree::readFromGZIPData (state.toUTF8(), mis.getNumBytesAsUTF8()) // state = string that was generated by mos.toUTF8()


#7

Try passing it the pointer and size from the same object, not from two different things?


#8

Still not working
auto utf8 = state.toUTF8();
ValueTree::readFromGZIPData (utf8, utf8.sizeInBytes());

I didn’t mention this before, but the state has to be saved to the system clipboard.
https://docs.juce.com/master/classSystemClipboard.html#ab0efb785d53db6f2986950d591313ba5
For some reason, when I store the utf8 string in the clipboard, the clipboard becomes empty.


#9

Why would you be passing UTF8 to readFromGZIPData? UTF8 is a string, not gzipped data.


#10

This is what I try to do:
ValueTree -> zipping -> String (saved in clipboard, preset or host state) -> dezipping -> ValueTree.
The String step is where something goes wrong.
The String can be a UTF-8 string generated from the MemoryOutputStream, but how can I get back from the String to the ValueTree?


#11

A binary blob like the gzipped version contains data that you cannot represent in a String, not even in a UTF-8 string. Non-printable characters will screw things up regardless.
Your idea to put it in base64 encoding, so that it only uses printable characters, makes sense in this respect, but like Jules said, base64 increases the size dramatically, probably by more than you saved by zipping in the first place.
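To make the point about non-printable bytes concrete, here is a minimal standalone sketch in plain C++ (no JUCE; the name looksLikeUtf8 is made up for illustration). It only checks the lead/continuation byte structure of UTF-8, and it is not a claim about how juce::String decodes data internally. It does show that the two magic bytes at the start of every gzip stream already fail that check:

```cpp
#include <cstddef>

// Structural UTF-8 check only: verifies lead and continuation byte patterns.
// Overlong encodings and surrogate code points are NOT rejected here.
constexpr bool looksLikeUtf8 (const unsigned char* data, std::size_t size)
{
    std::size_t i = 0;

    while (i < size)
    {
        const unsigned char b = data[i];
        std::size_t extra = 0;                     // continuation bytes expected after b

        if      (b < 0x80)           extra = 0;    // 0xxxxxxx: ASCII
        else if ((b & 0xE0) == 0xC0) extra = 1;    // 110xxxxx
        else if ((b & 0xF0) == 0xE0) extra = 2;    // 1110xxxx
        else if ((b & 0xF8) == 0xF0) extra = 3;    // 11110xxx
        else return false;                         // stray continuation byte or invalid lead

        if (extra > size - i - 1)
            return false;                          // sequence is cut off at the end

        for (std::size_t k = 1; k <= extra; ++k)
            if ((data[i + k] & 0xC0) != 0x80)
                return false;                      // expected 10xxxxxx

        i += extra + 1;
    }

    return true;
}

// Every gzip stream starts with the magic bytes 0x1f 0x8b (RFC 1952).
// 0x8b has the bit pattern of a continuation byte but no lead byte before it.
constexpr unsigned char gzipMagic[] = { 0x1f, 0x8b };
constexpr unsigned char plainText[] = { 'f', 'o', 'o' };
```

Here looksLikeUtf8 (gzipMagic, 2) is false while looksLikeUtf8 (plainText, 3) is true, so pushing gzipped bytes through a text type and back cannot be expected to return the same bytes.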

Clipboards in general support different MIME types (that’s why you can copy and paste images from application to application, if both support that), but in JUCE only text is implemented.

Maybe you need to see whether that can be extended?
Also check what the size limit for clipboards is.

Good luck


#12

Thanks Daniel.
I tried to use base64 encoding without zipping, but that resulted in a very large string (about 25 times longer than with zipping).
So, I decided to use both zipping and the base64 encoding.
Anyway, thanks for your help Daniel and Jules.


#13

The most obvious option IMHO would be to use toXmlString(). It has some overhead compared to the binary version, but less than wrapping it in base64 encoding.

The overhead of base64 over binary is 33% according to Wikipedia.
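As a rough check of that figure, here is a small standalone sketch (plain C++, no JUCE; base64Length is a hypothetical helper name). It assumes standard padded base64 as described in RFC 4648. JUCE's MemoryBlock::toBase64Encoding() may use a slightly different format, so treat the numbers as approximate:

```cpp
#include <cstddef>

// Length of standard (RFC 4648) base64 text with '=' padding for numBytes of input:
// every group of up to 3 input bytes becomes 4 output characters.
constexpr std::size_t base64Length (std::size_t numBytes)
{
    return 4 * ((numBytes + 2) / 3);
}
```

For example, base64Length (3000) is 4000, i.e. one third more than the input. With made-up numbers: if gzip shrinks a 30000-byte state to 1200 bytes, base64Length (1200) is 1600 characters, which is still far smaller than encoding the uncompressed state.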


#14

You mean storing the string in a memory block without zipping instead of using the tree.writeToStream method as in my first post?


#15

No, a human readable version of a ValueTree is XML, which can safely be stored in a String, as opposed to the binary version.

ValueTree state ("foo");
{
    // createXml() returns a std::unique_ptr<XmlElement> in current JUCE versions
    auto xml = state.createXml();
    if (xml != nullptr)
        SystemClipboard::copyTextToClipboard (xml->toString());
}
// ...
auto text = SystemClipboard::getTextFromClipboard();
ValueTree other;
{
    // XmlDocument::parse() returns a std::unique_ptr<XmlElement>, or nullptr if parsing fails
    auto xml = XmlDocument::parse (text);
    if (xml != nullptr)
        other = ValueTree::fromXml (*xml);
}

…all untested, but I hope you get the idea…


#16

I understand, but the state is about 500 lines long.
I don’t want to put that on the clipboard.


#17

I understand, but the clipboard is designed for text here. I just thought I’d mention the option of staying in the textual domain.
But it seems like you have to either extend the clipboard to accept binary data or find another way than the clipboard to export and import your data.


#18

Yes, thank you for your advice.
Using both zipping and base64 is the solution I am going for.