InterprocessConnection - Corrupt ValueTree over Network - Bug?

Dear all,

I have a problem with the InterprocessConnection class: sending a large ValueTree (300 KB) over the network results in corrupted data.

I have two applications, a server and a client, and the server sends the project file (a ValueTree) to the client:

void Server::sendFullProject()
{
    juce::MemoryOutputStream m;
    m.writeShort(XMCommand::requestFullSync);
    getProjectData().writeToStream(m);
    sendMessage(m.getMemoryBlock());
}

and the client receives the data and tries to rebuild the ValueTree:

void Client::messageReceived(const juce::MemoryBlock &message)
{  
    juce::MemoryInputStream data(message, false);
    ...
    juce::ValueTree projectData = juce::ValueTree::readFromStream(data);
}

When I run both applications on the same machine, everything works fine!
But when the client and server run on different machines, readFromStream always hits assertions indicating that it is trying to read corrupt data: sometimes a name is empty, or a compressed int suddenly has 30 bytes, etc.

As far as I remember, this used to work over the network without any problem, but with JUCE 7.0.9 I now run into this issue.

Looking at the data sent, I found that the version received over the network contains a lot of additional zero bytes. Maybe this shifts something and the reader gets confused?

(left side local version, right side remote version over network)
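
For anyone who wants to make the same comparison without eyeballing two dumps, here is a minimal sketch in plain C++ (no JUCE) that reports the first offset at which the sent and received bytes differ. The function name firstMismatch is hypothetical, and it assumes you have already captured both buffers, for example by dumping them to files on each machine.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not part of JUCE): returns the offset of the first byte
// that differs between the two buffers, or std::nullopt if they are identical.
// If one buffer is a strict prefix of the other, the shorter length is returned.
std::optional<std::size_t> firstMismatch (const std::vector<std::uint8_t>& sent,
                                          const std::vector<std::uint8_t>& received)
{
    const auto common = std::min (sent.size(), received.size());
    const auto commonEnd = sent.begin() + static_cast<std::ptrdiff_t> (common);

    // Compare only the range both buffers have in common.
    const auto result = std::mismatch (sent.begin(), commonEnd, received.begin());
    const auto index = static_cast<std::size_t> (result.first - sent.begin());

    if (index == common && sent.size() == received.size())
        return std::nullopt;

    return index;
}
```

Knowing the first bad offset makes it easier to see whether the corruption starts at a suspicious position, such as a multiple of some buffer size.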

Hope someone has an idea or can point me in the right direction!

We did recently identify an issue in which sending large amounts of data may result in bytes being dropped.

As a first step, could you pull the latest version of develop to see if these issues are resolved there? The relevant commit is…

If the above commit doesn’t resolve your issue, please let us know. There is also a way to increase the buffer size for a StreamingSocket instance, which is what the InterprocessConnection and InterprocessConnectionServer classes use internally. We haven’t yet exposed those options on those classes, but that’s something we could look into if it turns out to be required.

Dear Anthony,

thanks for the quick reply!

I only had time for a quick check and the results are a bit weird: it worked once out of 10 tries.
I started both apps 10 times. On one connection everything was fine, but in the other 9 cases the problem remained.

Hopefully I will have more time to investigate later, but thanks again!

What sizes are we talking about here? Any insights or the possibility to add asserts?

Unfortunately I don’t think it’s as simple as saying size x. The report we got involved ~250 OSC messages, and the problem only appeared when the payload of the OSC messages was more than 32 bytes. That report was also using a UDP connection (InterprocessConnection uses TCP).

I didn’t work on this directly, but as I understand it, some internal OS buffers weren’t adequately sized, so data was dropped (it was only reported and reproduced on macOS; Windows seemed to be fine). I strongly suspect it’s highly dependent on the connection too.

When investigating further we found that our implementation was setting the send and receive buffers in the sockets to a size of 65,536 bytes. Further testing revealed that the macOS default (on the versions we looked at) was actually closer to 700,000 bytes. As a result we now default to the maximum value between the OS default and 65,536. This ensures we’re not decreasing the default size for anyone accidentally but in most cases the default size of the buffers will increase.

We also added a SocketOptions struct for specifying the specific receive and send buffer sizes so that if anyone were to hit this in future one option would be to increase the buffer sizes, or at least specify a minimum buffer size (which might be appropriate here).

Unfortunately, it seems we missed adding these options to the InterprocessConnection classes, so right now it’s not possible to set them there. If it turns out that specifying these options is useful in this case, we’ll certainly look to add them to the interface ASAP.

As for assertions I’ll check in with the team but I suspect there isn’t an obvious place to assert.

Sorry for my late reply, I had other projects coming in.

Just letting you know, in JUCE 7.0.12 it seems to be fixed :>
thanks a lot!


Sorry to bother you again, but it looks like the issue is not fully fixed yet (using JUCE 8.0.0).

When I connect both my machines with Ethernet cables, it works fine. If I do it over WiFi (in the same network), the old problem with corrupted data comes back. But when I connect them over WiFi through my home router (using a Windows machine as the server, though), it works fine as well. So it seems there are some “network conditions” (bandwidth?) that make this happen.

EDIT:

For now I have circumvented the problem by splitting the project up into smaller (100k) pieces and sending them one after another, which seems to work.
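
In case it helps anyone else, the splitting boils down to something like the sketch below. It is written in plain C++ rather than with juce::MemoryBlock so it stands on its own; the helper names splitIntoChunks and joinChunks are made up for this example, and a real version would also need some header telling the receiver how many pieces to expect.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical helper: cuts the payload into pieces of at most chunkSize bytes.
// The last piece may be shorter. Returns no pieces if chunkSize is zero.
std::vector<Bytes> splitIntoChunks (const Bytes& payload, std::size_t chunkSize)
{
    std::vector<Bytes> chunks;

    if (chunkSize == 0)
        return chunks;

    for (std::size_t offset = 0; offset < payload.size(); offset += chunkSize)
    {
        const auto length = std::min (chunkSize, payload.size() - offset);
        const auto first = payload.begin() + static_cast<std::ptrdiff_t> (offset);
        chunks.emplace_back (first, first + static_cast<std::ptrdiff_t> (length));
    }

    return chunks;
}

// Hypothetical helper for the receiving side: appends the pieces in the
// order they arrived to rebuild the original payload.
Bytes joinChunks (const std::vector<Bytes>& chunks)
{
    Bytes payload;

    for (const auto& chunk : chunks)
        payload.insert (payload.end(), chunk.begin(), chunk.end());

    return payload;
}
```

Each piece would then go out as its own message, and the ValueTree is only read from the stream once all pieces have been joined again.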

I have found a bug that may be related, and I noticed a curious pattern in it that might help solve the issue. So far I have tested this on macOS, with both JUCE 7.0.8 and 8.0.5.

I am using the JUCE interprocess classes, though not over a network. I have a custom plugin scanner that is similar to the one found in AudioPluginHost to scan plugins out-of-process. So, my main app has a ChildProcessCoordinator subclass, and it launches a separate app with a ChildProcessWorker subclass.

Like in AudioPluginHost, the data I want to pass from the worker app to the main app is an XmlElement. So, on the ChildProcessWorker end, that XmlElement gets turned into a String, and then into a MemoryBlock. That MemoryBlock is sent using ChildProcessWorker::sendMessageToCoordinator. In the main app, the MemoryBlock gets turned back into a String, and from there it is parsed into an XmlElement.

This works great most of the time. However, there is one plugin file that results in a larger data set when scanned (the one file contains 21 different plugins). The MemoryBlock for that one is about 200 KB in size. What happens is that the MemoryBlock::toString() call can fail, asserting from the CharPointer_UTF8::isValidString call in String::fromUTF8. For this one plugin, it fails in this way roughly half the time, and the rest of the time it works fine. Hmmm!

In order to figure out what was failing, I customized the isValidString check and had it log exactly which byte it was failing on. One surprise is that the errant byte does not have a random value. Every time, it has a value of -81 as a signed char, or 0xAF in hex.
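
For reference, the logging check was along the lines of the sketch below. This is a simplified stand-alone version in plain C++, not JUCE's actual isValidString code, and the name firstInvalidUtf8Byte is made up here. It only checks the structure (lead byte followed by the right number of continuation bytes) and does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: returns the offset of the first byte that breaks the
// UTF-8 structure, or std::nullopt if the whole buffer looks structurally valid.
std::optional<std::size_t> firstInvalidUtf8Byte (const std::string& bytes)
{
    std::size_t i = 0;

    while (i < bytes.size())
    {
        const auto lead = static_cast<unsigned char> (bytes[i]);
        std::size_t extra = 0;

        if      (lead < 0x80)            extra = 0; // plain ASCII
        else if ((lead & 0xE0) == 0xC0)  extra = 1; // 110xxxxx
        else if ((lead & 0xF0) == 0xE0)  extra = 2; // 1110xxxx
        else if ((lead & 0xF8) == 0xF0)  extra = 3; // 11110xxx
        else                             return i;  // stray continuation byte (such as 0xAF) or invalid lead

        for (std::size_t k = 1; k <= extra; ++k)
        {
            if (i + k >= bytes.size())
                return i; // sequence is cut off at the end of the buffer

            const auto next = static_cast<unsigned char> (bytes[i + k]);

            if ((next & 0xC0) != 0x80)
                return i + k; // expected a 10xxxxxx continuation byte
        }

        i += extra + 1;
    }

    return std::nullopt;
}
```

Note that 0xAF has the bit pattern 10101111, so on its own it looks like a continuation byte with no lead byte in front of it, which is why a check like this trips on it.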

But here’s the curious pattern. I also had it log which byte number it was failing on. Here’s a data set over 5 runs:

Byte number 49145 of 199864 is not valid
Byte number 32761 of 199864 is not valid
Byte number 73721 of 199864 is not valid
Byte number 40953 of 199864 is not valid
Byte number 81913 of 199864 is not valid

If you add 7 to each of these numbers, you get a series of:
49152, 32768, 73728, 40960, 81920

And that series has a greatest common divisor of 8192.

In other words, the bad byte always occurs 7 bytes before an 8192-byte boundary!
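
If you want to double-check the arithmetic, here is a small C++17 sketch that does it with the five offsets from the log above. The function names are made up for this example, and it takes the logged byte numbers at face value (I have not adjusted for whether they are 0-based or 1-based).

```cpp
#include <array>
#include <cstdint>
#include <numeric>

// The five failing byte numbers from the log output.
constexpr std::array<std::uint32_t, 5> failingOffsets { 49145, 32761, 73721, 40953, 81913 };

// Greatest common divisor of (offset + 7) over all logged offsets.
std::uint32_t commonBoundary()
{
    std::uint32_t result = 0; // gcd (0, x) == x, so 0 is a neutral start value

    for (const auto offset : failingOffsets)
        result = std::gcd (result, offset + 7);

    return result;
}

// Number of bytes from 'offset' up to the next multiple of 'blockSize'.
std::uint32_t distanceToNextBoundary (std::uint32_t offset, std::uint32_t blockSize)
{
    return (blockSize - offset % blockSize) % blockSize;
}
```

commonBoundary() comes out as 8192, and distanceToNextBoundary (offset, 8192) is 7 for every one of the logged offsets.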

This leads me to wonder if there’s some garbage data being left uninitialized in MemoryBlock, at the end of some intermediate buffer/chunk, when it’s dealing with larger sizes of data. (I do run the isValidString check on the data BEFORE the ChildProcessWorker sends it, and it always passes that. The issue is only found on the receiving end.)

@anthony-nicholls If this seems unrelated to the original topic here, please let me know and I can move it to its own thread.
