prepareToPlay and processBlock thread-safety

So I’ve used the ThreadSanitizer (https://github.com/google/sanitizers) to find threading issues in my audio plugin, and noticed that it reports a data race for variables modified in prepareToPlay and read in processBlock.
This seems to be caused by prepareToPlay being called from a thread other than the audio thread.

So do I have to synchronize access to variables between prepareToPlay and processBlock, or do DAWs only call processBlock after prepareToPlay has returned? I assume yes, otherwise the ThreadSanitizer wouldn’t complain, I guess.

Also, is the thread that prepareToPlay is called on the Message Thread?

Unfortunately there are no guarantees here.

For “normal” use prepareToPlay is called on the message thread, then processBlock is called on the audio thread. If you are doing non-realtime rendering then both could be called on a different non-audio thread or the message thread.

Well behaved hosts won’t call both prepareToPlay and processBlock on different threads simultaneously, but there are a handful of badly behaved hosts out there. Usually when any of the JUCE team makes a statement like this we get a handful of people saying “I’ve developed plug-ins for X years and we’ve never seen a problem…” so if you’re not doing anything unusual then you’re unlikely to run into trouble. Edge cases do exist though:

1 Like

Thanks, so for now I’ll probably just look away from the race condition there and hope nothing breaks. I’m doing the same with the AudioProcessorValueTreeState thread-safety issues anyway ¯\_(ツ)_/¯

For the record.
I think I am seeing Nuendo (DOP and live) instances call prepareToPlay while procesBlock is running. So it calls it once, then audio processing starts, then it calls it again a few buffer later. So there’s a good chance I’m gonna set things in prepareToPlay while they are being used in processBlock().

prepareToPlay() is our only opportunity to set up filters, prepare them and reset() after setting the filter order (as is required). So this race condition is basically unavoidable unless we add a mutex/lock used by these 2 callbacks.

That’s working but we did just stick a lock in the audio callback. In 99.9% of cases prepareToPlay will never cause the audio thread to lock… and if it does it’s because Steinberg are being naughty. Right?

I’m trying to convince myself this is all OK.

Are you seeing those calls on separate threads (on the same instance)? The host is allowed to call prepareToPlay() between blocks, but calling them from different threads on the same instance would break/crash most plugins. I can’t imagine a popular host doing that.

prepareToPlay on thread: 0x106604580
processBlock on thread: 0x176b5f000
processBlock on thread: 0x2ad73b000
processBlock on thread: 0x2ad73b000
processBlock on thread: 0x2af177000
processBlock on thread: 0x2af177000
prepareToPlay on thread: 0x106604580
processBlock on thread: 0x2b2fdb000
processBlock on thread: 0x2b2fdb000
processBlock on thread: 0x2b37c3000

Can you provide a complete call stack of the second prepateToPlay call? And is this with VST3 or VST2?

We’re also had strange reports of one of our products mysteriously crashing in DOP under Nuendo.

This is with VST3. and the race is random (I was able to reproduce this only with release builds with rare reproducibility)

I think the only guarantee you should expect is, that prepare, process and release are never called concurrently. I also wonder what you are doing exactly so that it matters that prepareToPlay is sometimes called on an audio thread and sometimes called on an arbitrator background thread etc. As long as it doesn’t interfere with processBlock, you won’t have race conditions with your audio code.

… although apparently not even that is the case

Yes, sorry for the confusing text in my previous message.
My code (or any plugin code that I know of for that matter) doesn’t care which thread prepare/process are called in.

But it does rely on the fact they aren’t called at the same time because prepareToPlay() is traditionally resizing the same buffers used in processBlock(), and a host that tries to do those at the same time is to me fundamentally broken.

2 Likes

In my case it was Nuendo 12, Mac OS 12.5, Apple Silicon. VST3
I definitely had prepareToPlay called concurrently with processBlock, on different threads. After adding lock_guards the problem resolved.

I was getting a bad_access crash deep inside the filter class when processBlock() was using filters while (I presume) prepareToPlay() was messing with them. In prepareToPlay we set filter coefficients, prepare() and reset() the filters (among other things). It has to happen in prepareToPlay() as reset must be called on filters after changing filter order., and samplerate is not known until that time.

Your design is correct in my opinion, and similar to most commercial plugins I’ve been involved in.

However std::lock_guard involves a system call that could cause other issues, and I think you should avoid it. I would instead quite urgently report to Steinberg that their host is broken, and will probably crash most plugins out there.

As a temporary solution you might want to use something like a spinlock that doesn’t involve system calls. But TBH that “temporary solution” should probably be implemented by every plugin manufacturer so a preferable solution would be if the DAW is fixed, if indeed it’s buggy as you describe.

If Nuendo would make this call while a plug-in is processing then it would crash with many plug-ins, which it does not do. Maybe JUCE is calling prepareToPlay in a situation not initiated by the host? That’s why I ask for a call stack!

2 Likes

We had specific user getting those randomly. It got worsen with N13 and using Apple silicon which I suspect results different concurrency that makes it more extreme.

@AScheffler - we had only one report so far and it’s only with DOP. No real time.

DOP is less common to be used. And used for specific processes mostly by post production.
Add the fact it’s not always crashing and making it rare.

Since we only saw it on DOP, I only apply locking when isNonRealTime returns true.
It passed our QA and we’re gonna send it to our user to see if it makes any difference.

@ttg , I’m definitely seeing it in a realtime instance. But so far I haven’t seen more than 2 calls to prepareToPlay(). Once before processing, and again after a few buffers (varies)… but never after that.

Now I’m wondering if the declared latency value has a role in this. I’ll experiment tomorrow.

1 Like

OK: setLatencySamples() in preapreToPlay() is the cause.
With setLatency(): Host calls prepareToPlay(), in which latency is set, which causes the VST3 to restart in a ComponentRestarter::handleAsyncUpdate().
I’m guessing this causes prepareToPlay a second time while the audio side has already started callbacks.

I see here that componentRestarter originally disallowed restarting during prepareToPlay… but a fix was made to allow it.
I wonder if it was never noticed that this race condition was the reason it was originally disallowed.

But I’m not sure how else we are supposed to report latency. We must do it in samples, and the sample rate is not known until prepareToPlay…
Storing it and setting latency in the first audio callback can’t be safe/reliable… I’d think most hosts would want to know the latency value before delivering the first buffer.

FTR:
Here’s what happens in a DOP instance where it’s easy to get an overlap in the callbacks (though I saw it in a realtime instance as well). The timing varies massively.

prepareToPlay on thread: 0x106704580
prepareToPlay ENDS
processBlock on thread: 0x16e047000
prepareToPlay on thread: 0x106704580
processBlock ENDS
processBlock on thread: 0x16e047000
processBlock ENDS
processBlock on thread: 0x16e047000
processBlock ENDS

… Many more processBlock() calls

processBlock on thread: 0x16e047000
processBlock ENDS
prepareToPlay ENDS

Call stack for prepareToPlay is the same every time:

|#0|0x000000017cdd27d4 in MYPLUGINAudioProcessor::prepareToPlay(double, int) at /Users/Me/MYPLUGIN/Source/PluginProcessor.cpp:544|
|---|---|
|#1|0x000000017ccd54cc in juce::JuceVST3Component::preparePlugin(double, int, juce::JuceVST3Component::CallPrepareToPlay) at /Users/Me/MYPLUGIN/_SDKs/juce/modules/juce_audio_plugin_client/juce_audio_plugin_client_VST3.cpp:3750|
|#2|0x000000017ccc11c8 in juce::JuceVST3Component::setActive(unsigned char) at /Users/Me/MYPLUGIN/_SDKs/juce/modules/juce_audio_plugin_client/juce_audio_plugin_client_VST3.cpp:2593|
|#3|0x0000000101b89000 in ___lldb_unnamed_symbol94968$$Nuendo 12 ()|
1 Like

Good finding.

It was changed a few years ago because of some hosts,

It’s the chicken and the egg for those who want latency described by wall-clock.

I also assume it’s more severe with DOP since,
For traditional real-time plug-in, you set the latency once. later calls will be ignored…
if (latencySamples != newLatency)

It’s less of a problem. but with anything that re-creates the instance (DOP…) it’s easier to reproduce.

Yeah, but we do need to base it on wall-clock if we want it to sound the same every sample rate.

Even with a fixed-length latency (in samples), is it safe to setLatency in the pluginProcessor constructor? That would trigger a prepareToPlay BEFORE the normal prepareToPlay…

Yes, so thankfully there’s no feedback loop, but you still get the second prepareToPlay() which is the one that clashes with processBlock().

I think I’ll have to settle on a minimally invasive spin lock. Is that what you’re doing Tal?

You shouldn’t call a virtual function in the constructor anyway. In constructors virtual calls aren’t virtual, because the object isn’t fully constructed.

See isocpp.org: strange-inheritance#calling-virtuals-from-ctors

3 Likes

So far, we’ve provided our user with a lock only for nonRealTime where we saw the actual crash. (only in release builds btw due to timing I guess?)

Looking at the JUCE VST3 wrapper
We call the setLatency, that calls all listeners (still same thread) ending up on the VST wrapper audioProcessorChanged. So far all is legit. But then it calls async to:

    void restart (int32 newFlags)
    {
        if (newFlags == 0)
            return;

        flags.fetch_or (newFlags);

        if (MessageManager::getInstance()->isThisTheMessageThread())
            handleAsyncUpdate();
        else
            triggerAsyncUpdate();
    }

The VST3 SDK suggests this is correct:

Instructs host to restart the component. This must be called in the UI-Thread context!

I’d ask Steinberg about this. Because we need to use Vst::kLatencyChanged and it must be called from UI-Thread context. so where can we guarantee the processing. (unless we end up with locks on the calls :frowning: )