Best practices for using te::TransportControl::ReallocationInhibitor

Hey everyone, hope you’re all doing well!

I’m currently trying to work out the best practices when it comes to using the te::TransportControl::ReallocationInhibitor. \

I’m currently trying to tackle crashes which are occuring on the audio thread when the message thread is tearing down the plugin graph during playback. The crash will typically occur in some kind of Node::process type function, and the message thread will have just destoyed the Node that was playing.

The most common place this happens is when calling AudioClip::rescale, during playback, but can also happen when adjusting the master bus volume.

I’m currently using these functions in our engine play() and stop() respectively:

  /**
   * This function should be called whenever we want to declare that the audio engine is currently playing, and we want
   * to prevent tearing down and rebuilding the audio graph. This will prevent the `Node::process` flavour crashes.
   */
  void disableTransportAllocation() {
    if (!reallocationInhibitor) {
      reallocationInhibitor = std::make_unique<te::TransportControl::ReallocationInhibitor>(edit->getTransport());
    }
  }

  /**
   * This function should be called whenever the engine is stopping playback, and it is now safe to rebuild the plugin
   * graph.
   */
  void enableTransportAllocation() {
    // delete the reallocation inhibitor
    reallocationInhibitor.reset();
  }

  std::unique_ptr<te::TransportControl::ReallocationInhibitor> reallocationInhibitor;

TLDR; how do you use te::TransportControl::ReallocationInhibitor ?

3 Likes

Apologies for the delayed reply, I’ve been on holiday for the last two weeks.

This shouldn’t happen. You should effectively be able to make any edits and the audio graph should just work so it sounds like another bug that needs fixing.
If you can post a stack trace that would help me narrow down what it is.

Regarding the actual question, we tend to use ReallocationInhibitor when lots of changes are about to happen and we don’t want the effects to be audible until the end of the operation. The classic example of this is whilst dragging an audio/MIDI clip. You don’t really want to hear the audio skip around during the drag, only updating once you’ve lifted up the mouse.

Hey Dave, all good! I hope you had some good rest and relaxation! :smiley::palm_tree:

Here’s a bunch of our stack traces for a few crashes in tracktion code.
MaykCrashesForDave.zip (50.9 KB)

They tend to look a bit like this:

Crashed: AURemoteIO::IOThread
0  MaykAudioEngine                 0x378338 tracktion_engine::PluginNode::process(tracktion_graph::Node::ProcessContext&) + 624
1  MaykAudioEngine                 0x1d89e0 tracktion_graph::Node::process(juce::Range<long long>) + 468
2  MaykAudioEngine                 0x4a00c4 tracktion_graph::LockFreeMultiThreadedNodePlayer::process(tracktion_graph::Node::ProcessContext const&) + 420
3  MaykAudioEngine                 0x7363cc tracktion_engine::EditPlaybackContext::NodePlaybackContext::process(float**, int, int) + 356
4  MaykAudioEngine                 0x735e84 tracktion_engine::EditPlaybackContext::fillNextNodeBlock(float**, int, int) + 880
5  MaykAudioEngine                 0x7355d0 tracktion_engine::DeviceManager::audioDeviceIOCallbackInternal(float const**, int, float**, int, int) + 744
6  MaykAudioEngine                 0x733494 juce::AudioDeviceManager::audioDeviceIOCallbackInt(float const**, int, float**, int, int) + 372
7  MaykAudioEngine                 0x730b78 juce::iOSAudioIODevice::Pimpl::process(unsigned int*, AudioTimeStamp const*, unsigned int, AudioBufferList*) + 504
8  libEmbeddedSystemAUs.dylib      0xe6fc AUConverterBase::RenderBus(unsigned int&, AudioTimeStamp const&, unsigned int, unsigned int) + 792
9  libEmbeddedSystemAUs.dylib      0x96c08 AURemoteIO::RenderBus(unsigned int&, AudioTimeStamp const&, unsigned int, unsigned int) + 148
10 libEmbeddedSystemAUs.dylib      0xaf7d4 AUBase::DoRender(unsigned int&, AudioTimeStamp const&, unsigned int, unsigned int, AudioBufferList&) + 1484
11 libEmbeddedSystemAUs.dylib      0xa07e0 AURemoteIO::PerformIO(unsigned int, unsigned int, unsigned int, AudioTimeStamp const&, AudioTimeStamp const&, AudioBufferList const*, AudioBufferList*, int&) + 936
12 libEmbeddedSystemAUs.dylib      0xe8408 _XPerformIO + 324
13 libAudioToolboxUtility.dylib    0x1202c mshMIGPerform + 268
14 libAudioToolboxUtility.dylib    0x1244c MSHMIGDispatchMessage + 40
15 libEmbeddedSystemAUs.dylib      0x96640 void* caulk::thread_proxy<std::__1::tuple<caulk::thread::attributes, AURemoteIO::IOThread::IOThread(AURemoteIO&, caulk::thread::attributes const&, caulk::mach::os_workgroup const&)::'lambda'(), std::__1::tuple<> > >(void*) + 600
16 libsystem_pthread.dylib         0x1cb0 _pthread_start + 320
17 libsystem_pthread.dylib         0xa778 thread_start + 8

This is all typically on the audio thread. Where the message thread was useful, I’ve included that there too. Normally when you can catch the message thread in action it’s reallocating the graph because of some change.

Please let me know if I can provide any more helpful information for you!

Jamie

Before I dig in to this, can you test with the tip of develop please?
I’ve merged the changes that use the new LockFreeObject in LockFreeMultiThreadedNodePlayer which I think should fix this.

1 Like

Hi @dave96, we’ve just gone live with the lastest JUCE/Tracktion updates and a subset of users are still getting the crashes like I showed above. If you could investigate this then I would be super grateful! Is there anything I can do to help you with this?

Ok I’ll take another look.
Are you able to replicate this yourself? If so, can you give me a stack trace of when it happens now (it’s likely to be different to before) and make sure you include both the processing thread where the crash happens and the message thread so I can see what it’s doing at that time please?

Unfortunately, I don’t have a good repro! I’ve sometimes been able to hit this by making lots and lots of clips and constantly rescaling them (for about 5 mins just fuzzing the rescale slider)

They are all pretty similar to the last ones I posted, but I’ll get some fresh ones that happened today/yesterday so you can be fully up to date! :slight_smile:

Hey Dave here’s some fresh stack traces that we’re hitting in tracktion engine :slight_smile:
MaykCrashesForDave2.zip (32.7 KB)

I can’t see any message thread in those logs… Is the BackgroundEngineThread what’s being used as the message thread? If so, are you sure juce is configured to use it as the message thread?

My fear is that there are some things that should only happen on the message thread and not be concurrent with other Engine operations. However, if your “UI” messages are coming in on a different thread (the BackgroundEngineThread) to the juce MessageThread, these will race against each other and cause memory corruptions.

Could this be what’s happening? On what thread do timer callbacks happen in the Engine e.g. put a breakpoint in Edit::timerCallback() and see what thread it’s called on.

1 Like

@dave96 yes that’s right, sorry. I should have made that clear the BackgroundEngineThread is our message thread. I will check that out, does that mean that Edit::timeCallback() should be guarded by a TRACKTION_ASSERT_MESSAGE_THREAD?

Every feature I can try is only calls updates from the BackgroundEngineThread, it won’t trigger any of these asserts I’ve put in.

Is it possible we’re not instanciating the message thread in the right way?

No, because by definition juce timer callbacks will only be called from unblocked message threads.

What I’m asking is if you put a breakpoint in the timerCallback, when it’s hit what thread is the calling thread. Is it BackgroundEngineThread or something else?

Yup. BackgroundEngineThread.

The reason that we were using that pattern is because rendering was blocking the iOS UI thread which was preventing the rendering progress bar update. Currently experimenting with moving the message thread onto the main thread and that seems to work fine. Do you expect this instead is going to be more stable? Do you have any suggestions to fix the rendering progress update issue?

Thanks again :slight_smile:

Jamie

Ok, and when you make changes to the Engine model from your UI, what thread does that happen on. If that’s different from BackgroundEngineThread you will definitely get problems. Generally, the Engine message thread (i.e. the juce message thread) needs to be the same as the main app one.

If that’s not the case in your app I’m surprised you didn’t hit assertions but perhaps we only added those to places where it seemed like people would use them from background threads.


Your rendering issue sounds orthogonal, how are you doing the render as the idea is that this happens on a background thread leaving your message thread free to update progress bars etc. simply reading the progress from an atomic.

Ok great, thanks Dave we’ve solved the rendering issue on the main thread and are moving the message thread back to the main thread. We’ll roll this out this week and feed back on how that goes :slight_smile:

Ok great. Hopefully that fixed the problems you were seeing.

Hey Dave! It was great to see you at ADC this week! Your TSan recommendation was a great one since it’s turned up problems exactly where we’ve been hitting issues :tada:

It seems that there’s a race condition in the juce::CachedValue<bool> at te::Plugin::enabled. Do you think that there’s something that we might be doing wrong here?

Would be great to get your feedback on this one.

Best,

Jamie

Edit: looks like we might need to implement some kind of thread-safe reader from the cached value like isk mentioned in this thread? Thread Safe CachedValue<> for Reading on other threads

Yes, looks like that could be the problem. Would you mind adding this to the to of the Plugin header and wrapping the juce::CachedValue<boo> enabled like this juce::CachedValue<AtomicWrapper<bool>> enabled.

/**
    Wraps an atomic with an interface compatible with var so it can be used
    within CachedValues in a thread safe way.
    Optionally supply a Constrainer to limit the value in some way.
*/
template<typename Type, typename Constrainer = DummyConstrainer<Type>>
struct AtomicWrapper
{
    AtomicWrapper() = default;

    template<typename OtherType>
    AtomicWrapper (const OtherType& other)
    {
        value.store (Constrainer::constrain (other));
    }

    AtomicWrapper (const AtomicWrapper& other)
    {
        value.store (other.value);
    }

    AtomicWrapper& operator= (const AtomicWrapper& other) noexcept
    {
        value.store (other.value);
        return *this;
    }

    bool operator== (const AtomicWrapper& other) const noexcept
    {
        return value.load() == other.value.load();
    }

    bool operator!= (const AtomicWrapper& other) const noexcept
    {
        return value.load() != other.value.load();
    }

    operator var() const noexcept
    {
        return Constrainer::constrain (value.load());
    }

    operator Type() const noexcept
    {
        return Constrainer::constrain (value.load());
    }

    std::atomic<Type> value { Type() };
};

Then re-run your test and see if that helps. If it does, I’ll add that to the engine.

Hi again Dave. This pattern does seem to be resolving the TSan issues I said about before, but I’ve noticed another one which looks much more likely to be the cause of our crashes.

void TransportControl::freePlaybackContext()
{
    playbackContext.reset();  <= 'Write' happens here...
    clearPlayingFlags();
    transportState->playbackContextAllocation = std::max (0, transportState->playbackContextAllocation - 1);
}
  // In some of our code...
  if (auto epc = transport.getCurrentPlaybackContext()) <= An offending 'read' happends here...
    info.isPlaying = epc->isPlaying();                  <= I think our crash happens here.

Here’s the stack trace of the read:

I’m pretty sure that our crash is happening when our code asks if the context is nullptr and it says ‘no’, but is in the process of being deleted. Not sure the best approach here since this snippet is being used on the audio thread. I’ll try and resolve the rest of the TSan issues and I’ll send over a PR with what I have for your feedback :slight_smile:

Yes, that looks much more like it could be the issue. The EditPlaybackContext is not thread-safe.

You shouldn’t need to do if (auto epc = transport.getCurrentPlaybackContext()) to find out if the playhead is playing though, there’s PluginRenderContext::isPlaying for that.

If you’re getting more info from it, take a look at ExternalPlugin::PluginPlayHead to see how that works.

1 Like