iOS Block Size bug + `static` vs `dynamic` cast

Hey JUCE team! I have two questions about call/interrupt behaviour in the iOS code.

So we’re using tracktion on iOS and we get pretty weird bad crashes when receiving calls and involving backgrounding.

  1. We sometimes get some crashes in this function, or more often further down in tracktion code. I’m wondering if this should be dynamically cast since it’s trying to infer a type? I know dynamic cast is more expensive, but would this offer a more robust check?

  2. Also, and I think more to the point, when we have crashes it’s often because the block size that iOS/JUCE prepares Tracktion with is smaller than the buffer it then goes on to pass it. It seems to prepare it with 283 samples, but then goes on to pass it 1024.

Stack trace of this incident here: - Google Drive

static void dispatchAudioUnitPropertyChange (void* data, AudioUnit unit, AudioUnitPropertyID propertyID,
                                             AudioUnitScope scope, AudioUnitElement element)
    // static_cast<Pimpl*> (data)->handleAudioUnitPropertyChange (unit, propertyID, scope, element); // before
    dynamic_cast<Pimpl*> (data)->handleAudioUnitPropertyChange (unit, propertyID, scope, element); // after

I don’t think there’s much value in that. The first argument to an AudioUnitPropertyListenerProc is documented to be the same as the pointer provided during property listener registration (AudioUnitAddPropertyListener). It looks like whenever we register listeners, we always pass a pointer to a Pimpl instance. I’m also not sure whether dynamic_cast would help - it’s supposed to aid casting between types that are related by inheritance, but a void isn’t related by inheritance to the Pimpl.

The crashes may be because there’s a missing AudioUnitRemovePropertyListener... call somewhere, which causes the Pimpl instance to be read after it is freed. Testing with Address Sanitizer might help you to work out whether this is the case.

Unfortunately, I think this is a long-standing issue in certain iOS versions. Here you can see one technique for working around this problem, from the StandalonePluginHolder. Perhaps you could use a similar technique to limit block sizes before they reach Tracktion.

@dave96 do you think tracktion should have built-in guards against this kind of iOS bug, as opposed to throwing an assert if this is a common thing? I’m 99% think this is the source of most of our tracktion crashes (at least, crashes that tracktion catches!). With our new refactor I’ve now got a 100% repro of this crash when hanging up calls. @reuk ASan doesn’t show up anything interesting when doing this.

We already do that:

I’m not sure how you could get buffer sizes that are larger than prepared for with that in place?

@dave got a repro again now. 100%.

Made a short video with the stack trace + values. I don’t think Node::initialise is being called in the right way?
Looks like Node::audioBlockSize is not being updated?

Right… but see my above snipped in DeviceManager, what are numSamples and maxBlockSize.
You need to go higher up the stack to the DeviceManager to see if that handling of larger-than-prepared-block-sizes is working correctly.
If not, that might be where the problem is.

Or is the larger block size being introduced somewhere else in the call stack? That could be the other cause.

Do you have a way to reproduce this with the simulator and one of the demos? I.e. some exact steps would be the only way I can fix this, otherwise I’m just guessing.

Yeah in that snippet they are the same, we enter in the usual case.

Right, so if you follow the call stack up, at what point is something asking for the 1024 number of samples. This will be where the bug is.

But you don’t get that far in your video and I don’t have a repo case.

I’ve been thinking about this. As far as I can tell there are three possible scenarios. You’ll need to debug to tell which one is correct:

  1. The DeviceManager code is correct but there’s a bug further down the player/node stack that is getting the buffer size wrong and calling process with a buffer too large
  2. The prepared buffer size is “small”, but iOS occasionally calls a “larger” buffer and the code in te::DeviceManager isn’t correctly chunking this
  3. There’s a problem with the release/prepare calls not correctly creating a graph with the correct size. To check this I’d log calls to te::DeviceManager prepare/release to ensure they’re mapped and have the buffer sizes you’d expect. You can then check the Node’s have been prepared with the size passed to prepare

Without having a repo case I can’t really dig in to this any more. You’d need to pinpoint it a bit more given the above.

Hey Dave this is great info, thanks so much :slight_smile: will dig and get back to you :slight_smile:

@dave96 this seems to solve the issue for my repro. Seems like rebuidlWaveDevices helps the block sizes get updated in the right way. Is this a viable fix for us to put into prod, or should we still dig for more answers? Maybe only do this if the broadcaster is the deviceManager? Maybe this gives a hint as to a root cause?

diff --git a/modules/tracktion_engine/playback/tracktion_DeviceManager.cpp b/modules/tracktion_engine/playback/tracktion_DeviceManager.cpp
index b64e85e..82c8fe4 100644
--- a/modules/tracktion_engine/playback/tracktion_DeviceManager.cpp
+++ b/modules/tracktion_engine/playback/tracktion_DeviceManager.cpp
@@ -418,14 +418,13 @@ void DeviceManager::changeListenerCallback (ChangeBroadcaster*)

-    if (! rebuildWaveDeviceListIfNeeded())
-    {
-        // force all plugins to be restarted, to cope with changes in rate + buffer size
-        const juce::ScopedLock sl (contextLock);
+    rebuildWaveDeviceList();

-        for (auto c : activeContexts)
-            c->edit.restartPlayback();
-    }
+    // force all plugins to be restarted, to cope with changes in rate + buffer size
+    const juce::ScopedLock sl (contextLock);
+    for (auto c : activeContexts)
+        c->edit.restartPlayback();

I’m not sure that’s the right area… That change listener callback will be async so could be out of sync with the playback engine. I would have thought that the code in DeviceManager::audioDeviceAboutToStart would correctly take care of that and must be called if the sample rate or block size changes?

Looking at it, I’m not sure there should be anything in DeviceManager::changeListenerCallback. If you remove all the code in there does the problem go away?

Hmmmm. It does seem to crash there a lot less, but I can still hit it. For whatever reason the fix I have hasn’t crashed in my testing (yet!). But if I remove all this code, I can hit the crash when I start recording and then make an outgoing call (not sure if connected to the cause). With the patch I showed above no crashes yet!

I don’t understand the cause of the issue enough to know why your fix improves things. It will certainly come with a performance hit and logically as I’ve suggested, I think the code in there should be removed entirely…

I’m guessing that when you make a call the device changes? Maybe to some mono input/output and the AudioDeviceManager isn’t getting stopped/started synchronously?

yeah that sounds pretty plausible! Our changes should be going out this week so should have some decent stats by Monday :slight_smile: