macOS Round Trip Latency

I’ve mentioned this a few times before but I’ve now managed to implement some code to greatly improve the latency reporting on macOS.

After testing the reported and actual latencies for many devices and systems, I’ve found two problems with the latencies reported on macOS:

  1. The actual latency is way higher than the reported latencies from the device
  2. The actual latency changes every time a device property changes (e.g. buffer size or sample rate)

I’ve drafted up some code using the input/now/output timestamps provided by CoreAudio which you can see here: CoreAudio: Changed latency reporting to update during audio callbac… · Tracktion/JUCE@d6f747b · GitHub

This seems to get the latency reported to within ~1ms of the measured latency for most devices.

This approach has the slight downside that the input and output latencies will jitter a bit, but in my experience only by a few samples, and the total is always the same. As you usually use the input + output latencies combined, this doesn’t really matter.
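To make the idea concrete for anyone who hasn’t opened the commit, here’s a minimal sketch (not the actual diff; names and structure are illustrative) of reading the three callback timestamps inside an AudioDeviceIOProc:

    // Minimal sketch, assuming the standard CoreAudio headers: the IOProc receives a
    // "now" timestamp plus the timestamps of the input and output buffers, and the
    // sample-time differences give per-callback latency estimates.
    #include <CoreAudio/AudioHardware.h>

    static OSStatus ioProc (AudioObjectID /*device*/,
                            const AudioTimeStamp* inNow,
                            const AudioBufferList* /*inInputData*/,
                            const AudioTimeStamp* inInputTime,
                            AudioBufferList* /*outOutputData*/,
                            const AudioTimeStamp* inOutputTime,
                            void* /*clientData*/)
    {
        if ((inNow->mFlags & inInputTime->mFlags & inOutputTime->mFlags & kAudioTimeStampSampleTimeValid) != 0)
        {
            // How long ago the input block was captured, and how far in the future
            // the output block will actually reach the hardware.
            const double inputLatencySamples  = inNow->mSampleTime        - inInputTime->mSampleTime;
            const double outputLatencySamples = inOutputTime->mSampleTime - inNow->mSampleTime;

            // A real implementation would publish these (e.g. atomically) so the device
            // object can report them, adding the fixed device/stream/safety-offset
            // latencies on top; hence the small per-callback jitter mentioned above.
            (void) inputLatencySamples;
            (void) outputLatencySamples;
        }

        return noErr;
    }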


What are the JUCE team’s thoughts on this? Is this likely to make it into JUCE, or can they think of a better approach?

7 Likes

See this too:

Yeah, it’s definitely a problem. I don’t think just adding the buffer size improves things much though, as that won’t account for the changes in latency when changing settings etc.
Using the timestamps from the CoreAudio callbacks seems to be the only way to get anywhere near the actual measured latency.

FYI, I’m the author of RTL Utility - so we’ve used that to make measurements and can confirm that adding the buffer size fixed things for the listed interfaces across a range of settings. I haven’t received feedback from my user base to contradict this (though I wouldn’t consider that conclusive!).

2 Likes

I would just like to second what andrewj is saying, in that we also found that the main issue with latency reporting for CoreAudio devices was that the buffer size was not included in the reported latency.

I’m not familiar with what the timestamps you are using represent, so I don’t feel like I can really comment on your approach and whether it is valid ATM. My concerns, though, would be:

  • The latency doesn’t actually change after configuring and starting the device so why does it need updating?
  • How do clients get notified of the change in latency?
  • Does it include/represent all the latency components, such as systemic device I/O latency?

An example where we found the current latency reporting quite inaccurate was the UAD-2 Apollo QUAD. For input latency, CoreAudio reports 292 samples for kAudioStreamPropertyLatency, which corresponds to the systemic input latency of the device (e.g. AD conversion/DSP etc.) and is constant irrespective of sample rate or buffer size. The output latency reported (kAudioStreamPropertyLatency) for the device is always 22 samples. These values are from memory (AFAIR), but they are reasonably close. The kAudioDevicePropertySafetyOffset for the device was ~52 samples for both input and output, and it also didn’t vary based on sample rate or buffer size.

So if we assume, based on what andrewj and I have observed, that the latency should be based on these properties (a rough query sketch follows the examples below):

  1. kAudioDevicePropertyLatency
  2. kAudioDevicePropertySafetyOffset
  3. kAudioStreamPropertyLatency
  4. kAudioDevicePropertyBufferFrameSize

Then at 64 samples:

Currently reported latency:

  • Input : 292(1) + 52(2) = 344
  • Output: 22(1) + 52(2) = 74
  • Total = 418 samples

Using all of the above latency components:

  • Input: 292(1) + 52(2) + 0(3) + 64(4) = 408
  • Output: 22(1) + 52(2) + 0(3) + 64(4) = 138
  • Total = 546 samples

At low buffer sizes the difference between these two values may not be too noticeable, but at 1024 samples:

Currently reported latency:

  • Input: 292(1) + 52(2) = 344
  • Output: 22(1) + 52(2) = 74
  • Total = 418 samples

Using all of the above latency components:

  • Input: 292(1) + 52(2) + 0(3) + 1024(4) = 1368
  • Output: 22(1) + 52(2) + 0(3) + 1024(4) = 1098
  • Total = 2466 samples

Which is quite a discrepancy.
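For reference, here’s a rough sketch (illustrative helper name, not JUCE code) of how those four components map onto CoreAudio property queries:

    // Reads a UInt32 CoreAudio property, returning 0 on failure.
    // (kAudioObjectPropertyElementMain is kAudioObjectPropertyElementMaster on older SDKs.)
    #include <CoreAudio/AudioHardware.h>

    static UInt32 readUInt32Property (AudioObjectID object,
                                      AudioObjectPropertySelector selector,
                                      AudioObjectPropertyScope scope)
    {
        AudioObjectPropertyAddress pa { selector, scope, kAudioObjectPropertyElementMain };
        UInt32 value = 0;
        UInt32 size = sizeof (value);

        if (AudioObjectGetPropertyData (object, &pa, 0, nullptr, &size, &value) != noErr)
            return 0;

        return value;
    }

    // Per direction (scope = kAudioObjectPropertyScopeInput or kAudioObjectPropertyScopeOutput):
    //   (1) readUInt32Property (deviceID, kAudioDevicePropertyLatency,         scope)
    // + (2) readUInt32Property (deviceID, kAudioDevicePropertySafetyOffset,    scope)
    // + (3) kAudioStreamPropertyLatency of the first AudioStream in that scope (see code later in the thread)
    // + (4) readUInt32Property (deviceID, kAudioDevicePropertyBufferFrameSize, kAudioObjectPropertyScopeWildcard)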

So in response to:

  1. The actual latency is way higher than the reported latencies from the device.

I think the devices are reporting their latency correctly (well… the devices I tried, anyway). Perhaps there is/was a misunderstanding as to what kAudioDevicePropertyLatency represented when the latency calculation was implemented.

  2. The actual latency changes every time a device property changes (e.g. buffer size or sample rate)

Yes, it typically should, but as you can see from my example and from what I’ve seen, it doesn’t, because the current CoreAudio latency calculation doesn’t include all the components of the latency.

1 Like

Just running into this myself. I’ve been patching my JUCE fork based on timbyr’s post, and I’ll just mention for anyone reading that kAudioStreamPropertyLatency should be queried against the device’s AudioStream(s), not the device itself. The kAudioStreamPropertyLatency selector has the same value as kAudioDevicePropertyLatency, so it will appear to work, but you’ll just be getting the device latency again. You can find an impl for getting the correct value in andrewj’s post here:
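To illustrate the pitfall (a rough sketch; streamLatencyAddress and streamID are hypothetical names):

    // The two selectors share the same four-char code, so pointing the "stream latency"
    // query at the device object compiles and succeeds but just re-reads the device latency.
    #include <CoreAudio/AudioHardware.h>

    static_assert (kAudioStreamPropertyLatency == kAudioDevicePropertyLatency,
                   "same selector value - the object you query against is what matters");

    // Wrong: queried on the device, this is just kAudioDevicePropertyLatency again.
    //   AudioObjectGetPropertyData (deviceID, &streamLatencyAddress, 0, nullptr, &size, &latency);
    // Right: query an AudioStreamID obtained via kAudioDevicePropertyStreams.
    //   AudioObjectGetPropertyData (streamID, &streamLatencyAddress, 0, nullptr, &size, &latency);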

1 Like

I just reread that part of my post above and realised I meant kAudioDevicePropertyLatency rather than kAudioStreamPropertyLatency in that section. I think I got it correct later in the examples, though.

1 Like

I know this is an old topic, but it’s not been addressed, and I think there are some resolutions mentioned in this thread:

As @fr810 is back around for a bit and is the expert in all things timing, any chance that you could add the check for kAudioDevicePropertyLatency?

I think the idea is that the latency detector demo in the examples dir should return 0 (or very close to it) as the device should be capable of reporting its entire latency (and we shouldn’t need to run round-trip detectors to add any additional latency).

Thanks in advance :pray:

5 Likes

Yes please!

And whilst digging around in there, any such distinguished expert could perhaps consider this too?

So I tried to add up all the latencies together, but the latency detector demo still doesn’t return 0 (or close to it) with this change. Am I doing something wrong?

int getLatencyFromDevice (AudioObjectPropertyScope scope) const
{
    // Buffer frame size
    auto bufferSizeProperty = static_cast<UInt32> (bufferSize);
    UInt32 size = sizeof (bufferSizeProperty);
    AudioObjectPropertyAddress pa;
    pa.mElement = juceAudioObjectPropertyElementMain;
    pa.mSelector = kAudioDevicePropertyBufferFrameSize;
    pa.mScope = kAudioObjectPropertyScopeWildcard;
    AudioObjectGetPropertyData (deviceID, &pa, 0, nullptr, &size, &bufferSizeProperty);

    // Device latency
    UInt32 deviceLatency = 0;
    size = sizeof (deviceLatency);
    pa.mSelector = kAudioDevicePropertyLatency;
    pa.mScope = scope;
    AudioObjectGetPropertyData (deviceID, &pa, 0, nullptr, &size, &deviceLatency);

    // Safety offset
    UInt32 safetyOffset = 0;
    size = sizeof (safetyOffset);
    pa.mSelector = kAudioDevicePropertySafetyOffset;
    AudioObjectGetPropertyData (deviceID, &pa, 0, nullptr, &size, &safetyOffset);

    // Stream latency (queried on the device's first AudioStream)
    UInt32 streamLatency = 0;
    size = 0;
    pa.mSelector = kAudioDevicePropertyStreams;
    AudioObjectGetPropertyDataSize (deviceID, &pa, 0, nullptr, &size);

    if (size >= sizeof (AudioStreamID))
    {
        HeapBlock<AudioStreamID> streamIDs (size / sizeof (AudioStreamID));
        AudioObjectGetPropertyData (deviceID, &pa, 0, nullptr, &size, streamIDs);

        // get the latency of the first stream
        size = sizeof (streamLatency);
        pa.mSelector = kAudioStreamPropertyLatency;
        AudioObjectGetPropertyData (streamIDs[0], &pa, 0, nullptr, &size, &streamLatency);
    }

    return (int) (deviceLatency + safetyOffset + bufferSizeProperty + streamLatency);
}

That looks right, Fabian, but see my note at the bottom of the post.

Here’s the code I’ve been using - it looks like it is equivalent:

    int getLatencyFromDevice (AudioObjectPropertyScope scope) const
    {
        UInt32 deviceLatency = 0;
        UInt32 size = sizeof (deviceLatency);
        AudioObjectPropertyAddress pa;
        pa.mElement = juceAudioObjectPropertyElementMain;
        pa.mSelector = kAudioDevicePropertyLatency;
        pa.mScope = scope;
        AudioObjectGetPropertyData (deviceID, &pa, 0, nullptr, &size, &deviceLatency);

        UInt32 safetyOffset = 0;
        size = sizeof (safetyOffset);
        pa.mSelector = kAudioDevicePropertySafetyOffset;
        AudioObjectGetPropertyData (deviceID, &pa, 0, nullptr, &size, &safetyOffset);

        // Query stream latency (AudioObjectGetPropertyDataSize returns the size in bytes)
        UInt32 streamLatency = 0;
        UInt32 streamListSize = 0;
        pa.mSelector = kAudioDevicePropertyStreams;
        if (OK (AudioObjectGetPropertyDataSize (deviceID, &pa, 0, nullptr, &streamListSize)))
        {
            const auto numStreams = streamListSize / sizeof (AudioStreamID);
            HeapBlock<AudioStreamID> streams (numStreams);
            size = streamListSize;
            if (numStreams > 0 && OK (AudioObjectGetPropertyData (deviceID, &pa, 0, nullptr, &size, streams)))
            {
                pa.mSelector = kAudioStreamPropertyLatency;
                size = sizeof (streamLatency);
                // We could check all streams for the device, but it only ever seems to return the stream latency on the first stream
                AudioObjectGetPropertyData (streams[0], &pa, 0, nullptr, &size, &streamLatency);
            }
        }
        
        return (int) (deviceLatency + safetyOffset + streamLatency) + getFrameSizeFromDevice();
    }

:warning: In-built microphone and speakers appear to introduce additional latency

In recent years, the actual latency through the in-built audio devices on macOS won’t match what is reported (at least on the machine I have access to). I suspect there might be some additional latency due to ambient noise reduction. I believe some Macs allow you to disable this in the sound control panel; however, mine doesn’t, so I can’t test it :frowning:

Importantly, the latency reported for external audio devices does match measurements made with RTL Utility. I have tested a Steinberg UR22 myself and have confirmation from users of other devices (notably from RME).

1 Like

I haven’t tested this in a while, so things may have changed on the JUCE side or with macOS… but AFAIR, when using the in-built devices, they are treated by JUCE as separate devices and the AudioIODeviceCombiner class is instantiated. This results in additional buffering, and the round-trip latency is inaccurate because that buffering isn’t taken into account. I do recall, though, that if you aggregate the in-built audio I/O via the OS, it is considered a single device by JUCE and the round-trip latency was pretty accurate (with similar modifications to JUCE as described above).

I think there is also a way to programmatically request macOS to aggregate devices, which, if implemented, may make the AudioIODeviceCombiner redundant in some (all?) scenarios, but that is a separate issue I guess.
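For anyone curious, something along these lines (a rough, untested sketch with made-up names and no error handling) is roughly how an aggregate could be requested programmatically via AudioHardwareCreateAggregateDevice:

    // Sketch: build the description dictionary for an aggregate made of two device UIDs,
    // then ask CoreAudio to create it. Destroy later with AudioHardwareDestroyAggregateDevice().
    #include <CoreAudio/AudioHardware.h>
    #include <CoreFoundation/CoreFoundation.h>

    static AudioObjectID createAggregate (CFStringRef inputDeviceUID, CFStringRef outputDeviceUID)
    {
        // Sub-device list: one dictionary per device, keyed by its UID
        CFMutableArrayRef subDevices = CFArrayCreateMutable (nullptr, 2, &kCFTypeArrayCallBacks);
        CFStringRef uids[] = { inputDeviceUID, outputDeviceUID };

        for (auto uid : uids)
        {
            CFMutableDictionaryRef sub = CFDictionaryCreateMutable (nullptr, 0,
                                                                    &kCFTypeDictionaryKeyCallBacks,
                                                                    &kCFTypeDictionaryValueCallBacks);
            CFDictionarySetValue (sub, CFSTR (kAudioSubDeviceUIDKey), uid);
            CFArrayAppendValue (subDevices, sub);
            CFRelease (sub);
        }

        // Top-level description of the aggregate device
        CFMutableDictionaryRef desc = CFDictionaryCreateMutable (nullptr, 0,
                                                                 &kCFTypeDictionaryKeyCallBacks,
                                                                 &kCFTypeDictionaryValueCallBacks);
        CFDictionarySetValue (desc, CFSTR (kAudioAggregateDeviceNameKey), CFSTR ("My Aggregate"));
        CFDictionarySetValue (desc, CFSTR (kAudioAggregateDeviceUIDKey),  CFSTR ("com.example.my-aggregate"));
        CFDictionarySetValue (desc, CFSTR (kAudioAggregateDeviceSubDeviceListKey), subDevices);
        CFRelease (subDevices);

        AudioObjectID aggregateID = kAudioObjectUnknown;
        AudioHardwareCreateAggregateDevice (desc, &aggregateID);
        CFRelease (desc);

        return aggregateID;
    }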

1 Like

Aha, thanks for pointing that out, Tim! I just created an aggregate device and saw that the difference between the reported and measured latency reduced from a couple of thousand samples to several hundred samples. So it helps, but doesn’t entirely account for the difference.

The Sound Radix repo has been using this approach -

Yet, with newer devices we’ve seen Apple reporting incorrect results and have reported this to Apple. No update since then.

2 Likes

AFAIR, when using the in-built devices, they are treated by JUCE as separate devices and the AudioIODeviceCombiner class is instantiated. This results in additional buffering, and the round-trip latency is inaccurate because that buffering isn’t taken into account.

Yes, that’s exactly right. I did a bit more investigation. Unfortunately, for separate input/output hardware, the latency of the AudioIODeviceCombiner will not be constant [1]. This is because the AudioIODeviceCombiner does not look at the audio callback timestamps, without which the exact relative start times of the two devices cannot be calculated.

I’ll put up my above change for code review so that the latency is at least correct when the AudioIODeviceCombiner is not involved, and then discuss with the team whether fixing the AudioIODeviceCombiner once and for all or programmatically creating an Apple aggregate device is the right approach here.

[1] This can easily be reproduced by measuring the latency of the desired devices with the latency demo, switching to another audio device and back again and then re-measuring the latency. The latency changes each time.

Sounds like a good plan, might as well get it right for users with dedicated audio devices. Thanks Fabian!

I’ll put up my above change for code review so that the latency is at least correct when the AudioIODeviceCombiner is not involved

I compared our changes to yours above and I think the only difference is some additional error checking on our side and perhaps some stylistic stuff.

If this code review is a public process, please point me at it and I’d be happy to give my feedback; otherwise, what I think you’ve posted will address the main issue and is a welcome change. Thanks.

Can I double check if the recent changes here are intended to fix the latency reporting?

With those changes I’m still getting very large latencies detected via loopback even when using a dedicated device (i.e. not invoking the AudioIODeviceCombiner).
For example, using my Mackie Onyx Artist 1-2, the reported latency seems to be off by about twice the buffer size.

Anybody else seeing the JUCE Latency Tester reporting nothing close to 0ms?

Hi @dave96. The actual fix for the latency reporting is still in code review. Sorry! The change you are referencing is just a code clean-up. But also see this commit which fixed a typo in the commit you referenced.

Ok thanks. Yes, I’ve got the tip, just wanted to link to the larger commit to show what I was talking about.

Is the fix large or is there any chance I could have a patch of it to test locally with the devices I have whilst you review it?