If you are receiving blocks of 512 samples at 48k (48 samples per 1 ms), then each callback should happen 10.66667 ms apart. (512 / 48 = 10.66667)
So I was timing the processBlock() calls on the two DAWs I have access to at the moment on the Mac: Digital Performer and Reaper - and also the JUCE AudioPluginHost.
Of course there is some tiny fluctuation between callbacks, so I am accumulating and averaging the times.
In Digital Performer, it indeed averages to 10.66667 and stays there.
In Reaper, it seems to average to 10.6659 (slightly less than expected), such that running the app for a while (e.g. an hour) shows that it has drifted from the wall clock by 200 ms or so. This is the same with JUCE APH.
In the JUCE APH, you can select a buffer size of 480. At 48k, this should result in a time between callbacks of exactly 10.0 ms. But it shows up as 9.9992 ms as shown here in my debug printing:
Yes, I am monitoring it for any change in the buffer size; it is staying at 512 samples consistently, same as for my tests with the JUCE AudioPluginHost.
That sounds like asking for trouble anyway. The DAW can schedule callbacks as it sees fit, there is no real reason why there should be a fixed amount of time between them (except when recording for obvious reasons). The only hard constraint it has is that callbacks should finish before they are supposed to be sent to the DAC.
That may be true, but it doesn't explain why, when it's just sitting there generating a steady stream of same-size buffers, the timing doesn't add up.
The "wall clock" (motherboard clock) and the soundcard clock are physically two different circuits, each with its own crystal. It's normal for them to drift slightly. Another example is MIDI input: its timestamps can also drift relative to the other clocks.
It's the DAW software's job to periodically estimate the drift and make allowances for it.
Do you use the same audio device in all cases? It could be that DP uses a different clock source, which would imply that it has to do some ASRC in order to output on the device that drifts. This would also happen for things like aggregate devices or when input and output device are not the same. In these cases you always have to have some ASRC thingy at work, otherwise there would be glitches when clocks drift apart.
Generally for audio applications, I would always treat the audio sample rate as my "wall clock" for things like sequencing, LFOs and so on. Never rely on now() for anything but benchmarking/debugging.
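For illustration, a minimal sketch of what I mean (the member names are mine, just for the example): count the samples you have processed and derive elapsed time from that count and the sample rate, never from the system clock.

// header (hypothetical members):
int64_t samplesProcessed = 0;
double currentSampleRate = 48000.0; // set this in prepareToPlay()

// in processBlock():
samplesProcessed += buffer.getNumSamples();
auto elapsedSeconds = (double) samplesProcessed / currentSampleRate; // time by the audio clock
// drive LFO phase, sequencer position etc. from samplesProcessed / elapsedSeconds

This stays locked to the audio stream regardless of what the motherboard clock does.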
It has to be understood, and it's quite important: every DAW has its own strategy for how to call processBlock. Is there a dedicated thread for the callback, or is there a thread pool for the callback? Are threads re-used? Are they set up for realtime, or are "normal" threads being used to service another realtime audio thread?
The service time - at a threading level - for these different scenarios can definitely introduce timing discrepancies - compounded, as @JeffMcClintock has alluded, by divergent clocks according to the user's hardware configuration, which adds yet another layer of complexity to the servicing of processBlock callbacks.
This is why it's super important to test your projects on as many DAWs as possible, and accommodate the different architecture decisions made, first by the DAW developers, second by the hardware manufacturers, and third by the user deciding how to wire everything up in their environment. (Oh, and of course the OS vendors, who are a whole different class of architects in each case.)
Do yourself a huge favour, and workbench this: put a breakpoint on prepareToPlay, processBlock, timerCallback (if you have one) and, let's say for grins, setStateInformation. When you hit these breakpoints, look at the state of the threads in the breakpointed context. Repeat this test for every single DAW you are testing on. Doing this, you will come to understand, hopefully, the herculean task that the JUCE devs have accomplished in getting this all, relatively speaking, as tight as possible …
Just to be clear: this should be true on average, but there is absolutely no reason the callbacks have to be evenly spaced, and in many cases they won't be (try Windows with multiple driver types, for example).
Honestly this is the most surprising thing; it suggests to me that DP is using the wall clock rather than the device. I'm sure there is a good reason for it, but I'm surprised.
This is exactly the kind of thing I would expect; it would be very surprising for two clocks to agree so precisely. It's probably not helped by measuring such short intervals, as that may make any errors in the measurements more pronounced. But I definitely wouldn't expect them to agree.
No two clocks will ever exactly agree; it's just a matter of how much they disagree. You just have to pick one and rely on that one only!
While that's true, it should be noted that no matter what strategies a DAW uses, in the long run - assuming somewhere the results are played back in real time - the timing has to converge to the rate of the receiving end, be it an actual device with a crystal, or some ASRC that interfaces between two unsynced rates. This is what we're seeing here: when averaging for some time, it'll converge to a stable rate, no matter any jitter introduced by whatever scheduling magic.
Just so I understand your points completely, can you clarify the following?
Which two clocks are you referring to?
Which short intervals are you referring to? The time between callbacks? To be clear, I was averaging them. The yellow rectangle shows the actual time between each callback, the red rectangle is the average, and the "diff" is the micros/nanos away from the expected rate of 10.000. (This is JUCE APH, 480 buffer size @ 48k):
In any case, it would seem that relying on the wall clock for anything other than debugging would be a mistake?
I think we can agree that 512 samples @ 48k represents a chunk of audio 10.66667 ms in length, right? That's an absolute, right? So if you are doing calculations in the block and basing your assumed time reference off of the number of samples processed, you're going to be correct…?
It might be a numerical issue with using averaging to estimate the time period from noisy data.
Using a median calculation might be more reliable.
Here is an example which shows large "spikes" in the timing information which drag up the average in a misleading manner. The green graph is the running median, which is much more stable and representative of the reality.
OK, I'm willing to check it out. I'm looking at your example but not quite getting it. Let's assume my data is not as noisy as your example. So given my simple averaging by accumulating the time between processBlock() calls, something like this (not exactly, so excuse any errors), how would you add/calculate the median?
// header:
int averageCount = 0;
double averageSum = 0.0;  // initialised so the first pass is well-defined
double prevTopTime = 0.0;

// in processBlock():
if (averageCount == 0) // starting over
{
    averageSum = 0.0;
    prevTopTime = Time::getMillisecondCounterHiRes();
}
else
{
    auto nowTopTime = Time::getMillisecondCounterHiRes();
    auto elapsed = nowTopTime - prevTopTime;    // ms since the previous callback
    prevTopTime = nowTopTime;
    averageSum += elapsed;
    auto averageMs = averageSum / averageCount; // the average ms to be printed
}
averageCount++;
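Would something like this be the idea, building on the code above? Keep the last N intervals and take the middle element of a sorted copy (window size picked arbitrarily):

// header:
#include <vector>
#include <algorithm>

std::vector<double> recentIntervals;        // last N callback-to-callback times
static constexpr size_t medianWindow = 501; // odd, arbitrary window size

// in processBlock(), after computing 'elapsed' as above:
recentIntervals.push_back (elapsed);
if (recentIntervals.size() > medianWindow)
    recentIntervals.erase (recentIntervals.begin()); // drop the oldest

auto sorted = recentIntervals; // copy, so arrival order is preserved
std::nth_element (sorted.begin(), sorted.begin() + sorted.size() / 2, sorted.end());
auto medianMs = sorted[sorted.size() / 2]; // the running median to print
// (note: this allocates - fine for a debug print, not for release audio code)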
So it does mean that for a 1h playback, Reaper and JUCE APH will take 1h + 0.2s to play it (a tiny pitch shift). Interesting. I haven't used Time::getMillisecondCounterHiRes() though.
I think in this case we are probably talking about a clock for the CPU and a clock for an audio device. That being said, there might be more than one clock in your computer, so it might depend on what API you use.
The issue I was thinking of was a small fixed error introduced every time a measurement is taken. I've absolutely no idea how big this error might be, so it may be having no noticeable effect on your results. But I think a larger buffer size would help reduce that sort of error.
Imagine for example you could use a buffer size of 4800 samples but still be looking at the average time for 480 samples to pass. That I think would reduce the fixed error by 10x.
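Something like this, say (just a sketch; the window size is arbitrary): take one timing reading per N callbacks instead of one per callback, so any fixed per-measurement error gets divided by N.

// header:
int callbackCount = 0;
double windowStartMs = 0.0;
static constexpr int windowSize = 10; // callbacks per reading (arbitrary)

// in processBlock():
auto nowMs = Time::getMillisecondCounterHiRes();

if (callbackCount == 0)
    windowStartMs = nowMs; // start of a new measurement window

if (++callbackCount > windowSize) // windowSize intervals have now elapsed
{
    auto perCallbackMs = (nowMs - windowStartMs) / windowSize; // one reading
    callbackCount = 1;     // this callback starts the next window
    windowStartMs = nowMs;
}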
Yes, I think so.
Well, 10.6 recurring, but by whose clock? You can't say which one is actually correct; you can only say they disagree on how long 10.6 recurring ms is. 512 samples @ 48k will on average be processed every 10.6 recurring ms according to one particular clock.
It would be no different if you had two tape measures and they disagreed about how long a metre is. One may well be more right than the other, but in order to determine that, you need a standard reference to refer to.
Assuming you want to be aligned with the audio signal, this is mostly fair to say. However, be warned that it's only true on average. Imagine for example some driver has an internal buffer size of 512 samples and you ask for a buffer size of 128 samples. In this case you would very likely see 4 calls to your process block, each made one immediately after the other, followed by a big gap before the next 4 calls. I think if you are using a class compliant device via CoreAudio you'll see this kind of thing much less; I think Apple are doing the heavy lifting here.
Come to think of it, that might be what DP is trying to do: ensure there are regularly spaced callbacks independent from the actual audio device callbacks.
For a plugin, if you call getPlayHead()->getPosition()->getHostTimeNs() from the process callback, then I think, if it's available, you can get information about the time of the callback, which I believe should be based on the clock from the audio device. However, I've never actually used it myself.
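Untested, but I believe the chained optionals would need unwrapping along these lines:

// in processBlock():
if (auto* playHead = getPlayHead()) // may be nullptr
    if (auto position = playHead->getPosition()) // juce::Optional<PositionInfo>
        if (auto hostTimeNs = position->getHostTimeNs()) // may be unset by the host
        {
            // *hostTimeNs is the host's timestamp for this callback, in nanoseconds
        }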
Basing time off the samples does also have the advantage that it should be correct even if your plugin runs at non-real time, i.e. when a bounce down or an export is occurring.
Maybe. I think in this case we would expect the average to be correct though, unless there are actually any overruns.
Yes and no; every clock (and therefore every audio device) will be ever so slightly off from perfect. I think we can say…
This means for a 1h playback (as measured by the clock in the Audio device), the CPU clock will measure 1h + 0.288s.
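(That figure follows from the 9.9992 ms reading: (10.0 - 9.9992) / 10.0 = 0.00008, and 3600 s × 0.00008 = 0.288 s.)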
I think it's true to say that if you play back audio using any two different clocks, there will always be a small pitch shift between them unless they share the same clock source.
Just a thought, but I think if you measured enough callbacks from enough separate audio devices using the same clock to measure them (the CPU clock for example), preferably all at the same time, then you could probably get an idea of how accurate that clock is. There are a few variables to this, but I think that would be true.
This is my gut feeling also. Small rounding errors caused by time periods being truncated to an integer by the system high resolution timer functions. But I can't prove it.
If so, I still don't understand why two different DAWs report different results. Why does one DAW report the exact number that is expected (10.66667)?