Interval Timing in the Audio Thread

I am looking to analyse how long my audio callback takes to run, and I would like to measure this with std::chrono::high_resolution_clock. I understand this is technically a system call, which is usually not acceptable in a real-time thread, although here it should only be reading a dedicated OS high-resolution clock. The alternative, I assume, is to use OS-specific calls such as QueryPerformanceCounter on Windows; however, that is obviously much more work than using a portable API. Just wondering what other people’s opinions are on the real-time safety of these functions, and whether there are any alternative methods of achieving this?

I think you might find it more productive to integrate melatonin_perfetto in your project, enable it for your Debug builds, and just forget about any other tricky business you might do to convince yourself about your app’s performance.

And while you’re at it, don’t forget to profile your GUI code as well with melatonin_inspector.

Both of these are easy to add to any JUCE project, very easy to configure so that they’re available when you need them (and not available in your Release builds), and will give you a lot of insight into your plugin’s or app’s performance.

Wow, that looks really useful. Thanks! I will definitely take a look into that for profiling our performance.

We’re not just looking at this for debugging and performance profiling though. Our application has multiple real-time audio threads, since it can take multiple audio input types. It’d be really useful for a few reasons to be able to time these audio input callbacks. These timings would feed back into how the application processes audio, hence having to take them in both debug and release builds.

Understood. Well, you can either use the Analytics class directly, or if you just want the facts, juce::Time::getMillisecondCounterHiRes() might be a portable way to do it… don’t forget to factor juce::Time::getHighResolutionTicksPerSecond() into your analysis, too.


There’s nothing wrong with using a high resolution clock for the purposes of profiling your processBlock. I wouldn’t enable it for production builds but for diagnostic purposes you’ll get just fine results.

When people say you shouldn’t use system calls in real-time code, they’re talking about avoiding things like priority inversion or disk flushes or wait states. But for profiling your code just to see how long it takes, you’ll be just fine.
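For anyone who wants to try this, here is a minimal sketch of timing a callback with std::chrono. It assumes std::chrono::steady_clock rather than high_resolution_clock, since the latter is just an alias for steady_clock or system_clock on the common standard library implementations, and steady_clock is the one guaranteed to be monotonic. ScopedCallbackTimer, lastCallbackNanos and processBlockSketch are hypothetical names made up for illustration, not part of JUCE or any other library. The result is published through a std::atomic so that another thread can read it without locking; whether the clock read itself is cheap enough is platform-dependent and worth checking on your targets.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical helper: times the enclosing scope and publishes the elapsed
// nanoseconds without locking or allocating.
class ScopedCallbackTimer
{
public:
    using Clock = std::chrono::steady_clock; // monotonic, suitable for intervals

    explicit ScopedCallbackTimer (std::atomic<std::int64_t>& destinationNanos) noexcept
        : destination (destinationNanos), start (Clock::now()) {}

    ~ScopedCallbackTimer() noexcept
    {
        const auto elapsed = Clock::now() - start;
        const auto nanos = std::chrono::duration_cast<std::chrono::nanoseconds> (elapsed).count();

        // Relaxed ordering is enough here: readers only want a recent value.
        destination.store (static_cast<std::int64_t> (nanos), std::memory_order_relaxed);
    }

private:
    std::atomic<std::int64_t>& destination;
    Clock::time_point start;
};

// Last measured duration; -1 means "no measurement yet".
// Assumption: std::atomic<std::int64_t> is lock-free on the target
// (check lastCallbackNanos.is_lock_free() once at startup).
std::atomic<std::int64_t> lastCallbackNanos { -1 };

// Stand-in for a real audio callback: halves the gain of every sample.
void processBlockSketch (float* samples, int numSamples)
{
    ScopedCallbackTimer timer (lastCallbackNanos);

    for (int i = 0; i < numSamples; ++i)
        samples[i] *= 0.5f;
}
```

A GUI or logging thread can then poll lastCallbackNanos.load() at its own pace, which keeps printing or file writes out of the audio thread entirely.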

Btw, VTune is free these days. With the ITT API you can define tasks (like processBlock) to see, for example, how long the calls are, how much that duration varies, how it interacts with other threads, etc. Perfetto looks good too, just throwing the VTune option out there since I’ve been using it for decades and love it.

Just checking, you are talking about this, correct? GitHub - intel/ittapi: Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) API

If I’m not mistaken, this is only available for Intel, and I think it’s fair to assume most people would like a solution that is at least also compatible with Apple Silicon systems.

@caustik - I for one would love to hear more about your experiences with using VTune. I don’t think anyone is attempting to argue with you - more that, based on the knowledge shared so far in this thread, performance profiling is a very finicky subject, and optimizations for one platform don’t necessarily carry over to another. VTune seems pretty sweet, I’ve encountered it over the years, but the load to set it up has never been viable for my projects - perfetto works just fine, as do liberally applied timer calls.

Also, I did mean “enable for Debugging but not Release”, but you are of course correct that one should profile Release builds rigorously too. I think there is a gradient scale of attention to detail needed - there is no ‘one solution fits all cases’ scenario, imho - early on in algorithm development, liberal Timer calls placed here and there can provide you with necessary insight, but then again once you get to cross-platform/multiple-target builds, having Perfetto at hand to easily measure things can show issues with architecture quite well, also.


JUCE already has a class for that - which is also good since it can measure the average CPU% over multiple runs and account for blocks of different sizes.
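To illustrate what "CPU% that accounts for blocks of different sizes" means in practice, here is a rough standalone sketch of the idea. It is not JUCE's implementation: LoadMeterSketch and all of its members are hypothetical names, and the exponential smoothing factor is an arbitrary assumption. Each block's measured time is divided by the time that block represents at the current sample rate (numSamples / sampleRate), so a 64-sample block and a 1024-sample block are judged against their own deadlines.

```cpp
#include <algorithm>

// Hypothetical load meter, for illustration only. It is not thread-safe:
// if another thread reads the values, publish them through std::atomic.
class LoadMeterSketch
{
public:
    // smoothing is an assumed exponential-average factor in (0, 1];
    // 1.0 means "no smoothing, report the latest block only".
    explicit LoadMeterSketch (double sampleRateHz, double smoothing = 0.2) noexcept
        : sampleRate (sampleRateHz), alpha (smoothing) {}

    // elapsedSeconds: measured callback duration for this block.
    // numSamples: size of the block that was just processed.
    void registerBlock (double elapsedSeconds, int numSamples) noexcept
    {
        if (numSamples <= 0 || sampleRate <= 0.0)
            return;

        // Time budget for this block: how long the samples last in real time.
        const double availableSeconds = numSamples / sampleRate;
        const double load = elapsedSeconds / availableSeconds;

        // Exponential moving average over successive blocks.
        smoothedLoad += alpha * (load - smoothedLoad);

        // A load above 1.0 means the callback missed its deadline.
        if (load > 1.0)
            ++overruns;
    }

    // Smoothed load, limited to the range 0.0 .. 1.0.
    double getLoadProportion() const noexcept
    {
        return std::min (1.0, std::max (0.0, smoothedLoad));
    }

    int getOverrunCount() const noexcept { return overruns; }

private:
    double sampleRate;
    double alpha;
    double smoothedLoad = 0.0;
    int overruns = 0;
};
```

As a usage example, at 48 kHz a 480-sample block has a 10 ms budget, so a callback that took 5 ms registers as a load of about 0.5, while one that took 20 ms counts as an overrun.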
