I am looking to do some analysis of how long my audio callback takes to run. I would like to measure this using std::chrono::high_resolution_clock. I understand this would technically be a system call, which is usually not acceptable in a real-time thread, although here it should be accessing a dedicated high-resolution OS clock. The alternative, I assume, is to use OS-specific calls such as QueryPerformanceCounter on Windows; however, this is obviously much more work than using a universal system. Just wondering what other people's opinions are on the real-time safety of these functions, and whether there are any alternative methods of achieving this?
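For reference, a minimal sketch of the kind of measurement being asked about. It assumes std::chrono::steady_clock rather than high_resolution_clock, because in some standard libraries high_resolution_clock is an alias for system_clock, which is not monotonic. ScopedCallbackTimer, lastCallbackNanos and audioCallback are hypothetical names for illustration, not JUCE API:

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical helper: measures the lifetime of a scope and stores the
// result in an atomic, so no locks or allocations happen on the audio thread.
class ScopedCallbackTimer
{
public:
    using Clock = std::chrono::steady_clock; // monotonic, never jumps backwards

    explicit ScopedCallbackTimer (std::atomic<std::int64_t>& destination) noexcept
        : dest (destination), start (Clock::now()) {}

    ~ScopedCallbackTimer() noexcept
    {
        const auto elapsed = Clock::now() - start;
        dest.store (std::chrono::duration_cast<std::chrono::nanoseconds> (elapsed).count(),
                    std::memory_order_relaxed);
    }

private:
    std::atomic<std::int64_t>& dest;
    Clock::time_point start;
};

// Duration of the most recent callback in nanoseconds (-1 = nothing measured yet).
// Assumption: 64-bit atomics are lock-free on the target platform.
std::atomic<std::int64_t> lastCallbackNanos { -1 };

// Stand-in for an audio callback (hypothetical signature, not JUCE's).
void audioCallback (float* buffer, int numSamples) noexcept
{
    ScopedCallbackTimer timer (lastCallbackNanos);

    for (int i = 0; i < numSamples; ++i)
        buffer[i] *= 0.5f; // placeholder for the real processing
}
```

Another thread (GUI, logger) can then read lastCallbackNanos whenever it likes. Whether the clock read itself is cheap enough depends on the platform, so it is worth measuring the overhead of an empty timed scope on each target.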
I think you might find it more productive to integrate melatonin_perfetto in your project, enable it for your Debug builds, and just forget about any other tricky business you might do to convince yourself about your app's performance:
And while you're at it, don't forget to profile your GUI code as well with melatonin_inspector:
Both of these are easy to add to any JUCE project, very easy to configure so that they're available when you need them (and not available in your Release builds), and will provide you with a lot of insight into your plugin/app's performance.
Wow, that looks really useful. Thanks! I will definitely take a look into that for profiling our performance.
We're not just looking at this for debugging and performance profiling though. Our application has multiple real-time audio threads, since it can take multiple audio input types. It'd be really useful for a few reasons to be able to time these audio input callbacks. These timings would feed back into how the application processes audio, hence the need to take them in both debug and release builds.
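A rough sketch of how several audio threads could each publish their timing for the rest of the application to act on. Everything here (numInputs, InputTiming, publishTiming, isOverBudget, the budget fraction) is a hypothetical illustration under the assumption that each input has exactly one audio thread writing its slot:

```cpp
#include <array>
#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>

constexpr std::size_t numInputs = 3; // hypothetical number of audio input types

// One slot per input; each slot is written by a single audio thread only.
struct InputTiming
{
    std::atomic<std::int64_t> lastNanos { 0 };
    std::atomic<std::int64_t> maxNanos  { 0 };
};

std::array<InputTiming, numInputs> timings;

// Called on the audio thread of input `index` once its callback has been timed.
void publishTiming (std::size_t index, std::chrono::nanoseconds elapsed) noexcept
{
    auto& slot = timings[index];
    const auto ns = static_cast<std::int64_t> (elapsed.count());

    slot.lastNanos.store (ns, std::memory_order_relaxed);

    // Single writer per slot, so a plain load-then-store is enough here.
    if (ns > slot.maxNanos.load (std::memory_order_relaxed))
        slot.maxNanos.store (ns, std::memory_order_relaxed);
}

// Called from any thread: true if the last callback of input `index` used
// more than `fraction` of the time available for one block.
bool isOverBudget (std::size_t index, double blockSeconds, double fraction) noexcept
{
    const double lastSeconds
        = static_cast<double> (timings[index].lastNanos.load (std::memory_order_relaxed)) * 1.0e-9;

    return lastSeconds > blockSeconds * fraction;
}
```

With a 512-sample block at 48 kHz, blockSeconds is 512.0 / 48000.0, so isOverBudget (i, 512.0 / 48000.0, 0.8) would flag any callback that took longer than roughly 8.5 ms.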
Understood. Well, you can either use the Analytics class directly, or if you just want the facts, juce::Time::getMillisecondCounterHires() might be a portable way to do it... don't forget to factor in juce::Time::getHighResolutionTicksPerSecond() into your analysis, too.
There's nothing wrong with using a high resolution clock for the purposes of profiling your processBlock. I wouldn't enable it for production builds, but for diagnostic purposes you'll get just fine results.
When people say you shouldn't use system calls in real-time code, they're talking about avoiding things like priority inversion or disk flushes or wait states. But for profiling your code just to see how long it takes, you'll be just fine.
Btw, VTune is free these days. With the ITT API you can define tasks (like processBlock) to see, for example, how long the calls are, how much that duration varies, how it interacts with other threads, etc. Perfetto looks good too, just throwing the VTune option out there since I've been using it for decades and love it.
Just checking, you are talking about this, correct? GitHub - intel/ittapi: Intel® Instrumentation and Tracing Technology (ITT) and Just-In-Time (JIT) API
If I'm not mistaken, this is only available for Intel, and I think it's fair to assume most people would like a solution that is at least also compatible with Apple Silicon systems.
@caustik - I for one would love to hear more about your experiences with using VTune. I don't think anyone is attempting to argue with you - more that, based on the knowledge shared so far in this thread, performance profiling is a very finicky subject, and optimizations for one platform don't necessarily carry over to another. VTune seems pretty sweet, I've encountered it over the years, but the load to set it up has never been viable for my projects - perfetto works just fine, as do liberally applied timer calls.
Also, I did mean "enable for Debugging but not Release", but you are of course correct that one should profile Release builds rigorously too. I think there is a sliding scale of attention to detail needed - there is no "one solution fits all cases" scenario, imho. Early on in algorithm development, liberal Timer calls placed here and there can provide you with the necessary insight, but once you get to cross-platform/multiple-target builds, having Perfetto at hand to easily measure things can show issues with architecture quite well, too.
JUCE already has a class for that - which is also good since it can measure the average CPU% over multiple runs and account for blocks of different sizes.
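To illustrate the idea of a CPU% figure that accounts for different block sizes: the load of one callback is the time it took divided by the time the block represents (numSamples / sampleRate), and averaging that ratio over runs gives a stable percentage. The sketch below is a hypothetical stand-alone LoadEstimator with simple exponential smoothing; it is not JUCE's implementation and makes no claim to match it:

```cpp
#include <algorithm>

// Hypothetical load estimator: call addMeasurement() once per audio callback.
// Not thread-safe as written; reading it from another thread would need an atomic.
class LoadEstimator
{
public:
    explicit LoadEstimator (double sampleRateHz, double smoothing = 0.2) noexcept
        : sampleRate (sampleRateHz), alpha (smoothing) {}

    void addMeasurement (double elapsedSeconds, int numSamples) noexcept
    {
        if (numSamples <= 0 || sampleRate <= 0.0)
            return;

        // Time budget for this block: larger blocks get proportionally more time.
        const double availableSeconds = numSamples / sampleRate;
        const double load = std::min (1.0, elapsedSeconds / availableSeconds);

        // Exponential moving average over successive callbacks.
        smoothedLoad += alpha * (load - smoothedLoad);
    }

    double getLoadAsPercentage() const noexcept { return smoothedLoad * 100.0; }

private:
    double sampleRate;
    double alpha;            // 1.0 = no smoothing, smaller = slower response
    double smoothedLoad = 0.0;
};
```

For example, a callback that takes 5 ms to process 480 samples at 48 kHz (a 10 ms block) counts as 50% load, the same as one that takes 1 ms for 96 samples.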
