Well, here we are, almost three months after the original report. Sorry this took so long, folks; this ended up being far more mammoth a task than I would ever have imagined. Please do everything you can to test this as much as possible. We have done a lot of internal testing already, but the more the merrier.
With a commit title such as “HighResolutionTimer: Complete rewrite”, maybe some more details would be helpful about the problems and solutions, any special considerations for usage, and areas to watch out for in testing.
@Verbonaut great question, hopefully most of the considerations are captured in the new unit tests. The main thing not committed to the unit tests is performance tests, because these rarely fare well on CI.
A couple of key things I can think of:

- Your code using the high resolution timer shouldn’t suffer any unexpected performance penalties; ideally it should be as good as, if not better than, it was in JUCE 7.0.2.
- On average the time between callbacks should be very close to the specified time.
- Wherever possible the callback should fire as close as possible to the expected time. It’s not possible to have it always on time, but when set to a short interval there should be as few callbacks as possible being called in a bunch at once due to missed events.
- The thread priority has been dropped to highest rather than realtime. In our testing this had no noticeable impact, and we would prefer to keep things off the realtime threads if we can. Maybe you notice an issue with this, or rely on it?
- No hangs. It’s very easy to write a timer where a user’s use of a mutex around the timer and in the callback can cause some unexpected deadlocks; the chance of this should be reduced compared to previous versions.
- When calling stopTimer() or startTimer (0) it really does wait for all callbacks to finish, with no accidental calls to the callback.
  - Note: we no longer wait for callbacks to complete when calling startTimer (n) where n is greater than 0. This should improve performance on the threads interacting with the timer, but without any observable difference as far as we can tell (isTimerRunning() should still return the expected result).
- No unexpected behaviour. Most of this is captured in the unit tests, but maybe there is a use case we’ve missed or not considered that we could capture in the tests?
- No race conditions; the thread sanitiser should not trigger.
- No breaking changes; all existing code should function exactly as it did before (unless you were relying on some undefined behaviour, but even then we would like to know about it).
Note we’ve aimed to keep the interface identical, but there is one extra “feature” we’ve added: it should now be possible to check whether your timer actually started (just call isTimerRunning() after starting it; this may not have been reliable in the past). It’s possible that when you start a timer it doesn’t actually start! For example, it could be because a thread wouldn’t launch. On Windows specifically there can be a maximum of only 16 HighResolutionTimers running at once. This is due to the underlying API we rely on (we tested lots of different ones; the one we’re using has significantly better performance) and should also have been true in version 7.0.2 (if I’m not mistaken). We could work around this, but there would be a penalty for doing so, so if you need more than 16 timers running at once let us know!
One other thing I would say is that if you rely on calling stopTimer() in a callback and never call it from outside, I strongly recommend still calling stopTimer() in the destructor of your derived class. This isn’t special to this version, just general good practice. In an ideal world I wouldn’t have users inherit from HighResolutionTimer; I would prefer an interface where the timer is composed. In my own code I would probably write something like this (completely untested)…
```cpp
class HighResolutionTimer : private juce::HighResolutionTimer
{
public:
    using Callback = std::function<void()>;

    HighResolutionTimer (Callback callback) : cb { std::move (callback) } {}
    ~HighResolutionTimer() final { stopTimer(); }

    // could probably take a better type than int
    // maybe return an enum to indicate if it started or not too
    void start (int ms) { startTimer (ms); }
    void stop()         { stopTimer(); }

    bool isRunning() const { return isTimerRunning(); }

private:
    void hiResTimerCallback() final { cb(); }

    Callback cb;
};
```
Could alternatively use a listener rather than std::function; that could be particularly nice if lots of things need to be triggered by the timer in no particular order. One downside would be having to consider thread safety when adding/removing listeners though.
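As a rough illustration of that listener-based alternative (the names and structure here are mine, not JUCE’s), with the thread-safety caveat made explicit:

```cpp
#include <algorithm>
#include <mutex>
#include <vector>

// Hypothetical sketch of a listener list that a timer could dispatch to.
// Because listeners may be added/removed from other threads while the
// timer thread is dispatching, access must be synchronised. Note that
// holding the lock during dispatch means a listener must NOT call
// add()/remove() from inside timerFired(), or it will deadlock.
class TimerListenerList
{
public:
    struct Listener
    {
        virtual ~Listener() = default;
        virtual void timerFired() = 0;
    };

    void add (Listener* l)
    {
        const std::lock_guard<std::mutex> lock { mutex };
        listeners.push_back (l);
    }

    void remove (Listener* l)
    {
        const std::lock_guard<std::mutex> lock { mutex };
        listeners.erase (std::remove (listeners.begin(), listeners.end(), l),
                         listeners.end());
    }

    // Called from the timer thread, e.g. from hiResTimerCallback().
    void dispatch()
    {
        const std::lock_guard<std::mutex> lock { mutex };
        for (auto* l : listeners)
            l->timerFired();
    }

private:
    std::mutex mutex;
    std::vector<Listener*> listeners;
};
```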
One other issue that really caught me out, which should now be totally resolved due to a refactor, is that destroying a unique_ptr to an object is not quite the same as destroying an object on the stack, in a way I hadn’t fully appreciated before. The unique_ptr will set its internal pointer to nullptr before actually deleting the object. What this meant for us was that in the destructor we might have been stopping the timer and preventing any callbacks before the object was destroyed; but if a callback occurred after the pointer was set to nullptr and before the destructor had finished, and that callback tried to call back in through the pointer, it would hit undefined behaviour by dereferencing a nullptr. That one really took me by surprise.
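To illustrate that ordering in a standalone way (a hypothetical sketch; the standard does guarantee that unique_ptr::reset() assigns the new stored pointer before invoking the deleter on the old one, so code observing the owner during deletion sees null):

```cpp
#include <memory>

// Illustrative sketch: observe the owning unique_ptr from inside the
// owned object's destructor. The names here are made up for the demo.
static bool storedPointerWasNull = false;
static void* ownerAddress = nullptr; // address of the owning unique_ptr

struct Probe
{
    ~Probe()
    {
        // Stand-in for a "callback" that fires while the object is being
        // destroyed and reads back through the owning pointer.
        auto& owner = *static_cast<std::unique_ptr<Probe>*> (ownerAddress);
        storedPointerWasNull = (owner.get() == nullptr);
    }
};

bool pointerIsNullDuringDelete()
{
    std::unique_ptr<Probe> owner { new Probe() };
    ownerAddress = &owner;
    owner.reset(); // nulls the stored pointer first, then deletes the Probe
    return storedPointerWasNull;
}
```

Anything that dereferences the owner at that point, instead of just calling get(), is exactly the nullptr dereference described above.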
As the guy who originally reported this issue, I’m sorry I couldn’t get to evaluating this right away, I was in the middle of some things.
But anyway, today I have updated to the latest develop.
ideally it should be as good as, if not better than, it was in JUCE 7.0.2
I know this was a huge amount of work etc., but it still seems not as solid as it was in JUCE 7.0.2. (I am testing Mac only so far.)
In JUCE 7.0.2, on the Mac, with a 1 ms callback, there was never any jitter over a 10th of a millisecond between callbacks. Now there is.
This is the link to my test project up above, here it is again:
HiResTimerTest3.zip (13.6 KB)
The “Start Jitter 10th” button prints out any 1 ms callback that has a jitter of greater than 0.1 ms from the previous one.
JUCE 7.0.2 - jitter between 1 ms callbacks that exceeds a 10th of a ms:
[NONE]
JUCE 7.0.5 develop - jitter between 1 ms callbacks that exceeds a 10th of a ms:
0099: timer 10th jitter +0.100772
0138: timer 10th jitter -0.118654
0161: timer 10th jitter -0.109102
0163: timer 10th jitter +0.138047
0211: timer 10th jitter -0.113110
0218: timer 10th jitter +0.136156
0219: timer 10th jitter -0.124299
0247: timer 10th jitter -0.104654
0251: timer 10th jitter +0.106366
0392: timer 10th jitter +0.116671
0393: timer 10th jitter -0.108417
0443: timer 10th jitter -0.121595
0444: timer 10th jitter +0.127838
0464: timer 10th jitter +0.122664
0503: timer 10th jitter -0.120573
0505: timer 10th jitter +0.102194
0515: timer 10th jitter -0.107997
0565: timer 10th jitter +0.117793
0570: timer 10th jitter +0.102272
0588: timer 10th jitter -0.145588
0589: timer 10th jitter +0.119256
0593: timer 10th jitter -0.104257
I can report that it DOES solve the issue of accumulating callbacks over a period of time (i.e. 30,000 1 ms callbacks = 30 seconds), so thank you for that!
Is it possible the way I am measuring this is not accurate? (You can download my HiResTimer3 project above.)
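The measurement described above amounts to something like this (an illustrative sketch, not the actual project code; the function name and the way timestamps are collected are mine):

```cpp
#include <cmath>
#include <vector>

// Sketch of the jitter check: given timestamps (in milliseconds) captured
// in each 1 ms timer callback, report every interval between consecutive
// callbacks that deviates from the nominal 1 ms period by more than 0.1 ms.
std::vector<double> findJitterOverTenth (const std::vector<double>& callbackTimesMs)
{
    std::vector<double> deviations;

    for (size_t i = 1; i < callbackTimesMs.size(); ++i)
    {
        const double interval  = callbackTimesMs[i] - callbackTimesMs[i - 1];
        const double deviation = interval - 1.0; // nominal 1 ms period

        if (std::abs (deviation) > 0.1)
            deviations.push_back (deviation);
    }

    return deviations;
}
```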
But I’m just curious: why did the original implementation have to be completely redesigned? Why is it no longer possible to get the same level of accuracy as before?
Perhaps this is minuscule and I’m being overly picky, as I will admit I cannot “hear it” - but the fact is I can measure it.
EDIT/PS:
I just tested it on Windows. It actually seems to be about the same as, or perhaps even a tiny bit less jitter than, 7.0.2, so that’s a win.
@stephenk don’t be apologetic, and thanks for testing. To answer the question as to why we ended up rewriting it: in the end the original implementation simply wasn’t passing all the tests! We did in some cases keep parts from the original though. I will take a look at this today.
The first thing that comes to mind is that the thread priority was dropped from realtime to high priority. Could you try changing it back to realtime to see how the performance changes, please? This is the line that would need changing.
OK, I can reproduce your results. Making it realtime doesn’t appear to have an obvious immediate effect on my machine - at least it didn’t jump out at me just looking at the output.
I’ll keep looking at this. I have some thoughts already regarding the testing; let’s see what comes out of it and I’ll report my findings here.
OK, so I concur that what is frankly a very naive implementation on macOS works better than using a CFRunLoopTimer - apologies for not spotting that! The good news is we developed a “generic” timer at the same time, which we used for WASM and BSD as we didn’t want to spend too much extra time on those platforms. It passes all our tests, is less code than the 7.0.2 version, and with some very minor tweaks it now outperforms the 7.0.2 version on my system too. I need to do more testing: I want to run tests on more systems and I would like to look at CPU load as well. I’ll also go back and test the other platforms. I’ll try to have something up very soon though.
@anthony-nicholls Thanks so much for all of this! Will watch for further details.
Hey @stephenk, it is unsurprisingly a little more complex than I first thought. Things get a little more interesting if you start firing up multiple timers. Also, part of the problem was that the old implementation was using a realtime thread, and one of the reasons my tests with realtime threads weren’t working was an issue with setting realtime thread priorities in the Thread class. I’ll probably still be on this tomorrow.
@anthony-nicholls Thank you, take your time - there’s no rush on it from me.
Another week passes on the HighResolutionTimer.
So I have been doing some extensive testing and this is what I’ve found.
- A CFRunLoopTimer (the current develop implementation) is no faster than a generic implementation using a condition variable.
- A CFRunLoopTimer may drop callbacks, and this will very slowly cause some drift! In my testing this was easiest to reproduce when running the timer while performing another task.
- A realtime thread on macOS does improve stability.
- A generic implementation using a condition variable running a 1 ms interval timer on Windows is significantly less stable than on macOS or Linux.
- A generic implementation using a condition variable is almost as stable as a native solution on Linux (within tens of microseconds).
- macOS realtime threads are not working on the develop branch (fix incoming).
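The “generic implementation using a condition variable” can be sketched roughly like this (an illustrative sketch under my own naming, not JUCE’s actual code). The key idea is waiting on absolute deadlines rather than relative intervals, so late wakeups don’t accumulate into drift:

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical sketch of a generic timer loop built on a condition
// variable. wait_until() with an absolute deadline means a callback that
// runs late doesn't push every subsequent deadline back.
class GenericTimerSketch
{
public:
    template <typename Callback>
    void run (std::chrono::milliseconds interval, int numTicks, Callback&& cb)
    {
        auto deadline = std::chrono::steady_clock::now();

        for (int i = 0; i < numTicks && ! stopped.load(); ++i)
        {
            deadline += interval; // absolute deadline: no cumulative drift

            std::unique_lock<std::mutex> lock { mutex };
            condition.wait_until (lock, deadline, [this] { return stopped.load(); });

            if (! stopped.load())
                cb();
        }
    }

    void stop()
    {
        stopped.store (true);
        condition.notify_all(); // wake the timer loop immediately
    }

private:
    std::mutex mutex;
    std::condition_variable condition;
    std::atomic<bool> stopped { false };
};
```

A real implementation would run this loop on its own thread and handle interval changes; this only shows the waiting strategy.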
Decision
- Use a generic timer on macOS.
- Use a high priority thread to run the macOS timer. Although using a realtime thread is more stable, this choice:
- Improves consistency between platforms.
- Frees up realtime threads for more appropriate use cases.
- Reduces the number of things to consider when debugging.
- Keep the current develop branch implementation for Windows.
- Use a generic timer on Linux.
I’ve produced some graphs to show the performance you should be able to expect on develop (and master) shortly. It’s worth noting that the results of the Linux and Windows implementations here are very similar to 7.0.2; it’s only macOS that is changing, due to the thread priority change discussed above. It’s also worth highlighting that macOS is still as performant as, if not better than, the other platforms.
The graph shows the absolute error between the expected arrival time of a callback and its actual arrival time when running a 1 ms timer for 10,001 callbacks (producing 10,000 readings). I’ve done longer tests too, but there were quite a few branches and variations to test!
The average error measured was:
- macOS: ~67 microseconds
- Windows: ~73 microseconds
- Linux: ~173 microseconds
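For illustration, the absolute-error statistic above can be computed along these lines (a hypothetical sketch that uses sleep_until as a stand-in for the real timer wait; this is not the actual benchmark code):

```cpp
#include <chrono>
#include <cmath>
#include <thread>

// Sketch: run a 1 ms-style periodic wait and average the absolute error
// between each tick's ideal arrival time and its actual arrival time.
double averageErrorMicroseconds (std::chrono::microseconds interval, int numCallbacks)
{
    using Clock = std::chrono::steady_clock;

    const auto start = Clock::now();
    double sum = 0.0;

    for (int i = 1; i <= numCallbacks; ++i)
    {
        const auto expected = start + i * interval; // ideal arrival time
        std::this_thread::sleep_until (expected);   // stand-in for the timer wait
        const auto actual = Clock::now();

        const auto errorUs =
            std::chrono::duration_cast<std::chrono::microseconds> (actual - expected).count();

        sum += std::abs ((double) errorUs);
    }

    return sum / (double) numCallbacks;
}
```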
Hi Anthony, thanks for all the work on this issue!
I’m trying to digest everything you posted above.
How then does the expected performance of the new implementation on MacOS compare to the previous implementation on 7.0.2? I’d be curious to see a graph of that.
And is what you’re putting on develop for macOS next going to be different from/better than the previous develop?
And are you saying that “it’s not going to be as good as 7.0.2, but it’s still better than Windows or Linux?” I’m unclear on that… Thanks!
No problem at all, thanks for your test app it was very helpful.
I’m reluctant to post a graph, as it’s too easy to take out of context. 7.0.2 runs on a realtime thread; if I put the new implementation on one too, it’s about the same. So really we’re comparing realtime threads to the next-highest-priority thread. The difference between the two is in the tens of microseconds.
- Realtime thread: average ~5-7 microseconds of error
- Highest priority thread: average ~73 microseconds of error
Given we’re talking about a timer with a maximum resolution of 1 ms, I’m not convinced worrying about a few tens of microseconds is worth the cost.
Roughly similar performance. However, most importantly I noticed a different issue with the implementation on macOS, which is that it can sometimes drop callbacks (this might appear as a large wait between two callbacks). This can lead to drift over longer periods of time. It’s frustrating that I didn’t spot this before, but it happens most when I try to do other things while the test is running, so maybe that’s why I missed it.
In terms of measured performance, 7.0.2 will outperform the new macOS implementation. As discussed above, this is only because it’s running on a realtime thread. However, even when we run either implementation on the next highest priority thread, it still performs as well as, if not better than, the best Windows and Linux implementations I measured.
Hope that helps clear things up.
Thanks for the details. So, if I were so inclined, is it relatively easy to put it on a realtime thread myself? I already have my own fork of JUCE with some changes. Is it just this one line? And given that I’m not running any other hiRes timers, or any other threads (of my own making), would there be any downside?
Yes, if you want to do it in your own fork it’s probably best to change it to…

```cpp
startRealtimeThread ({10, 1}) || startThread (Priority::highest);
```
However, while testing this I discovered startRealtimeThread isn’t working on develop for macOS, which is why I originally wasn’t seeing a realtime thread make a difference, and what left me so confused by the seemingly impressive performance of 7.0.2.
A fix for that is on its way too; the HiResTimer stuff might make it in first though.
Note: using a realtime thread won’t help on Windows, as unfortunately we can’t control that thread, although it’s probably already as close to realtime as possible.
I didn’t manage to get realtime threads running on Linux but it’s possible it will help there too. I do plan to try this out ASAP.
The reason I say this is that if your app is cross-platform, keep in mind you may have great performance on macOS, but it’s not going to be nearly as great elsewhere.
If you do think you have a use case where a realtime thread really helps I’m eager to learn more.
Good to hear how much work is going into this. I think there are a fair number of people who might be hoping to get the best possible precision on macOS and Linux but don’t have particularly high expectations for Windows…
So it might be nice to have overloaded constructors that could make use of either High Priority threads (perhaps as the default), and also a Realtime thread option where possible (really hoping for macOS and Linux where it seems most feasible).
There are some surprising applications for the JUCE library where this difference in jitter will really, really matter!
That’s a good point. I understand @anthony-nicholls’ point about the Mac version performing way better than the Windows version perhaps being an issue if you are deploying cross-platform.
But what if you were just making a MacOS app? Wouldn’t you want to get the best possible performance (which is known to be achievable) rather than having it “equivalent or slightly better than Windows”?
Couldn’t it be possible to just have an option to launch it with a realtime thread versus a high priority thread?
Mobile platforms haven’t been mentioned. How does this affect iOS and Android?
Could you both please provide some detail about where the difference in performance is critical?

