Found a serious design flaw with MessageManagerLock(Thread*) and MessageManagerLock(ThreadPoolJob*)

The MessageManagerLock class can take in an optional Thread* or a ThreadPoolJob* object, and can bail out of an attempt at getting a lock if either of these objects signals that it wants to exit.

The way this works is, the MessageManagerLock adds itself as a Listener (via addListener) in case signalThreadShouldExit() or signalJobShouldExit() gets called. If this happens, the MessageManagerLock removes itself as a listener (via RemoveListener()) and exits.

This all seems fine, but there is no locking in either the Thread’s or the ThreadPoolJob’s Listeners lists. When signalThreadShouldExit() or signalJobShouldExit() gets called, these functions use an iterator to go through their lists of listeners and notify them that it is time to exit. However, when MessageManagerLock calls RemoveListener from another thread, it modifies the list of listeners while we’re iterating through them. Because there is no critical section, we end up very easily in a situation where the list has iterated to an item that no longer exists.

For example, if the Thread myThread had 1 listener, and MessageManagerLock(myThread) gets called, the Thread will temporarily have 2 listeners. When myThread->signalThreadShouldExit() gets called, it will iterate through the 2 listeners to notify them to exit. After the MessageManagerLock receives the message, it removes itself from the Thread’s Listeners lists. There are now only 1 listeners, but the iterator will likely have moved on to the second item in the list, and we will crash.

I’ve confirmed this with both Thread and ThreadPoolJob objects. We crash over 50% of the time. I imagine this could be fixed pretty easily by putting a lock around the juce_listenerList call() function, as well as the add() and remove() functions.

Please let me know if I’m mistaken.

Thanks!
Dan

1 Like

OK Thank you for the report. It’s actually an issue with the ListenerList class which I would have expected to lock the underlying Array’s lock when going through the list to call all of it’s listeners. I’ve fixed this on develop with commit 89ec137 which will appear on the public repo in a few minutes.

We will also hotfix this on master if no one reports any issues with this (and it fixes your issue).

This fix is working great. Thank you!

Dan