LeakedObjectDetector not thread safe?


#1
    static LeakCounter& getCounter() throw()
    {
        static LeakCounter counter;
        return counter;
    }

Definitely a problem if two threads call getCounter for the same OwnerClass at the same time.

This can be fixed by moving the static counter to file scope and making it of type StaticData <Atomic <int> >, and removing the constructor from LeakCounter. This is class StaticData:

// Copyright (C) 2008-2011 by Vincent Falco, All rights reserved worldwide.
// This file is released under the MIT License:
// http://www.opensource.org/licenses/mit-license.php
//
// Wraps an object up so it can exist at file scope with static storage
// duration. The object is initialized with zeroes as per the C++ spec.
// Non-trivial constructors will silently fail with undefined behavior.
//
// Spec: N2914=09-0104
//
// [3.6.2] Initialization of non-local objects
//
//         Objects with static storage duration (3.7.1) or thread storage
//         duration (3.7.2) shall be zero-initialized (8.5) before any
//         other initialization takes place.
//
template <class Object>
class StaticData
{
public:
  Object* operator-> () { return getObject (); }
  Object& operator* () { return *getObject (); }

  const Object* operator-> () const { return getObject (); }
  const Object& operator* () const { return *getObject (); }

  Object* getObject ()
  {
    return reinterpret_cast <Object*> (m_storage);
  }

  const Object* getObject () const
  {
    return reinterpret_cast <const Object*> (m_storage);
  }

private:
  char m_storage [sizeof (Object)];
};

#2

Actually, now that I look at it, that is not a safe fix, because the LeakCounter destructor won’t get called, which breaks the leak checking.


#3

This is how I fixed it. Is it a hack? I guess you could say that. Again I am taking advantage of the C++ spec that states that objects with static storage duration and without constructors get zero-filled:

  class LeakCounter
  {
  public:
    // constructor intentionally omitted

    ~LeakCounter()
    {
      const int count = getCounter().get ();
      if (count > 0)
      {
        DBG ("*** Leaked objects detected: " <<
             count << " instance(s) of class " <<
             getLeakedObjectClassName());
        jassertfalse;
      }
    }

    inline Atomic <int>& getCounter ()
    {
      return *reinterpret_cast <Atomic <int>*> (m_storage);
    }

  private:
    char m_storage [sizeof (Atomic <int>)];
  };

#4

TBH, although I’m not disagreeing with any of the points you’re making, I really don’t care very much about this issue… The leak detector’s just a quick-and-dirty debugging tool, not intended to be perfect or foolproof, and it seems to me that the chances of two threads hitting a race condition when they both try to make the first call to a static method at the same time are pretty tiny.

If I was writing an app where this actually happened often enough for it to be annoying, I’d probably just work around it by creating and destroying a dummy instance of the class before kicking off the problematic threaded code. (And in fact, a neater solution than what you’ve got there might be to get a file-static constructor to invoke getCounter(), ensuring that all these things are safely initialised before any user code starts to run.)


#5

And not compiled with gcc 4.0 or later (which by default outputs the necessary additional code to call the constructor with thread-safety). But regardless, the information is here if at some point someone needs it.


#6

[quote]it seems to me that the chances of two threads hitting a race condition when they both try to make the first call to a static method at the same time are pretty tiny. [/quote]Any race condition that you can describe will happen with a certain frequency if you run it enough times on different machines - and that frequency is usually surprisingly large.

The worst part is that a few machines will be a lot more susceptible to a given race condition than most others - it’s not that there will be very rare errors spread out over a lot of machines, but that a few individual users will experience very error-prone programs.

And of those surprising race conditions, I’d say that the largest single cause involved a segment of code racing with itself, precisely because such cases are so common, and because having multiple threads doing the same thing at the same time on different data is in fact a very common use case.

A prime cause of intermittent bugs is not protecting your accessors with mutexes - something that Juce is definitely guilty of and that I worry about.

The trouble is that any variable that’s over one word in size is not written atomically - so if you write to a long long or long pointer on a 32-bit machine, and someone else is reading that long long and there is no mutex, then one day they will see it only partly updated and get an error.

On the other hand, if you mutex everything your code becomes slow.

I solve this by dividing my C++ classes up into “Data” and “Objects”.

Data is “plain old data” - either “C structs” or Google protocol buffers - with no semantics/methods beyond accessors/manipulators, and no locking at all. If I use Data, I need to make sure it’s consistent, often passing by value/copy rather than by reference. Data is copiable!

Objects are classes that hide their data, generally aren’t copiable, and contain a lot of “state”. Some of them are thread-safe, and if so, I mutex every accessor (yes, even booleans, to make sure I get a consistent state for the entire object!) - which makes accessors to these things a little “expensive”, which is why we use “Data” to copy everything around.

Now, something like a LeakedObjectDetector should definitely be thread-safe, even if it’s a little more expensive. If you’re putting a leak detector in your code, you’re more worried about correctness at that stage than you are efficiency. Efficiency comes when you compile in Release/Optimized mode and the LeakedObjectDetector isn’t even there any more…

Again, thread collisions are more likely during the shutdown phase of an application, precisely because a lot of objects are now calling the same code at the same time. Having your application start to crash during shutdown just because you made some inconsequential change is very disheartening and it has happened to me. Save some poor developer this year or next, put the mutexes in and save them from worry!


#7

Actually no mutex is needed. My proposed change is no different performance-wise. Actually, it’s a hair faster (because there is no hidden boolean inserted by the compiler for objects with static storage duration at function scope).

So in this case we can have our virtual cake and eat it too!

Let me remind interested readers that this issue with Juce and static variables in functions exists in the juce RunningThreadsList, the implementation of juce Singletons, and the LeakedObjectDetector (that I know of so far).


#8

Okay I have developed a generic solution that replaces the old StaticCriticalSection and lets you instantly repair static variables in functions so they are thread safe with a ONE LINE CHANGE!!!

Generic StaticObject, any static var can be thread safe!

BEFORE, NOT THREAD SAFE :frowning:

    static LeakCounter& getCounter() throw()
    {
        static LeakCounter counter;
        return counter;
    }

AFTER, THREAD SAFE!!! :smiley: :smiley: :smiley:

    static LeakCounter& getCounter() throw()
    {
        static StaticObject <LeakCounter> counter; // HOLY SCHITTE! Just one line !?!
        return *counter; // StaticObject's operator* hands back the LeakCounter
    }

Wow! That’s amazing!


#9

So first…

HEY, JULES, THIS IS A GOOD THING! MAKE THE LEAK DETECTOR THREAD SAFE, PLEASE!

Second, let me refresh my mind a little on local statics in C++… (and no doubt the readers’)…

[list]
[*]Their creation is already thread-safe in fairly recent versions of gcc and most other compilers - right?
[*]Their creation is a little better defined than most globals, because you know they get created exactly the first time the code flows over them - correct?
[*]You still don’t know if other globals are created before or after them.
[*]You still don’t know when they get destroyed.
[/list]

Am I correct with all of this?

Third, this is a really slick thing, TheVinn, if it works. Thread-safety is really hard to achieve, and you can get 99.9% of it and never see it break on your test machines and still be wrong. With all due respect, how do you know it works? Clearly you’re thinking much more about thread-safety and synchronization than almost anyone else - but have you had the code reviewed by someone else really critical? How much use has it had?

Fourth, I note you use JUCE’s yield command. I haven’t completely worked out how effective, if at all, that command is, nor how well-specified it is. There is certainly little documentation on it - but on the Mac it’s “sched_yield”, on the PC it’s “Sleep(0)”. I’m not as familiar with the PC’s architecture, and I’m not 100% familiar with the Mac’s yield.

I do know that when I worked on Java, yield didn’t necessarily guarantee in every implementation that if there were other threads waiting at the same priority, any of them would gain control before “the current thread” did again - making it basically useless to avoid thread starvation because it’s a “suggestion” more than a “command”.

Fifth, exposing a personal weakness, I tend to only use locks (CriticalSections) to ensure thread-safety because I simply haven’t understood anything else. Both you, TheVinn, and Jules are clearly managing to avoid these by using the techniques you’re using here or Jules uses in e.g. juce_Atomic.h - I’m curious if either of you has a good reference on how to do lock-free thread-safety.

It’s unlikely I’m going to switch from using locks in my current project - I have bugs in the program but neither sluggishness nor race conditions have been amongst them, so I think I have all that working well.

Sixth, now I’m back from my long travels, we should actually get together the Juce “fansite”, where we can put this all together - this week is good for this, did we agree on Github as the solution, should I just create it?


#10

Okay, the leak detector is “mostly” thread safe for normal usage. The only time it is an issue is if two or more threads create the same leak-checked object at the same time. As Jules pointed out, this is quite rare, so it’s not a crippling defect. It would only come up in a small subset of use-cases.

Can’t comment on other compilers but definitely NOT Visual Studio 2008.

Yes, subject to the caveat regarding thread safety.

The only way for a function-local static to get created before other globals is if the function is called from the constructor of an object with static storage duration at file scope.

They are destroyed in reverse order of creation by an atexit() function inserted by the linker.

Well, you can check it mathematically by writing out all of the linearized two-thread permutations. In other words, generate a timeline for each possible linearized series of statements and verify the correctness of each timeline. But also note that what I have written is almost identical to what the gcc compiler inserts. It is also similar to the “double checked locking” idiom (http://en.wikipedia.org/wiki/Double-checked_locking).

No, and as far as I know I’m the only one using it. But you should be able to verify that it works simply by looking at it.

The yield should not really be necessary, as long as the operating system ensures that each thread makes a little bit of progress every so often, the algorithm is guaranteed to succeed. The worst case scenario is that one or more CPUs will experience high utilization for the time it takes to construct the object.

My template assumes that the construction of the underlying object is relatively fast.

I avoid locks like the plague. The closest I get in terms of locking is a read/write mutex which is lock-free in its fast path. And this is used quite sparingly, and only for heavyweight operations. Changing the sample rate on a resampler is definitely NOT a heavyweight operation. I’m talking about something big, like changing a connection in the audio graph.

I don’t have any references, but my opinion is that the most important building block for a concurrent system is a ‘thread queue’, which I talk about in my other thread (http://rawmaterialsoftware.com/viewtopic.php?f=8&t=6959). If you design your system to always use a thread queue to update shared variables, you will naturally avoid locking. Well, except for the lock protecting the thread queue. But even that can be eliminated by replacing it with a wait-free data structure (http://groups.google.com/group/lock-free).

I agree. Changing the concurrency model would be almost impossible without a complete rewrite.


#11

re: mostly thread-safe.

True, and Jules has a lot on his plate. It’s a pity parts of the project couldn’t be community developed.

Is anyone actually reporting this? Or is it a hypothetical?

Very good to know about the Visual Studio 2008 statics. I’m going to have to look at my code to make sure I don’t have possible issues there. My recollection is that it’s thread-safe for any GCC that’s 3.x or up - I couldn’t find evidence, but I’m pretty sure of this…

[quote]They are destroyed in reverse order of creation by an atexit() function inserted by the linker.[/quote]

Yeah, but you don’t really know what that order of creation is, either! So it’s still risky.

[quote]The yield should not really be necessary, as long as the operating system ensures that each thread makes a little bit of progress every so often, the algorithm is guaranteed to succeed.[/quote]

I looked up my Java threads, and apparently I was right, though in practice it’s better these days - but this isn’t Java.

It really comes down to how good the thread manager is at avoiding thread-starvation. And from my reading, it seems that they’re pretty good, and if your threads yield politely they’re even better.

[quote]the most important building block for a concurrent system is a ‘thread queue’[/quote]

So interestingly enough I have something exactly like this in my code, though it works differently.

My “Data” are persistent, serialized data objects (which happen to be Google protocol buffers so you can assign, copy and swap them). You can “read” them - that is, get a coherent snapshot copy of the current value - but to “write” them you need to send an Operation (also serializable) to… a queue exactly like this thread queue, which will eventually propagate updates to all the listeners - I actually have one queue per type, mainly so that everyone doesn’t see every update.

So there are very few locks. There are some classes that are used by both the GUI and calculation threads, and they need to be synchronized, but there aren’t very many of these, and these synchronized methods aren’t called very often.

Nice to have reinvented the wheel! :smiley:

[quote]Changing the sample rate on a resampler is definitely NOT a heavyweight operation.[/quote]

Hmm, well, I identify four levels of real-time on a computer.

[list]
[*]There’s the fastest level - sample time, where you’re processing one video or graphical pixel, or one audio sample.
[*]One step below that is block time - where you’re processing a group of samples at once.
[*]Then there’s controller time - where you’re reading some continuous controller from the user, like a mouse drag, a slider move, or key pressure from a musical keyboard.
[*]The slowest is switch time - where you’re pressing a button on the display to switch things around.
[/list]

Now, the higher the real-time level, the worse locking is. If you lock at sample time, your code will be very slow. But if you have a background calculation thread, it seems impossible not to lock or at least synchronize once in block time in order to update the outside world with the intermediate progress in your calculation…

Changing the sample rate on a resampler is probably either in controller time, or switch time. So I don’t see why this is necessarily a bad place for a lock! In particular, changing this parameter might well set off some calculation that takes a long time to complete, so you might even want to lock twice - once before you start resampling with the new sample rate, and once when you’re done.


#12

Heh well in my case, I change it per-sample. If you try the DspDemo you can control the sample rate in real-time, it’s being used as an economy pitch fader.


#13

The problem with using a lock to change shared state, instead of a thread queue, has less to do with performance and more to do with building a robust, predictable system.

With a lock, you have to expect that your variables can change at any point. This increases permutations of statement ordering in a concurrent system, which creates an explosion in the number of timelines that require analysis.

With a thread queue, functors are executed at a well defined moment in time, on the associated thread. In the case of the audio callback, the queue is processed upon entry into the audio I/O callback. This creates well-defined behavior and allows us to make strong statements with respect to the correctness of algorithms that utilize the thread queue to update shared states.

In a lock-based architecture, adding a lock or a piece of shared data requires a re-analysis of the entire system. With the queue, it is only necessary to make sure that the queue itself, and the code which calls it, is correct.

Knowing that the audio I/O callback will not be blocked or interrupted once the thread queue is processed, we can make some very strong statements with respect to the run-time performance of the code contained inside.

A lock-based architecture assumes a single state with read/write access. This is appealing to newcomers because it assures us that everything is always up to date. But this comes at a price: any inspection or modification of the state serializes execution - the equivalent of a single-file line at the check-out counter.

With a thread queue, it becomes natural to give each thread its own copy of the state, with the possibility that these states will sometimes be momentarily out of sync. On a multi-core system, such is the state of affairs regardless (think of the Byzantine Generals problem here). Having per-thread states is quite cache-friendly as well.

Almost all of the parallelism frameworks out there (Intel Thread Building Blocks, FastFlow: http://calvados.di.unipi.it/dokuwiki/doku.php?id=ffnamespace:about) have at their core, optimized implementations of thread queues. They go by various names, depending on their implementation and level of concurrency:

SPSC - Single Producer Single Consumer
MPSC - Multiple Producer, Single Consumer (this is what’s in DspFilter)
MPMC - Multiple Producer, Multiple Consumer

and the variations of synchronization:

Blocking: Traditional mutex based (this is what’s in DspFilter)
Lock-Free: No mutex, but the possibility of spinning
Wait-Free: These are the fastest types of queues. Enqueue and dequeue operations execute in constant time regardless of concurrency.

In summary, thread queues are a simple but powerful method of synchronization when building concurrent systems.


#14

+1, please :slight_smile:
Seriously, I use AudioFormatWriter objects in several concurrent threads, and each call to writeFromAudioSampleBuffer hits the DBG ("*** Dangling pointer deletion! Class: " << getLeakedObjectClassName()); line, causing an assert in Debug (about the chans HeapBlock<int*> object) for each buffer write, which is quite a pain in the ass…

cheers


#15

Guys, seriously… like Vinn said, this FUD around the thread-safety of LeakedObjectDetector is really misguided.

Let me explain it again, hopefully for the last time: there’s only one way the LeakDetector could have a threading problem, and that’s if multiple threads all try at the same time to create the very first instance of a class. This can only happen ONCE in your app’s lifetime, and can’t happen at all in a GCC build. In most apps it can never happen, but even if your code can trigger it, the chances of it happening are tiny. The chances of it happening repeatably are infinitesimal. And even if you did hit it, all you’d need to do to avoid it would be to create a temporary dummy instance of the class before you kick off any threads. That would initialise the static data, and afterwards it will be absolutely thread-safe.

And in this case, since a HeapBlock will definitely have been used somewhere before you launch your threads, this CANNOT be a threading issue, end of story!

My suspicion is that it probably really IS a dangling pointer deletion, but if it’s not that, then it’ll be some kind of obscure linkage/dynamic library/static data order-of-initialisation thing. But NOT a thread problem!!


#16

Well it should be pretty easy for you to apply my simple change to your juce sources and verify that it makes your problem go away…have you done that?


#17

Well, thanks, but I just updated to a more recent revision of the repository (I was using a February snapshot of the lib), and the LeakDetector no longer reports these fake dangling issues.

Cheers