RunningThreadsList headaches

Consider this

Manager.cpp

struct MyThread : Thread {
  MyThread() : Thread ("TroubleMaker") { }
  void run () { /* do stuff */ }
};
struct Manager {
  Manager () { thread.startThread (); }
  ~Manager () { thread.stopThread (-1); }
  MyThread thread;
};
static Manager manager;

juce_Thread.cpp

...
       static RunningThreadsList runningThreads;
...

On application exit, if runningThreads is destroyed before manager, then we get asserts and potential undefined behavior. Some Thread member function implementations depend on the thread existing in the RunningThreadsList, such as Thread::getCurrentThread().

The only robust solution I can think of, is to change RunningThreadsList into a ReferenceCountedObject, created on first use. When a Thread is added or removed from the list, the reference count on the RunningThreadsList object is incremented or decremented respectively.

Furthermore, to keep this object around for RunningThreadsList::getInstance(), we would replace

       static RunningThreadsList runningThreads;

with

       static ReferenceCountedObjectPtr <RunningThreadsList> runningThreads;

This solves the problem. If there are still threads around when the runningThreads destructor is called (which now just decrements the reference count), then the RunningThreadsList object will survive. Logically, instead of destroying the RunningThreadsList when the static variable at function scope has its destructor called, we will destroy the RunningThreadsList only when all references from running threads are gone AND a possible reference from the ReferenceCountedObjectPtr static variable at function scope gets destroyed.

There are a few other places in Juce where this issue exists. The Juce leak checker is another example. So is DeletedAtShutdown. Now I admit, I’m a pretty hardcore user so the average Juce weenie is unlikely to encounter this issue. Still, we should break this habit of implementing singletons using objects at function scope with static storage duration, it greatly complicates writing library code.

Thanks Vinnie, that’s a good idea, I’ll see what I can do!

what about ?

static Manager & getManager()
{
   static Manager manager;
   return manager;
}

Compiler should insure order of destruction as the opposition of the initialization, and the object will only be initialized when calling the getManager() function, so this happens after the Juce’s stuff initialization.

Can you be more explicit ?

The order of destruction for objects with static storage duration, in different compilation units, is undefined - even in the latest C++ specifications.

I don’t know how to be any more clear: NO MORE STATIC VARIABLES INSIDE FUNCTIONS to implement singletons!!

[size=150]JULES!!![/size]

I picked up the changes. Brilliant thinking, to eliminate RunningThreadsList entirely and replace it with ThreadLocalValue (wish I had thought of that). But the underlying problem was not fixed:

static ThreadLocalValue<Thread*>& getCurrentThreadHolder()
{
    static ThreadLocalValue<Thread*> tls;
    return tls;

All we did was change the type of the static variable, not remove it!

We must not use objects with static storage duration !!!

Thanks Vinnie.

I agree that statics are generally to be avoided, but in this case, a thread-local object must be static for it to work at all (and on MSVC and clang it’ll actually be using a compiler-generated thread local now, so isn’t really a static anyway).

But in your original example, I think the idea of having a file-local thread object’s a really bad idea to start with. Ok, you’d get away with it if you make sure the thread is definitely stopped before shutdown begins, but threads are not the kind of object that you should allow to be destroyed in whatever random order the compiler happens to choose. At the very least you should be using a local static, not a file static, so that the runtime can hopefully delete your objects in more or less the reverse order to their creation order.

You are right.
The C++ spec doesn’t impose any implementation for the linker, and the dynamic loading of code, BUT most system do.

On Win32, the linker emits a special section for “static-in-function” object, with an hidden boolean flag telling if it was called.
When leaving the application, this section is parsed, and the static object are freed if constructed.

So, static-in-function is always cleaned before static object. I’m 95% sure it’s the same on Linux.

I only agree with this partly.

I have implemented a “SharedSingleton” object. It is a thread safe, reference counted object that is dynamically allocated. Using this object with proper reference counting (in a RAII pointer holder), it is possible to build up a network of objects that are “file local”. The order of destruction is very well defined as the references come off the objects at exit.

I absolutely need a file-local thread, and I’m sure you will agree - this thread is responsible for calling ‘operator delete’ on objects that get shipped to it. Very helpful for the audio thread and other time critical threads. Its library type code and not user code.

I have a proposed solution incoming.

[quote=“X-Ryl669”]The C++ spec doesn’t impose any implementation for the linker, and the dynamic loading of code, BUT most system do.

On Win32, the linker emits a special section for “static-in-function” object, with an hidden boolean flag telling if it was called.
When leaving the application, this section is parsed, and the static object are freed if constructed.

So, static-in-function is always cleaned before static object. I’m 95% sure it’s the same on Linux.[/quote]

No, most system don’t. In fact, NONE do. Yes, the order of destruction is well defined within a translation unit but across translation units it is UNDEFINED!!!

Here is my proposed solution:

https://github.com/vinniefalco/Juce/tree/threads

I made a branch called “threads” and commited my changes to juce_Thread.cpp. This is JUST A FIRST ITERATION and not well tested.

This is the code for those who are too lazy to browse github:

// Copyright 2012 by Vinnie Falco
// This code is released into the public domain under the terms of
// CC0: http://creativecommons.org/publicdomain/zero/1.0/legalcode

template <typename Tag>
class StaticSpinLock
{
public:
  inline void enter () const noexcept
  {
	getLock().enter();
  }

  inline bool tryEnter () const noexcept
  {
	return getLock().tryEnter();
  }

  inline void exit () const noexcept
  {
	getLock().exit();
  }

  typedef GenericScopedLock <StaticSpinLock>   ScopedLockType;

  typedef GenericScopedUnlock <StaticSpinLock> ScopedUnlockType;

private:
  SpinLock& getLock () const noexcept
  {
	return *(reinterpret_cast <SpinLock*> (lock));
  }

  static char lock [sizeof (SpinLock)];
};

// Spec: N2914=09-0104
//
// [3.6.2] Initialization of non-local objects
//
//         Objects with static storage duration (3.7.1) or thread storage
//         duration (3.7.2) shall be zero-initialized (8.5) before any
//         other initialization takes place.
//
template <typename Tag>
char StaticSpinLock<Tag>::lock [sizeof (SpinLock)];

//------------------------------------------------------------------------------

class CurrentThreadManager : public ReferenceCountedObject
{
public:
  typedef ReferenceCountedObjectPtr <CurrentThreadManager> Ptr;

  static CurrentThreadManager::Ptr getInstance ()
  {
	static Ptr instance;

	SpinLockType spinLock;

	{
	  SpinLockType::ScopedLockType lock (spinLock);
	 
	  if (instance == nullptr)
	  {
  		instance = new CurrentThreadManager;
	  }
	}

	return instance;
  }

  Thread* getCurrentThread ()
  {
	return *threadLocalValue;
  }

  void setCurrentThread (Thread* thread)
  {
	threadLocalValue = thread;
	incReferenceCount();
  }

  void clearCurrentThread ()
  {
	threadLocalValue.releaseCurrentThreadStorage();
	decReferenceCount();
  }

private:
  typedef StaticSpinLock <CurrentThreadManager> SpinLockType;
  ThreadLocalValue<Thread*> threadLocalValue;
};

[quote=“TheVinn”]
No, most system don’t. In fact, NONE do. Yes, the order of destruction is well defined within a translation unit but across translation units it is UNDEFINED!!![/quote]
It is (I’m using it for a project that’s Win32 only) :
http://msdn.microsoft.com/en-us/library/7977wcck(VS.80).aspx

I don’t know for gcc/ld, but I wonder there is similar trickery, else ElectricFence and VisualLeakDetector would never work.
That being said, I don’t think it’s worth the trouble anyway, I still don’t get the issue with static-in-function.
If you look at gcc’s output for such code, you’ll see that gcc is inserting a mutex acquisition before calling the static object constructor (so it’s safe calling in multiple thread).
So the code you’ve posted is already done by GCC and obviously faster / safer.

Thread safety is not the problem here! It is order of destruction for objects with static storage duration in different translation units…

Vincent, I understand what you mean.
However, the only case the order of destruction is important is when there is a link between objects.
In general, expecting the system to work out your logic seems kind of playing russian roulette to me.

Usually, if you want a specific order of construction, you’ll use getStaticObj(), there is nothing that prevent you to do the same for destruction:

Obj *& __getObj()
{
    static Obj * ptr = new Obj;
    return ptr;
}
Obj & getObj() { return *__getObj(); }

// Call this when you need too, depending on your logic.
void destructObj() { deleteAndZero(__getObj()); }

Only partially.

Yes, exactly! Any concurrent system that has reference counted objects, and makes use of objects with static storage duration will by definition have objects inter-dependent on each other which are exposed to order of destruction issues. Jules claims that its a bad idea to have such objects but I must disagree. Here are concrete examples:

  • A garbage collector provided by a concurrency library
  • Global thread which deletes objects for you
  • Facility for performing actions on program exit
  • Leak checker

Correct. That’s why we cannot rely on the “system” to call destructors in the proper order, since the order of destruction of objects with static storage duration in different translation units is undefined (I keep repeating this). Instead, we build an implicit directed acyclic graph at runtime where reference counted objects are the vertices, and the reference counts are the edges. At program exit, we remove the one reference with static storage duration, and everything unwinds flawlessly.

Explicit destruction is not possible in a reference counted concurrent system - there is no way to know who is holding the last reference (nor would we want to). This is the part you aren’t getting.

Jules I can confirm that my proposed changes solve the problem - I tested it both with native thread local values (which masks the problem) and with JUCE_NO_COMPILER_THREAD_LOCAL == 1. What do you think of it?

Except that ref count leads to cycles, and you can enter an infinite loop too. I prefer a determinist system to a “pseudo automatic” system.
In general ref count is a good solution to a badly proposed problem. If you simplify your problem, the need for ref count might disappear.

In the examples you wrote, the issues are not linked to Juce (or I’m missing something).
There are simple solutions for action at program exit (DeletedAtShutdown, Leak checking, etc…) that doesn’t require adding ref count to the whole library.

Most GC don’t use ref count also because it’s slowing everything and risky. Off topic, but I still think that GC cause more trouble than they solve.

[quote=“X-Ryl669”]ref count leads to cycles, and you can enter an infinite loop too. I prefer a determinist system to a “pseudo automatic” system.
In general ref count is a good solution to a badly proposed problem. If you simplify your problem, the need for ref count might disappear.[/quote]

I disagree emphatically. If you build a concurrent system using a thread queue, reference counted objects are a necessity for solving all but the most trivial problems. Thread queues are superior for a broad class of problems since they allow for strong invariants. Cycles and infinite loops are caused by bad design, they are not a problem inherent to reference counting. The solution is not to avoid reference counting - just use a good design!

It’s worth saying this: concurrent programming is hard. There are no easy answers or short cuts. “Simplify your problem” is not an option here, a concurrent system has a minimum level of complexity, which is quite a lot higher than a typical application. Anyone who has implemented a multi-threaded host can attest to this.

If you use a juce::Thread to implement my examples, then you are exposed to the problem in the original post. Using juce::Thread is very convenient, it works on all platforms and is a natural choice if you are already using juce. Unfortunately, due to the problem in the original post, I have to use boost threads for those examples. I would like to get rid of my dependence on boost.

I’m not talking about “adding ref count to the whole library” (I assume you mean Juce). I’m talking about code that uses Juce threads which makes use of reference counted objects. There are no simple solutions for “action at program exit”, because the order of destruction of objects with static storage duration is undefined (notice how I keep repeating this).

I never implied a GC implementation that uses reference counting on the objects it manages. I am talking about a GC that is itself reference counted (note the difference).

Any garbage collector has to live longer than the objects that use it (obviously). If the garbage collector is an object with static storage duration, and one of the objects that use it is also an object with static storage duration in a different translation unit…drumroll please…

the order of destruction of these objects is undefined!

Adding a reference count to the garbage collector and dynamically allocating it instead of using an object with static storage duration allows it to live as long as someone is referencing it.

But if this garbage collector uses a juce::Thread (in the current implementation), it is exposed to the order of destruction problem.

Can I possibly make this any more clear?

Vinnie, I understand your point, but I really feel it’s just fundamentally unsafe for your threads to still be running at a time when the C++ runtime has already started tearing down the program’s static objects!

I can see that your suggestion would probably work, but I’m really not sure whether it’s something that I should do. It feels like I’d be creating a workaround for a situation that shouldn’t exist in the first place!

Then you are saying this construct is unsafe / invalid:

class MyService : private Thread
{
public:
	void action (); // does something cool
	static MyService& getInstance ()  {
		static MyService instance;
		return instance;
	}
private:
	MyService () { startThread(); }
	~MyService () { stopThread (-1); }
	void run ();
};

Juce itself makes use of the construct in the code snippet above (with ThreadLocalValue instead of a Thread object for example) in many places. If you’re saying that the code is not valid, but yet Juce itself contains the idiom, it implies that one cannot build a library on top of Juce, using similar implementations.

Juce is library that offers the Thread facility, it makes sense that the usage of the Thread object should not be restricted. I agree there are fundamentally unsafe uses of thread objects with static storage duration, just like there are unsafe uses of thread objects in general. But let’s not throw the baby out with the bath water, this is legitimate use-case.

Besides, I did almost all the work! This really should be a no-brainer…

Imagine how inconvenient your life would be if you had to remove all the static variables from Juce functions, and then provide this routine:

extern void shutdownJuce ();

Inside shutdownJuce() you manually clean up these objects that used to be decentralized as statics inside functions, and then require anyone who uses the Juce library to put a strategic call to shutdownJuce() in the right place in their code. This is the logical outcome of banning certain Juce objects with static storage duration.

X-Ryl669 even pointed this out (in a different context):

[quote=“X-Ryl669”]if you want a specific order of construction…there is nothing that prevent you to do the same for destruction:

// Call this when you need too, depending on your logic. void destructObj() { deleteAndZero(__getObj()); } [/quote]

Let me point out the boost::thread has no problem with this usage.