Dangling pointers in latest 1.53.59

In latest 1.53.59 tip, I am getting a “*** Dangling pointer deletion! Class: HeapBlock” assertion failure that did not occur with version 1.53.15 that I previously used.

This happens when a VSTPluginInstance is deleted. After the destructor ~VSTPluginInstance() comes to its end, the compiler deletes its members, one of them being “VSTMidiEventList midiEventsToSend”, whose destructor looks like this:

	~VSTMidiEventList()
	{
		freeEvents();
	}

The call to freeEvents() seems to clash in some way with the subsequent automatic deletion of the member “events”, which is exactly where the “dangling pointer” warning is issued: It happens in the destructor of HeapBlock. Looks like a duplicate deletion to me.

HeapBlock <VstEvents> events;

Why is this?

I should note this also happens when my plugin is scanned (Reaper OSX), but never actually played. It also happens when I instantiate and delete VSTPluginInstances programmatically inside my plugin.

The bug does not appear with the JuceDemoPlugin VST (when it is scanned or instantiated/deleted), so this is probably my own fault and I seem to have missed something really important that, by accident, simply did not show up in the previous Juce version(?)

Any idea where I should start looking?

Thanks in advance,
Andre

That class looks absolutely fine to me, I can’t see any bugs in it… Maybe the plugin itself is being deleted twice or something?

No. That was my first thought too, so I checked with breakpoints. I am not using HeapBlock in my plugin at all, so it must have been deleted twice somewhere in the library.

This didn’t happen with the previous Juce version I used (see above). It occured first after updating to 1.53.59, without making any changes to my plugin (as far as I remember). To make this absolute sure, I will try and build against the previous library again and see.

EDIT: Can confirm that it did not happen with 1.53.15.

Update: I put a beakpoint in ~HeapBlock() destructor and noted down all addresses of “this” that ocurred up to the “dangling” warning. I expected at least one address to show up twice, but that wasn’t the case. All the destructed HeapBlocks are unique. How would it be possible then to get below 0 with the instance counter?

Am I missing something?

It sounds to me like you might have some kind of cross-linking problem, e.g. multiple copies of the static counter used by HeapBlock, and be deleting the Heapblock from a different module than the one that created it. Are you using a statically linked juce library?

Unlikely, because I don’t use any HeapBlocks at all. Anyway, I will have a look if there is possibly a more hidden side effect. I am using the amalgamated build (no static lib).

Interestingly enough, when I compile with the old juce_LeakedObjectDetector.h in place (took it from 1.53.15), the error does no more show up! This may point to a bug in this template. You only changed the way the class name is reported (getLeakedObjectClassName()) plus the JUCE_LEAK_DETECTOR macro. One of these seems to cause heavy side effects(?)

Weird.

EDIT: Ah, and the new LeakCounter() constructor does have a throw() in its declaration, while the old one doesn’t.

No, there’s nothing wrong with the leak counter. You could try adding a few DBG statements to it to show when objects are added or removed - I do think this sounds more like a link error than anything else.

Hmm. The implication then would be that the previous juce_LeakedObjectDetector.h was broken or incomplete, because it did not show the warnings. Is that realistic?

I just wonder, because the plugin works fine when I use the old juce_LeakedObjectDetector.h (without changing anything else).

Is the leak counter thread safe? I mean, what if an instance is deleted and created at the same time?

Yes, it’s all done with atomics. I very much doubt whether there’s a bug in it, it’s a very simple class and would be pretty obvious if it wasn’t working. This to me sounds more like some kind of link error, i.e. where the statics are getting muddled up somehow. Have you tried to reproduce it, e.g. with the demo plugin?

I’m afraid it turned out that LeakedObjectDetector is not thread-safe. Here’s my fixed juce_LeakedObjectDetector.h. I don’t know what the cost of the added lock might be performance-wise, but it definitely does not work safely without it. With this change, I do no longer get the warnings. :smiley:

First I added this to LeakedObjectDetector:

    static CriticalSection& getLock() throw()
    {
        static CriticalSection lock;
        return lock;
    } 

Then made the following functions use the lock:

    LeakedObjectDetector ();
    LeakedObjectDetector (const LeakedObjectDetector&);
   ~LeakedObjectDetector();
   ~LeakCounter();

Not sure if I did it right, but I needed to include two forward declarations, too:

class CriticalSection;
class ScopedLock;

I have atached the changed file. (EDIT: to no avail. Neither .h nor .txt are allowed as attachments, sorry)

Andre

No, that’s just not correct! LeakedObjectDetector is definitely thread-safe, because its counter is an Atomic. If it’s not working, then the problem must lie with the atomic op, not with the leak detector class! What platform are you using?

The counter itself may be safe. But what about the getCounter() function, or the nested destruction of the LeakCounter member inside LeakedObjectDetector?

If the lock resolved the problem, then the lack of a lock was the problem. If it was a link issue, the lock would not have changed anything.

Mac Pro 8 core, OS X 10.5.8.

Flawed logic alert! Just because adding a lock makes your problem appear to go away, it doesn’t mean that the problem had anything to do with locking! Seriously, I’m pretty damn sure there’s nothing wrong with the LeakedObjectDetector class’s thread-safety. And even if it this was a race condition, the chances of you seeing a reproducible bug like that would be miniscule.

Like I said, it sounds like you may have managed to link multiple instances of that static counter variable into your app, in such a way that one instance is being used when incrementing, and another one when decrementing. Are you sure that ALL of your cpp files have the same flags set when they include juce.h ? If not, different modules may contain subtly different versions of the classes.

I am confident of this as well, and I have hammered the crap out of LeakedObjectDetector in my benchmark application.

Thanks guys for your input.

I must admit that a potential link issue with duplicate instances of the counter is a bit above my head. How would that be possible? All my source files are part of the same XCode project (originally created with the Jucer) and include the same “JuceHeader.h”. Apart from that, I do not change any Juce-related preprocessor macros in any files. There is only one target in the project. If I am missing something, I would much appreciate a hint where to look for.

For another test, I just downloaded the latest tip with Introjucer, cleaned everything, rebuilt the Juce project, rebuilt my plugin and the old warnings reappeared.

He he. Usually I would agree with you, however if there were duplicate counters, these would not stay balanced over time. Each time my plugin instance is deleted, there are exactly two warnings appearing in this order:

*** Dangling pointer deletion! Class: HeapBlock *** Leaked objects detected: 1 instance(s) of class HeapBlock

That is, the total number of HeapBlock instances is fine. With more instances created and deleted over time, the total counter remains balanced. When the host quits, the number of “leaked” instances matches the number of “dangling pointer deletions”.

I suspect it’s the sequence of destruction and construction that overlaps for some two objects (not necessarily my plugin and not necessarily the same classes) that involve HeapBlocks somewhere in their underpinnings. I think it is by pure accident that my plugin shows this, obviously rare, circumstance.

Possibly a new leak detector is in the middle of the process of being contructed (counter not yet incremented), while on another thread its deletion has already begun and happens to win the race, i.e. takes the counter temporarily below zero, because the construction takes longer. This is what the lock seems to cure.

Although my plugin is both constructed and destructed by the host, supposedly on the same thread.

Man. It’s been a while since I last scratched my head this often … :?:

Well, with tightly synchronised threads, for example when a semaphore fires on destruction of another object, race conditions can be very reproduceable. Construction and destruction can be lengthy procedures with a lot of inheritance and nested members involved (the leak detector only being one of them). There is plenty of time for conflicts.

I hope I can get this fixed, as I seem to distrust my lock solution the longer I think about it.

Update:

After experimenting a while, I agree that my lock solution may be useless. Other changes in the library also made the dangling warnings go away, possibly by pure chance, so this does not prove anthing. Flawed logic alert was correct :wink:

Still, it does prove that there is a race condition, because a few CPU cycles added or removed here and there make a difference!

My hypothesis is that some objects that involve HeapBlocks get deleted on a different thread than the one they were created on (Strings?), hence leading to a race condition (i.e. destruction starts /before/ construction is complete). This is the only explanation I can think of. If this proves to be true, it points to a very serious problem.

To narrow this down, I wrote a template class ThreadOwnershipTracker, that can be included with a macro just like the leak detector, and that reports when an object gets destructed on a foreign thread. Very helpful. It works fine for higher-level classes (tested with some of mine), but unfortunately I am having problems adding this to a basic class like HeapBlock (breaks things in the library).

I will report back, if I find out more.

Andre

I think you’re wasting your time in thinking it’s a race condition. Those leak detectors are thread-safe, end of story.

Personally, what I’d try would be to add some debug around the leak detector counters - and as well as printing the count value, print out the address of the counter variable. My guess is that you’ll see the HeapBlock’s counter appearing with more than one address, indicating that there are multiple versions of the leak detector in memory, with multiple counters.

Thanks so much Jules for your great support.

I thought it was a good idead to focus on HeapBlock, as this class is reporting the warnings. So I put this in the destructor of LeakedObjectCounter:

if (strcmp (getLeakedObjectClassName(), "HeapBlock") == 0) fprintf(stdout,"L %x\n", &getCounter());
and then got this console output:

L 12a46a08 L 12a485d0 L 12a47654 L 12a47654 L 12a47778 L 12a47778 L 12a47778 L 12a485e0 L 12a47778 L 18b635a0 L 18b635b0 ...

I checked my binary with the nm tool and found countless symbols associated with HeapBlock and LeakCounter. At least there seem to be multiple versions of the same symbol. I am not familiar with this output, so I don’t really have an idea what it means:

004b05f0 S __ZGVZN4juce20LeakedObjectDetectorINS_9HeapBlockI9VstEventsEEE10getCounterEvE7counter
004aeff0 S __ZGVZN4juce20LeakedObjectDetectorINS_9HeapBlockINS_9VstEventsEEEE10getCounterEvE7counter
00373392 t __ZN4juce20LeakedObjectDetectorINS_9HeapBlockI9VstEventsEEE10getCounterEv
00194644 t __ZN4juce20LeakedObjectDetectorINS_9HeapBlockINS_9VstEventsEEEE10getCounterEv
004b0620 S __ZZN4juce20LeakedObjectDetectorINS_9HeapBlockI9VstEventsEEE10getCounterEvE7counter
004af684 S __ZZN4juce20LeakedObjectDetectorINS_9HeapBlockINS_9VstEventsEEEE10getCounterEvE7counter

I understand there are different counters for HeapBlock depending on the class scope, although the leak detector warning reports a plain class name only. Confused.

Now, what can I do about this? What did I do wrong with my project?

The name of the leak detector member inserted into your class has the line number in the source file appended to it (via the ## preprocessor directive and the LINE predefine).

Try adding this to LeakedObjectDetector, it’s a shot in the dark but who knows?

    LeakedObjectDetector() throw()
    {
      if (++(getCounter().numObjects) == 0)
      {
            // If this goes off it means instruction re-ordering could have messed around with the counter
            DBG ("*** Instruction reordering! Class: " << getLeakedObjectClassName());
            jassertfalse;
      }
    }
    LeakedObjectDetector (const LeakedObjectDetector&) throw()
    {
      if (++(getCounter().numObjects) == 0)
      {
            // If this goes off it means instruction re-ordering could have messed around with the counter
            DBG ("*** Instruction reordering! Class: " << getLeakedObjectClassName());
            jassertfalse;
      }
    }

But let me re-iterate…I have hammered LeakedObjectDetector quite hard in my benchmark app. It creates 32 threads on a 12-core system and creates and destroys tons of objects using a lock-free allocator. The thing ran overnight and I didn’t hear a peep out of LeakedObjectDetector.

It is hard for me to imagine a bug in it.