Mac, OpenGL? and EXC_BAD_ACCESS


#1

Hi,

I am new to Juce, and extremely keen to use it as the GUI for my S2PLOT library (http://astronomy.swin.edu.au/s2plot).

However I have run into a problem that is confounding me. I am receiving an EXC_BAD_ACCESS crash on entry to one of my functions that is called from the OpenGL render thread. Basically it appears that whatever the first executable line of my function is, the crash is reported there. But something much more intricate must be going on.

Thing is, I have ported and copied my code into a native iOS app (without Juce) without this error. I’ve pretty much done the same copy and insert with Juce, and I get this error. I believe I have everything set up (incl. the OpenGLContext following from JuceDemo).

I thought perhaps it was a C/C++ problem, but my code compiles fine inserted into Objective-C files (as it had to, to work on iOS natively) and I have even verified that I can build S2PLOT natively (against GLUT) with g++ and clang++.

I’m at a bit of a loss. valgrind shows nothing of consequence except a few jumps that depend on uninitialised variables deep in the Juce code - typically these are ignorable. NSZombieEnabled finds nothing. Instruments finds nothing. Yet the program keeps crashing. Well, it crashes around 80% of the time. By commenting out slews of my code I can get it to reliably not crash, but I am not convinced the issue is with my code: as I say, it works on iOS, Mac GLUT, Linux GLUT, … it is only in the Juce framework (or build environment?) that I get this failure.

I can’t post code to show this. It would be a massive post and S2PLOT is not open source yet but will be later this year.

So maybe someone else has experience tracking down such errors and can provide some general advice? Or has someone else seen EXC_BAD_ACCESS errors in Juce as well, especially as related to OpenGL?

thanks - David.


#2

When you say it’s one of your “functions”, do you mean “methods”… If so the first thing I’d suspect would be that you’re calling a method on a dangling or null pointer?

The second thing I’d investigate would be that you may have corrupted the stack just before calling the function, so that whatever happens inside it goes bang…


#3

I do mean functions not methods. S2PLOT is a C library, so it is a bunch of global functions with global variables (yes, I know that’s a bit old fashioned!).

The repaint of the OpenGL context results in my (global) “HandleDisplay” function being called, which does a few things, then calls my (global) drawView function, which ultimately calls my (global) “MakeGeometry” function. And it is this last one which crashes close to entry. All these functions operate on global variables which have all been initialised. I have tried with the functions literally included inline in the class implementation file for my Juce Component (and hence compiled with clang++ or g++ [I’ve tried both]) and a few hours ago I changed it so that all S2PLOT code is compiled as C-code with functions defined as extern “C” where necessary. Still I get the EXC_BAD_ACCESS in the same place, some ~60% of the time I run the application.
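A minimal sketch of the arrangement described above: C functions with global state, declared extern "C" and called from a C++ class. Only the names HandleDisplay, drawView and MakeGeometry come from this post; the signatures, the global g_facetCount and the Renderer struct are hypothetical stand-ins, and everything sits in one file here only so that it compiles on its own.

```cpp
#include <cassert>

// Declarations as the C++ side would see them. In the real project the
// definitions would live in a separate .c file compiled as C.
extern "C" {
    int HandleDisplay(void);
    int drawView(void);
    int MakeGeometry(int nfacets);
}

// Global state, as in a plain C library (hypothetical variable).
static int g_facetCount = 0;

// Definitions, kept here so the sketch is self-contained.
extern "C" int MakeGeometry(int nfacets) {
    assert(nfacets >= 0);
    g_facetCount = nfacets;   // operates on the global, as described
    return g_facetCount;
}

extern "C" int drawView(void) {
    return MakeGeometry(12);
}

extern "C" int HandleDisplay(void) {
    return drawView();
}

// Hypothetical stand-in for the component's render callback.
struct Renderer {
    int render() { return HandleDisplay(); }
};
```

Calling Renderer::render() walks the same chain as the repaint does (HandleDisplay, then drawView, then MakeGeometry). Nothing in this calling pattern is unusual for C++, which is part of why the crash is so puzzling.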

It’s most confusing as I am seriously changing around the ordering of code and the location of initialisation etc., so I would have expected a different function to tickle the problem. Yet it persists in the one function MakeGeometry. Very frustrating.

I thought I might try doing a Linux build and see what happens. Perhaps it is some obscure issue with C++ and the way it handles globals and their initialisation, and global C functions versus methods, but I still think I would have hit this issue already without Juce by building S2PLOT with g++, clang++, etc…

thanks for the suggestions, I’ll see what I can figure out, but if you have any other thoughts, would be good to hear!

- David.

#4

So, further developments:

  1. I built on Linux. This highlighted a couple of issues now resolved, and I have S2PLOT render reliably working in a Juce component.

  2. With these minor issues fixed, the build on Mac, when it actually doesn’t crash (say, 20% of the time) renders as it should. But still 4 times out of 5, it crashes on entry to one of my functions.

I am wondering if we are hitting a threading and OpenGL issue on Mac OS X. My cursory understanding from reading the doxygen docs for the Juce OpenGLContext is that the OpenGLRenderer runs in its own thread. This is potentially bad on Mac prior to Lion: only Lion and Mountain Lion support OpenGL anywhere other than the main thread. I do have a Lion laptop but before I try that out, any further ideas / comments on this possibility?

thanks - David.


#5

That sounds extremely unlikely to me… Citation needed!


#6

Link: https://developer.apple.com/library/mac/ipad/#documentation/graphicsimaging/conceptual/OpenGL-MacProgGuide/opengl_threading/opengl_threading.html

Down about two pages to “OpenGL restricts each context to a single thread”

This is not precisely what I had in mind, but in my experience other GL loop controllers, e.g. glutMainLoop, can only run on the main thread on OS X prior to Lion, and at WWDC 2011 Apple announced support for OpenGL on threads other than the NSApplication main thread (in Lion).

Maybe this is a red herring. The juce demo opengl stuff works.

What about the stack size for the sub threads though? Somewhere yesterday I found a reference implying that NS subthreads got only 512kB of stack each. Maybe I need to figure out how to increase this rather than just the total stack…?
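A rough way to test the stack-size idea outside Juce, using plain pthreads: run a function with a large local array on a secondary thread whose stack size is set explicitly. The worker, the 1 MiB frame and the helper names are hypothetical; this does not show how (or whether) the stack size of the Juce render thread can be changed.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

static const std::size_t kFrameBytes = 1024 * 1024;  // 1 MiB of locals
static const std::size_t kPageBytes  = 4096;

// Hypothetical worker with a big automatic array. On a thread whose stack
// is smaller than the frame, this kind of function can fault right at entry.
static void* bigFrameWorker(void* out) {
    assert(out != nullptr);
    volatile char scratch[kFrameBytes];
    std::size_t touched = 0;
    for (std::size_t i = 0; i < kFrameBytes; i += kPageBytes) {
        scratch[i] = 1;          // touch one byte per page
        touched += scratch[i];
    }
    *static_cast<std::size_t*>(out) = touched;
    return nullptr;
}

// Stack size a default-constructed pthread attribute reports on this system.
static std::size_t defaultThreadStackBytes() {
    pthread_attr_t attr;
    std::size_t bytes = 0;
    if (pthread_attr_init(&attr) != 0) return 0;
    pthread_attr_getstacksize(&attr, &bytes);
    pthread_attr_destroy(&attr);
    return bytes;
}

// Runs the worker on a thread with the requested stack size.
// Returns 0 on success, otherwise the pthread error code.
static int runWithStack(std::size_t stackBytes, std::size_t* touched) {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_attr_setstacksize(&attr, stackBytes);
    if (rc == 0) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, bigFrameWorker, touched);
        if (rc == 0) rc = pthread_join(tid, nullptr);
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

If MakeGeometry (or anything it calls) keeps large arrays on the stack, comparing defaultThreadStackBytes() with the size of those arrays would show quickly whether this theory is worth pursuing. Calling runWithStack with a stack smaller than the frame is expected to crash, so that case is best tried only under a debugger.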

David.


#7

Sure, it’s normal for GL contexts to only be single-threaded, but there’s no reason to think that it’d need to be the main thread. And I doubt whether you’re having stack size problems.

Sounds to me like your problem is probably just a normal C++ memory corruption bug, probably nothing to do with GL.


#8

You’ve misunderstood something they were saying. Mac has allowed OpenGL threading for years and years.

At a guess - they probably made multi-threaded OpenGL standard, meaning that the system would automagically use multiple threads to issue the OpenGL commands. It was a switchable option, that wasn’t necessarily recommended.

Making and using OpenGL contexts on threads has been possible as long as I remember. GLUT et al probably avoid it since they also do keyboard input and have no provisions for inter-thread protection.

Bruce


#9

Point taken. But now the status is that:

  • s2plot builds with gcc, g++ and clang++ on Mac OS X and doesn’t crash

  • s2plot builds with gcc and g++ on Linux and doesn’t crash

  • s2plot builds within Juce with g++ on Linux and doesn’t crash

It is only the Mac OS X build of s2plot within Juce (with either g++ or clang++) that has this execution error.

I agree it looks like an overwrite of allocated memory and/or use of uninitialised memory, but none of the debugging tools (Instruments, valgrind, …) on Mac or Linux have identified any culprit apart from some innocuous looking things in valgrind on the Mac OS X s2plot+juce combo which I guess I might look back into.

And I’ll try a build on Lion just to see…

- David.

#10

Ok, so I thought I should at least post a selection of the valgrind errors / messages related to Juce in the hopes it might be useful. The common elements in most of these valgrind jump issues are:

juce::OpenGLContext::NativeContext::NativeContext(juce::Component&, juce::OpenGLPixelFormat const&, juce::OpenGLContext::NativeContext const*)
juce::OpenGLContext::CachedImage::CachedImage(juce::OpenGLContext&, juce::Component&, juce::OpenGLPixelFormat const&, juce::OpenGLContext const*)
juce::OpenGLContext::Attachment::attach() 
juce::OpenGLContext::Attachment::componentVisibilityChanged() 

Perhaps the full traces below are useful?

JUCE v2.0.21
==42558== Conditional jump or move depends on uninitialised value(s)
==42558==    at 0x19A1874A: gldDestroyVertexArray (in /System/Library/Extensions/ATIRadeonX2000GLDriver.bundle/Contents/MacOS/ATIRadeonX2000GLDriver)
==42558==    by 0x19A19BA7: gldDestroyVertexArray (in /System/Library/Extensions/ATIRadeonX2000GLDriver.bundle/Contents/MacOS/ATIRadeonX2000GLDriver)
==42558==    by 0x19A19DEC: gldDestroyVertexArray (in /System/Library/Extensions/ATIRadeonX2000GLDriver.bundle/Contents/MacOS/ATIRadeonX2000GLDriver)
==42558==    by 0x19A45502: gldUpdateDispatch (in /System/Library/Extensions/ATIRadeonX2000GLDriver.bundle/Contents/MacOS/ATIRadeonX2000GLDriver)
==42558==    by 0x19A05617: gldCreateContext (in /System/Library/Extensions/ATIRadeonX2000GLDriver.bundle/Contents/MacOS/ATIRadeonX2000GLDriver)
==42558==    by 0x1983D835: gliCreateContext (in /System/Library/Frameworks/OpenGL.framework/Versions/A/Resources/GLEngine.bundle/GLEngine)
==42558==    by 0x96A99: CGLCreateContext (in /System/Library/Frameworks/OpenGL.framework/Versions/A/OpenGL)
==42558==    by 0x1345E85: -[NSOpenGLContext initWithFormat:shareContext:] (in /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit)
==42558==    by 0x1005189DB: juce::OpenGLContext::NativeContext::NativeContext(juce::Component&, juce::OpenGLPixelFormat const&, juce::OpenGLContext::NativeContext const*) (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x10052D735: juce::OpenGLContext::CachedImage::CachedImage(juce::OpenGLContext&, juce::Component&, juce::OpenGLPixelFormat const&, juce::OpenGLContext const*) (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x10052EDF4: juce::OpenGLContext::Attachment::attach() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x100516DC3: juce::OpenGLContext::Attachment::componentVisibilityChanged() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)

...

==42558== Conditional jump or move depends on uninitialised value(s)
==42558==    at 0x1984216C: gleUpdateState (in /System/Library/Frameworks/OpenGL.framework/Versions/A/Resources/GLEngine.bundle/GLEngine)
==42558==    by 0x1983EA35: gleInitializeContext (in /System/Library/Frameworks/OpenGL.framework/Versions/A/Resources/GLEngine.bundle/GLEngine)
==42558==    by 0x1983D99F: gliCreateContext (in /System/Library/Frameworks/OpenGL.framework/Versions/A/Resources/GLEngine.bundle/GLEngine)
==42558==    by 0x96A99: CGLCreateContext (in /System/Library/Frameworks/OpenGL.framework/Versions/A/OpenGL)
==42558==    by 0x1345E85: -[NSOpenGLContext initWithFormat:shareContext:] (in /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit)
==42558==    by 0x1005189DB: juce::OpenGLContext::NativeContext::NativeContext(juce::Component&, juce::OpenGLPixelFormat const&, juce::OpenGLContext::NativeContext const*) (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x10052D735: juce::OpenGLContext::CachedImage::CachedImage(juce::OpenGLContext&, juce::Component&, juce::OpenGLPixelFormat const&, juce::OpenGLContext const*) (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x10052EDF4: juce::OpenGLContext::Attachment::attach() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x100516DC3: juce::OpenGLContext::Attachment::componentVisibilityChanged() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x10029B05C: juce::ComponentMovementWatcher::componentVisibilityChanged(juce::Component&) (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x100451ECF: void juce::ListenerList<juce::ComponentListener, juce::Array<juce::ComponentListener*, juce::DummyCriticalSection> >::callChecked<juce::Component::BailOutChecker, juce::Component&>(juce::Component::BailOutChecker const&, void (juce::ComponentListener::*)(juce::Component&), juce::TypeHelpers::ParameterType<juce::Component&>::type) (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x1002A1F54: juce::Component::sendVisibilityChangeMessage() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)

...

==42558== Conditional jump or move depends on uninitialised value(s)
==42558==    at 0x1986489A: glVertexAttribPointerARB_Exec (in /System/Library/Frameworks/OpenGL.framework/Versions/A/Resources/GLEngine.bundle/GLEngine)
==42558==    by 0x6116639: glVertexAttribPointer (in /System/Library/Frameworks/OpenGL.framework/Versions/A/Libraries/libGL.dylib)
==42558==    by 0x100513DC8: juce::OpenGLExtensionFunctions::glVertexAttribPointer(unsigned int, int, unsigned int, unsigned char, int, void const*) (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x100507F35: juce::OpenGLContext::copyTexture(juce::Rectangle<int> const&, juce::Rectangle<int> const&, int, int) (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x10053006F: juce::OpenGLContext::CachedImage::paintComponent() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x100530360: juce::OpenGLContext::CachedImage::renderFrame() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x10052EC85: juce::OpenGLContext::CachedImage::run() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x10052ED74: non-virtual thunk to juce::OpenGLContext::CachedImage::run() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x100094049: juce::Thread::threadEntryPoint() (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x1000940F7: juce::juce_threadEntryPoint(void*) (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x100083D17: threadEntryProc (in /Users/barbie/Desktop/s2juce.app/Contents/MacOS/s2juce)
==42558==    by 0x91AFD5: _pthread_start (in /usr/lib/libSystem.B.dylib)

#11

That looks like a bug deep inside your ATI driver, but is probably benign.

If I was debugging this, I’d be thinking about structure packing differences, perhaps header files being used with different flags set in different compile units? (e.g. in many juce classes, if you have debug enabled, the classes will be laid out in memory differently due to leak detector code, so if you include a header file somewhere with debugging enabled, and the same header somewhere else with debugging disabled, you’ll have two clashing models of the same class, with nasty results).
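To illustrate the kind of clash meant here, a small sketch in which two namespaces stand in for two compile units that saw the same header with different debug settings. The Widget and LeakCounter types are hypothetical and are not the actual juce classes; the point is only that an extra debug-only member changes the size of the class and shifts every field after it.

```cpp
#include <cassert>
#include <cstddef>

// "Compile unit A": header seen with debugging enabled, so the class
// carries an extra leak-tracking member.
namespace debug_unit {
    struct LeakCounter { int live; };
    struct Widget {
        LeakCounter counter;   // debug-only member
        int width;
        int height;
    };
}

// "Compile unit B": same header seen with debugging disabled.
namespace release_unit {
    struct Widget {
        int width;
        int height;
    };
}

static std::size_t widthOffsetDebug()   { return offsetof(debug_unit::Widget, width); }
static std::size_t widthOffsetRelease() { return offsetof(release_unit::Widget, width); }

// True only if both units agree on the size and the field position.
static bool layoutsAgree() {
    return sizeof(debug_unit::Widget) == sizeof(release_unit::Widget)
        && widthOffsetDebug() == widthOffsetRelease();
}
```

Here layoutsAgree() returns false: code built against one layout that is handed an object built with the other would read and write the wrong bytes, which tends to show up as a crash somewhere unrelated.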


#12

Hi Jules,

I agree with your suggestion. However I have split the code to the point that:

  1. all S2PLOT-related code is compiled and built in a single file “s2layer.c” within the Juce project.

  2. only five global functions in s2layer.c are declared and called from the Juce class I am writing.

  3. there are no header files of mine included in the Juce class, just five declarations of functions that take ordinary C arguments (no special structs / defines)

  4. there is no overlap in the preprocessor macros or defines in Juce and those in S2PLOT.

  5. the identical setup (s2layer.c plus five function declarations) works for a Cocoa based application.

So I could keep hacking away at this, but I have seen nothing yet that even hints at what could be going on. It is either a bug in my code (possible), a bug in Juce (unlikely), or some extreme peculiarity of calling global C functions from C++ classes that I have not seen before and face little chance of identifying / solving.

Thanks for all the help and suggestions. I wish I had the skill / temerity to solve this.

- David.

#13

It may be time for you to resort to the “remove more and more bits of code until it stops crashing” approach…