Strange OpenGL makeContextActive woes

Well, it doesn’t happen in theory, but it happened to me today in practice.
The fact is that I’m using an OpenGLComponent in a thread. The main thread perform a makeCurrentContextActive before entering its main loop. In the main loop, it only does “if (!isActiveContext()) makeCurrentContextActive()”.
Usually it works well. (see below for reason why I’m not doing makeCurrent…tive each thread loop)

In the message thread, I’ve started this thread. However, I’ve killed it before it entered the thread loop.
The thread never exits, because makeCurrentContextActive tries to get the message thread to perform something in its “callOnMessageThread” call, which is stalled because the message thread is waiting for the thread to exit!.

The message thread is then forced to kill the OpenGL thread.

So, even if I moved the make…active in the thread loop, it would still deadlock.
I think I’m missing something about how to use the make…active method in a thread.

Can you enlighten us ?

While thinking about this, I understand why it wasn’t happening if the make…tive is moved in the loop. I think it should happen anyway, but the context is cached in make…tive call, so to test this behaviour, the context should be deleted (which happen very sparsely), and the thread should be scheduled before it’s done creating.

So here’s a callstack to understand the bug:

OGL Thread:

juce::MessageManager::callFunctionOnMessageThread(void * (void *)* 0x0047c290 juce::Win32ComponentPeer::createWindowCallback(void *), void * 0x0a65b3e0) line 208 + 28 bytes
juce::callFunctionIfNotLocked(void * (void *)* 0x0047c290 juce::Win32ComponentPeer::createWindowCallback(void *), void * 0x0a65b3e0) line 365
juce::Win32ComponentPeer::Win32ComponentPeer(juce::Component * const 0x0a6c46a8, const int 0) line 384 + 14 bytes
juce::WindowedGLContext::createNativeWindow() line 388 + 46 bytes
juce::WindowedGLContext::WindowedGLContext(juce::Component * const 0x0a6c46a8, HGLRC__ * 0x00000000, const juce::OpenGLPixelFormat & {...}) line 94
juce::OpenGLComponent::createContext() line 505 + 98 bytes
juce::OpenGLComponent::makeCurrentContextActive() line 262 + 8 bytes

Message Thread:
juce::Thread::stopThread(const int 100) line 189 + 153 bytes
juce::Thread::~Thread() line 106
juce::VideoComponent::PresentationThread::~PresentationThread()

To reproduce it, simply create a thread with a OpenGL component, and write this:

void run ()
{
    openGL->makeCurrentContextActive();
}

// In message thread
// void buttonClicked:
{
     ScopedPointer<OGLThread> oglThread = new OGLThread(initializedOpenGLComponent);
     oglThread->startThread();
     oglThread = 0; // This deadlock, because makeCurrentContextActive is waiting for the message to receive a message, and this can't happen until we exit the method.
}

Tricky. The context needs to create a window, and for stupid win32 reasons, that must be done on the message thread, so there’s no way to avoid the callback.

Of course, it’ll only need to do that once, the first time you call makeCurrentContextActive(). So why not just call makeCurrentContextActive() on your message thread, before you start the thread?

Ok, I can partially solve this if I makeContextActive in the message thread (so in my thread constructor), and then, again in the thread.
However this could still deadlock when the context is deleted and need recreation.
Except for obvious OpenGL thread’s method, this also happen when componentPeerChanged is called.

Do you have more information about when such callback can be called ?
I know this is very unlikely that the componentPeerChanged is called, then the OGL thread enter the createContext again, and get scheduled (because of the sendmessage) and your message thread then again tries to kill the thread => deadlock.
This happened to me.

What if you add an event in the OGL component context creation code, then use postmessage instead of sendmessage, and wait for the event to be set (or thread exit event) before returning ?

The peer change will only happen if you remove the component from its parent window and put it in a child of a different window - so it’s pretty unlikely that you’d do that.

Ok, it’s clear. Thanks.
Maybe a small note in the OGL component documentation should state: Don’t change your component hierarchy if you’re using the context in a thread different from the message thread (or a jassert somewhere checking this, it should be easy, isContextActive will return false in the message thread when OGL used in a thread)

Yes, I should put that in somewhere. Annoying that I can’t think of a better workaround for it…

Well. I’m fighting with something crazy, again related to makeCurrentContextActive.
This time, I’m trying to use an OGL component in a dialog window in a single thread application.
The issue here happens in my OGL component init() method that needs to find out the max supported texture size.
In order to do so, it calls makeCurrentContextActive (without this, it fails all OpenGL’s calls). However, makeCurrentContextActive doesn’t succeed, because the component isn’t showing.
As I don’t master when the component is showing (I’m using it as the “contentComponent” of a DialogWindow::showModalDialog).
So I’ve put the init() method in visibilityChanged() callback, but even here the parent (DialogWindow) isn’t visible so makeCCA fails again.

I think putting the init() in paint() is a very ugly hack, what do I miss for a more elegant way ?

up

I guess you could watch both visibilityChanged and parentHierarchyChanged.

But I’d disagree that’s it’s a bad thing to initialise it in paint(). To initialise something at the last moment before you need it is just lazy instantiation, rather than a hack.

visibilityChanged happens while the parent (in my case, it’s the dialog window) is not visible.
I’d have sweared that parentHierarchyChanged didn’t work (I’ve tried it), but now it’s working when testing “isShowing()” (instead of “isVisible()” ?).

I think visibilityChanged is a bit misleading here as the component’s “visible” value might have changed, but the component isn’t visible anyway since the parent is not.
It’s not intuitive that the real “component becomes visible” callback is parentHierarchyChanged.

Is there a reason why addAndMakeVisble() does makeVisibleAndAdd() in reality ?

Concerning initialization in paint, it just doesn’t fit the “general” juce library paradigm, where initialisation usually happens in constructor. I agree lazy initialization is very convenient when it’s takes ages to initialize, but in my case, it clutters the paint method with something unrelated to painting so I prefer moving it elsewhere.

Also, and I think it’s probably important, since the OGL component isn’t visible (as OS point of view I mean) either when parentHierarchyChanged() is called, (but isShowing() returns true), and makeCurrentContextActive succeed, I wonder if the test “isShowing” in makeCurrentContextActive could be replaced by “isVisible && parentHasPeer”, so it would have worked at first in visibilityChanged ?
Ideally, we could initialize a OGL component in the constructor like any other component (and hide the Win32 hackery ?)

It just avoids making the parent immediately react to a change in its child visibility every time a child is added.

I think that could cause problems if you’re trying to hide an opengl comp - it needs to avoid creating the context if it shouldn’t actually be on-screen. In this case it might work, but I think it could mess up other situations.

Ok, I understand, but without speaking about implementation, I still think visibilityChanged should happen when the actual visibility changed and not the boolean flag (which could be a value BTW to which you can attach a listener).
It’s not intuitive that this callback is triggered, and the component claims it’s visible, but it’s not. Maybe Component could add a param to addChild(…, bool makeVisible = false) method, and remove the first setVisible ?

To actually hide it, either it had to be visible before (and the context is created, so MCCA returns the cached instance), either it wasn’t visible and the test “isVisible && …” would still prevent it. I’m probably misunderstanding your point (as I can’t figure out your example).

To send a visibility change message recursively to all subcomps when a component changes could be quite expensive, which is why I didn’t do it that way. True, though, I’ve often considered changing the way that works, as it’d probably be more useful to do it like that.

The example I was thinking of was that if you have a visible gl comp which is the child of an (intentionally) invisible parent, then your change would create a context for it, incorrectly.

That’s exactly what I’m doing in parentHierachyChanged! The parent is “visible” but not displayed yet, and MCCA works in that case.
If you have a context, it doesn’t mean your OGL component is actually displaying anything, right ?
Because currently, I’m creating a context to query the texture size, not to display anything (and nothing OGL-related is displayed either).

I think creating the context might actually involve creating a window on some platforms… can’t remember exactly.