Performance issue with OpenGL in multiple windows


#41

Inquiring minds would love to know your current thought process with regard to the problem, and the issues you’re discovering that make it a not-so-easy fix…


#42

Here’s what I’ve run into debugging this issue for our plugins:

The main issue (I spoke with Jules about this a bit at ADC) is that when you’re a plugin you have to share the main thread with the DAW and all other plugins. JUCE’s OpenGL implementation runs on its own thread, which requires it to acquire a lock for the main thread when painting components.

This seems to be usable if you’re just using it to speed up the painting of your components, but many plugins use OpenGL to do complex, continuous rendering (either using complex vector paths or OpenGL calls). The plugins can’t synchronize these updates between themselves, so every new instance is a new ~60Hz callback trying to acquire a lock for the main thread if it’s using component painting. If it isn’t using it, there still may be some contention by them trying to swap buffers around the same time.
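To put rough numbers on that contention, here is a back-of-the-envelope sketch. It assumes every instance takes the main-thread lock once per frame and that the locks never overlap; the function name and the per-frame hold time are made up for illustration, not measured.

```cpp
#include <cassert>

// Rough fraction of each second that the main thread is held by OpenGL render
// threads, assuming one lock per instance per frame and no overlap between them.
// holdMsPerFrame is a hypothetical average hold time, not a measured value.
double mainThreadLockedFraction (int instances, double refreshHz, double holdMsPerFrame)
{
    assert (instances >= 0 && refreshHz >= 0.0 && holdMsPerFrame >= 0.0);
    return instances * refreshHz * holdMsPerFrame / 1000.0;
}
```

With a hypothetical 2 ms hold at 60 Hz, one instance takes about 12% of the main thread’s time, four take about 48%, and eight about 96%, which leaves very little for the host and everyone else’s UI work.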

As far as I can tell, swapping the buffers on this dedicated OpenGL thread is problematic as well on certain platforms. We had issues even just running an OpenGLRenderer (so no main thread locking / component painting) on Windows until we disabled VSync. Then there’s also the fact that Device Contexts, HWNDs, etc. are thread-affine on Windows: in this situation swap buffers is called on a separate thread from where the DC & HWND were created with no synchronization.

Essentially for us it boiled down to:

  • An OpenGLContext with component painting enabled needs to acquire a main thread lock for every triggerRepaint(), but using setContinuousRepainting(true) will only require a lock when the component has repaint() called.

  • SwapBuffers() on Windows caused similar slowdowns for us even if you used continuous repainting, but we saw improvements in our plugins by disabling VSync

  • Running the OpenGLContext without locking the main thread (e.g. setContinuousRepainting(true) or only using an OpenGLRenderer) is not thread safe on Windows, which gave us some situations where the whole application would stop working

If the host has the ability to run plugins as their own process (e.g. Reaper, Bitwig, etc.) then the issue shouldn’t occur. The performance issues went away for us in the hosts that ran the plugins in their own processes, but since there are many hosts that don’t support that, we had to find a workaround.

This thread also has some insight into the situation (even though it’s not specifically about JUCE):


#43

Thanks for the great research! This would mean any plugin using OpenGL would cause problems, not just JUCE-based ones, am I right? Running Byome (by Unfiltered Audio) together with my OpenGL plugin shows this effect, but Byome was made with JUCE. I’ll try to find a non-JUCE plugin to see if this happens then as well.


#44

Found another OpenGL plugin and this is interesting… the more plugin windows are open, the higher the OpenGL framerate… maybe it’s some kind of compensation by the devs.




#45

The thing is that something has changed at some point, because we didn’t see those reports until around this year.
Maybe it was luck, but it is still strange.
Maybe JUCE’s OpenGL implementation didn’t run on its own thread at that time.


#46

That’s the thing: with JUCE running OpenGL in its own thread, you run into these issues in plugins when you require thread synchronization. Our XT series plugins (LV2s, not made with JUCE) use an OpenGL context on the main thread and they’ve worked out fine by themselves. We actually did run into issues on Windows, though, with our JUCE plugins stalling the main thread if there was also an XT plugin open and the message manager wasn’t locked during the juce::OpenGLContext rendering. When debugging, this was always caused by a problem with SwapBuffers(), and disabling vsync resolved it.

Our workaround basically involved the following steps:

  • Add an argument to juce::OpenGLContext::triggerRepaint() that sets the internal needsUpdate variable. Currently, calling triggerRepaint() acquires the message manager lock even if there is nothing to update (i.e. the CachedComponentImage it uses hasn’t had any regions invalidated, since we’re not calling juce::Component::repaint()), and we had no way to update just the OpenGLRenderer without using continuous repaint.

  • Disable vsync. On Windows we would still get performance issues during SwapBuffers() if vertical sync was enabled. As soon as we turned it off things were much better and we didn’t have the frame dropping or stalling I mentioned before, but of course then your rendering will show tearing :frowning: Note that this is also why I can’t use continuous repainting: it depends on the swap interval for pacing, and without one it just runs as fast as possible, which we didn’t need or want.

This workaround may not apply to everyone, but it did work for us. Specifically, our use case is that our plugins have graph components that draw spectrum analyzers with a juce::OpenGLRenderer, with controls drawn on top via Component painting. I chose to update these spectrum displays on a timer, which calls our modified triggerRepaint(0), while the painted controls only update when they’re interacted with (or changed via automation). This way we don’t have redundancies where the OpenGLContext locks the main thread when all we’re trying to do is update our spectrum display renderer.
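To make the idea concrete, here is a stripped-down model of that change. It is only a sketch: MockGLContext is a hypothetical stand-in, not the real juce::OpenGLContext, and the member names and the counter are invented for illustration. The real JUCE internals render on a separate thread and look quite different.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant behaviour of an OpenGL context.
// This is NOT the JUCE class; it only models when the main-thread lock is taken.
struct MockGLContext
{
    int messageManagerLocks = 0;  // frames that had to take the main-thread lock
    int framesRendered = 0;       // frames where the renderer ran and buffers swapped
    bool needsUpdate = false;     // true when painted components must be re-cached

    // Modified triggerRepaint: the caller says whether component painting is stale.
    void triggerRepaint (bool updateComponents)
    {
        needsUpdate = needsUpdate || updateComponents;
        renderFrame();
    }

    void renderFrame()
    {
        if (needsUpdate)
        {
            // Real code would take the message manager lock here and repaint
            // the invalidated component regions into the cached image.
            ++messageManagerLocks;
            needsUpdate = false;
        }

        // The renderer callback and the buffer swap happen on every frame.
        ++framesRendered;
    }
};

// One simulated second: the spectrum timer fires timerHz times without touching
// components, and controlChanges control edits each request a component repaint.
MockGLContext simulateOneSecond (int timerHz, int controlChanges)
{
    assert (timerHz >= 0 && controlChanges >= 0);

    MockGLContext context;

    for (int i = 0; i < timerHz; ++i)
        context.triggerRepaint (false);

    for (int i = 0; i < controlChanges; ++i)
        context.triggerRepaint (true);

    return context;
}
```

In this model, simulateOneSecond (60, 2) renders 62 frames but only takes the lock twice, whereas an unmodified triggerRepaint() that always locks would have taken it 62 times.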

If your plugins do require lots of component repainting however, then the problem seems to be unavoidable in hosts that don’t support running plugins as processes :confused:


#47

What do you use to view the FPS? I know a common method to measure it (in FRAPS and the like) is to hook into the OS buffer swap call, so it can keep track of how often it’s running over time. Since it writes the number over top of the window, it seems it’s bound to the HWND or device context? If it’s not keeping track of actual FPS “per window”, then you may be seeing the fact that the host process is calling swap buffers 2x, 4x, etc. as often as with just one instance of Ozone.
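Here is a small sketch of the difference I mean. It is purely hypothetical: SwapCounter and WindowId are invented for illustration (WindowId stands in for an HWND), and I don’t know how FRAPS actually does its counting.

```cpp
#include <cassert>
#include <map>

using WindowId = int; // stand-in for an HWND or device context handle

// Hypothetical counter that is notified on every hooked buffer swap.
struct SwapCounter
{
    int processTotal = 0;              // every swap in the process, whatever the window
    std::map<WindowId, int> perWindow; // swaps attributed to each window

    void onSwapBuffers (WindowId window)
    {
        ++processTotal;
        ++perWindow[window];
    }

    // FPS as seen by a tool that only counts swaps per process.
    double processFps (double elapsedSeconds) const
    {
        assert (elapsedSeconds > 0.0);
        return processTotal / elapsedSeconds;
    }

    // FPS as seen by a tool that tracks each window separately.
    double windowFps (WindowId window, double elapsedSeconds) const
    {
        assert (elapsedSeconds > 0.0);
        auto it = perWindow.find (window);
        return it == perWindow.end() ? 0.0 : it->second / elapsedSeconds;
    }
};
```

If three plugin windows each swap 60 times in one second, the per-process figure comes out at 180 while each window is still only drawing at 60, which would look like the framerate going up as more windows are opened.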


#48

Actually, that is FRAPS measuring it.