Performance issue with OpenGL in multiple windows

matkatmusic · November 30, 2018, 6:12am

inquiring minds would love to know your current thought process with regard to the problem, and the issues you’re discovering that make it a not-so-easy fix…

alassandro · December 3, 2018, 7:23pm

Here’s what I’ve run into debugging this issue for our plugins:

The main issue (I spoke with Jules about this a bit at ADC) is that when you’re a plugin you have to share the main thread with the DAW and all other plugins. JUCE’s OpenGL implementation runs on its own thread, which requires it to acquire a lock for the main thread when painting components.

This seems to be usable if you’re just using it to speed up the painting of your components, but many plugins use OpenGL to do complex, continuous rendering (either using complex vector paths or OpenGL calls). The plugins can’t synchronize these updates between themselves, so every new instance is a new ~60Hz callback trying to acquire a lock for the main thread if it’s using component painting. If it isn’t using it, there still may be some contention by them trying to swap buffers around the same time.

As far as I can tell, swapping the buffers on this dedicated OpenGL thread is problematic as well on certain platforms. We had issues even just running an OpenGLRenderer (so no main thread locking / component painting) on Windows until we disabled VSync. Then there’s also the fact that Device Contexts, HWNDs, etc. are thread-affine on Windows: in this situation swap buffers is called on a separate thread from where the DC & HWND were created with no synchronization.

Essentially for us it boiled down to:

An OpenGLContext with component painting enabled needs to acquire a main thread lock for every triggerRepaint(), but using setContinuousRepainting(true) will only require a lock when the component has repaint() called.
SwapBuffers() on Windows caused similar slowdowns for us even if you used continuous repainting, but we saw improvements in our plugins by disabling VSync.
Running the OpenGLContext without locking the main thread (e.g. setContinuousRepainting(true) or only using an OpenGLRenderer) is not thread safe on Windows, which gave us some situations where the whole application would stop working.

If the host has the ability to run plugins as their own process (e.g. Reaper, Bitwig, etc.) then the issue shouldn’t occur. The performance issues went away for us in these hosts that ran the plugins in their own processes, but since there are many hosts that don’t support that, we had to find a workaround.

This thread also has some insight into the situation (even though it’s not specifically about JUCE):

finecutbodies · December 4, 2018, 8:02am

Thanks for the great research! This would mean any plugin using OpenGL would cause problems, not just JUCE-based ones, am I right? Using Byome (by Unfiltered Audio) and my OpenGL plugin will show this effect, but Byome was made in JUCE. I’ll try to find a non-JUCE plugin to see if this happens there too.

finecutbodies · December 4, 2018, 8:18am

found another openGL plugin and this is interesting… the more plugin windows are open, the higher the openGL framerate… maybe it’s some kind of compensation by the devs.

otristan · December 4, 2018, 8:45am

The thing is that something has changed at some point because we didn’t see those reports until around this year.
Maybe it was luck, but it is still strange.
Maybe JUCE’s OpenGL implementation didn’t run on its own thread at that time.

alassandro · December 4, 2018, 5:48pm

That’s the thing, with JUCE running OpenGL in its own thread you run into these issues in plugins when you require thread synchronization. Our XT series plugins (LV2s, not made with JUCE) use an OpenGL context on the main thread and they’ve worked out fine by themselves. We actually did run into issues though on Windows with our JUCE plugins stalling the main thread if there was also an XT plugin open and the message manager wasn’t locked during the juce::OpenGLContext rendering. When debugging, this was always caused by a problem with SwapBuffers(), and disabling vsync helped.

Our workaround basically involved the following steps:

Add an argument for juce::OpenGLContext::triggerRepaint() that sets the internal needsUpdate variable. If you call triggerRepaint() currently it will acquire the message manager even if there is nothing to update (i.e. the CachedComponentImage that’s used hasn’t had any regions invalidated since we’re not calling juce::Component::repaint()), and we had no way to update just the OpenGLRenderer without using continuous repaint.
Disabled vsync - on Windows we would still get performance issues during SwapBuffers() if vertical sync was enabled. As soon as we turned it off things were much better and we didn’t have frame dropping or stalling as I mentioned before, but of course then your rendering will encounter vertical tearing. Note that this is why I can’t utilize continuous repaint: it depends on a swap interval, otherwise it will just run as fast as possible, which we didn’t need/want.
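
The first step can be pictured with a small standalone model. This is not JUCE's source: `RenderLoopModel`, the `componentsDirty` argument, and the counters are hypothetical names, and a plain `std::mutex` stands in for the message manager lock. It only sketches the idea that a repaint request carries a flag saying whether the painted components changed, so the render pass takes the shared lock only when they did.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Standalone model, not JUCE code. All names here are hypothetical.
class RenderLoopModel
{
public:
    // Request a new frame. Pass true only when painted components changed,
    // which is the only case that needs the shared (main thread) lock.
    void triggerRepaint (bool componentsDirty)
    {
        if (componentsDirty)
            needsUpdate.store (true);

        frameRequested.store (true);
    }

    // Called on the render thread. Returns false if no frame was requested.
    bool renderFrame()
    {
        if (! frameRequested.exchange (false))
            return false;

        if (needsUpdate.exchange (false))
        {
            // Stand-in for acquiring the message manager lock.
            std::lock_guard<std::mutex> lock (mainThreadLock);
            ++lockAcquisitions;   // the cached component image would be repainted here
        }

        ++framesRendered;         // renderer drawing and the buffer swap would go here

        // The lock is taken at most once per rendered frame.
        assert (lockAcquisitions <= framesRendered);
        return true;
    }

    int getLockAcquisitions() const { return lockAcquisitions; }
    int getFramesRendered() const   { return framesRendered; }

private:
    std::mutex mainThreadLock;
    std::atomic<bool> needsUpdate { false };
    std::atomic<bool> frameRequested { false };
    int lockAcquisitions = 0;
    int framesRendered = 0;
};
```

In this model, ten calls to `triggerRepaint (false)` render ten frames without touching the lock, while a single `triggerRepaint (true)` costs exactly one lock acquisition.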

This workaround may not apply to everyone, but it did work for us. Specifically our use case is our plugins have graph components that draw spectrum analyzers with a juce::OpenGLRenderer and then controls over top via Component painting. I chose to update these spectrum displays on a timer, which calls our modified triggerRepaint(0), while the painted controls only update when they’re interacted with (or changed via automation). This way we don’t have redundancies where the OpenGLContext locks the main thread when all we’re trying to do is update our spectrum display renderer.
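
With vsync off nothing paces the swaps any more, so the timer has to decide when a frame is due. A minimal sketch of that pacing logic follows, using only `std::chrono`. `FramePacer` is a hypothetical helper, not part of JUCE and not the code from our plugins; in a real plugin the check would sit in a timer callback that then triggers the repaint.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: decides when the next frame is due at a fixed rate.
class FramePacer
{
public:
    using Clock = std::chrono::steady_clock;

    explicit FramePacer (double framesPerSecond)
    {
        assert (framesPerSecond > 0.0);
        interval = std::chrono::duration_cast<Clock::duration> (
            std::chrono::duration<double> (1.0 / framesPerSecond));
    }

    // Returns true when a frame is due at time 'now' and schedules the next one.
    bool shouldRender (Clock::time_point now)
    {
        if (started && now < nextDeadline)
            return false;

        // On the first call, or after falling more than one interval behind,
        // restart from 'now' instead of rendering a burst of catch-up frames.
        if (! started || now - nextDeadline > interval)
            nextDeadline = now;

        nextDeadline += interval;
        started = true;
        return true;
    }

private:
    Clock::duration interval {};
    Clock::time_point nextDeadline {};
    bool started = false;
};
```

Passing the time in as an argument keeps the helper independent of any particular timer, so it can be driven by whatever callback rate the host allows.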

If your plugins do require lots of component repainting, however, then the problem seems to be unavoidable in hosts that don’t support running plugins as processes.

alassandro · December 4, 2018, 6:45pm

What do you use to view the FPS? I know a common method to measure it (in FRAPS and the like) is to hook into the OS buffer swap call, so it can keep track of how often it’s running over time. Since it writes the number over top of the window it seems it’s bound to the HWND or device context? If it’s not keeping track of actual FPS “per window” then you may be seeing the fact that the host process is calling swap buffers 2x, 4x, etc. as often as just having one instance of Ozone.
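
To make the question concrete, here is a toy model of the two ways such a counter could work. `SwapCounter` is hypothetical and an `int` stands in for an HWND; it makes no claim about how FRAPS is actually implemented.

```cpp
#include <cassert>
#include <map>

// Hypothetical model of an FPS overlay that counts buffer swaps.
class SwapCounter
{
public:
    // Record one swap for the given window (an int stands in for an HWND).
    void onSwap (int windowId)
    {
        ++perWindow[windowId];
        ++total;
    }

    // Swaps per second for one window over the measured period.
    double windowFps (int windowId, double seconds) const
    {
        assert (seconds > 0.0);
        auto it = perWindow.find (windowId);
        return it == perWindow.end() ? 0.0 : it->second / seconds;
    }

    // Swaps per second for the whole process, regardless of window.
    double processFps (double seconds) const
    {
        assert (seconds > 0.0);
        return total / seconds;
    }

private:
    std::map<int, long> perWindow;
    long total = 0;
};
```

If four plugin windows each swap 60 times in one second, `windowFps` reports 60 for each window while `processFps` reports 240, which would look like the frame rate rising with every window that is opened.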

finecutbodies · December 4, 2018, 7:02pm

actually that is FRAPS measuring it.

finecutbodies · February 9, 2019, 12:24am

I just thought to bump this to ask whether there are any plans from @juce to solve it in the near future, or whether it’s better for us to figure out some workarounds?

clarke · February 11, 2019, 9:33am

This could be relevant:

jakemumu · February 15, 2021, 10:44pm

Was there ever a supported solution to this issue? I’ve got a pretty graphics-heavy app I’d like to use OpenGL on Windows for, but I’m having these same issues with multiple windows of the plugin open making things unstable / unusable. @finecutbodies did you ever find a good solution? I don’t want to disable OpenGL. It seems my choice right now is less performant graphics, or more performant graphics with a big edge case.

parawave · February 16, 2021, 4:30pm

I think, at least on Windows, this problem can’t be fixed. Please correct me if I’m wrong. I wish there was a solution!

On Windows, wgl is used to present OpenGL framebuffers to a window surface.

The big problem is: OpenGL is not multithreaded. You can only use it if you claim the context. For this wglMakeCurrent is used. So naturally an OpenGL context is bound to ONE thread and only one thread can access it at a time.

In a plugin / multi window app environment this means, if one instance is using the context and another one wants to “makeCurrent”, it has to wait or otherwise synchronize it. How long does it take? Until the first instance uses makeCurrent(null).

It looks like this: The first instance claims it, does fancy OpenGLContext rendering and at some point wants to display it. SwapBuffers is called. If I understand this correctly, OpenGL does a lot of driver magic behind the scenes at this point, and the actual submit of most of the workload happens during SwapBuffers. So this call can take a long time! Even if you immediately call makeCurrent(null) after it, the thread will wait until SwapBuffers completes. On top of that, makeCurrent is a very heavy operation.

In the end, the more complex your rendering is, and the more threads “fight” for the wglMakeCurrent, the slower your app gets.
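
A back-of-the-envelope version of that, as a sketch only: it assumes the claim, render and swap of every instance are fully serialized and that each instance needs the same exclusive time per frame. The function name and the numbers are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Rough model: every instance needs 'exclusiveMs' of serialized time per frame
// (claim the context, render, swap). Returns the best frame rate each instance
// can reach when all instances take turns.
double achievableFps (int instances, double exclusiveMs, double targetFps)
{
    assert (instances > 0 && exclusiveMs > 0.0 && targetFps > 0.0);

    const double targetIntervalMs = 1000.0 / targetFps;
    const double serializedMs = instances * exclusiveMs;   // one full round of turns

    // A frame cannot come sooner than the target interval, nor sooner than
    // the time it takes for every instance to have had its turn.
    return 1000.0 / std::max (targetIntervalMs, serializedMs);
}
```

With 4 ms of exclusive work per frame and a 60 fps target, a single instance still reaches about 60 fps, but eight instances need 32 ms per round, so each one tops out at 31.25 fps.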

Now what’s the solution? Say goodbye to OpenGL! Or somehow try to make the rendering single threaded, so there is no fight for a context. But then you have to synchronize your JUCE paint calls and do other dangerous and error prone stuff. Which in the end doesn’t even help because you just overload your SwapBuffers render loop and ruin your frame timings and animation smoothness.

Kind of hopeless. And if you look at the history of wgl, it’s not developed anymore. Microsoft probably focused their work on WPF, .NET and all the Windows 10 app stuff to offer better render performance. Alternative to all of this: Vulkan!
