OpenGL maxing out CPU on some machines in DemoRunner

We’ve had a report from our external beta testers that our JUCE with OpenGL application is causing some cores to max out and causing very high clock multiplier numbers. We have been able to replicate the same issue using the DemoRunner app shipped with JUCE v6.1.2.

We have since determined that this seems to happen only on Windows machines (10 or 11), and only when they are fitted with discrete (non-integrated) Nvidia graphics cards. Laptops with integrated graphics do not have this issue.

In this screenshot you can see the clock multiplier close to topping out at x47 and some very busy CPU cores whilst we are running the OpenGLAppDemo.

When we switch back to another demo in the list without closing the app, we see a significant drop in both.

Is this a known or expected issue? Is there a workaround? It feels like a driver issue to me, so maybe this needs relaying on to Nvidia themselves, but my google-fu found no other similar reports from other non-juce apps.

I can’t remember what’s in that demo, but in general, the JUCE approach to OpenGL is a little counterproductive. If you use the JUCE graphics functions, or the paint routine in a component, you are basically rendering on the CPU into a frame buffer which then gets passed over to OpenGL. The only time I see improvements in performance when using OpenGL is when I don’t use paint routines at all and go entirely low-level with vertex buffers. YMMV

In my experience, using OpenGL for JUCE Components doesn’t really help in terms of CPU/GPU/energy efficiency, at least on Windows. You may be able to get higher frame rates than with purely CPU-based rendering, but you will pay the price of the CPU/GPU working hard and raising the system temperatures.

@Fandusss & @xenakios
That’s right, but the current issue may be of a different nature.

Did you try to profile the release build? Maybe you will see where most of the CPU time is spent.
(Just a wild guess: maybe the renderer is rendering unnecessary frames without waiting for the v-sync. I had something like this in a Parallels VM, which caused a similar phenomenon.)
I’m no OpenGL expert myself, but maybe someone can give better hints with this information.
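
One cheap way to check the v-sync guess before a full profiling session is to count how often the render callback actually fires. The sketch below is a hypothetical helper, not part of JUCE: `FrameRateProbe` and its member names are made up for illustration, and it only assumes you can call it once per rendered frame.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of JUCE): counts render callbacks and reports
// the average frames per second once per measurement window.
class FrameRateProbe
{
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameRateProbe (std::chrono::milliseconds window = std::chrono::milliseconds (1000))
        : windowLength (window)
    {
        assert (window.count() > 0);
    }

    // Call once per rendered frame. Returns true when a window has elapsed
    // and lastFps() holds a fresh measurement.
    bool frameRendered (Clock::time_point now = Clock::now())
    {
        if (! started)
        {
            // The first call only establishes the start of the window.
            windowStart = now;
            started = true;
            return false;
        }

        ++framesInWindow;

        const auto elapsed = now - windowStart;
        if (elapsed < windowLength)
            return false;

        const double seconds = std::chrono::duration<double> (elapsed).count();
        fps = framesInWindow / seconds;

        // Start the next window from this frame.
        framesInWindow = 0;
        windowStart = now;
        return true;
    }

    double lastFps() const { return fps; }

private:
    std::chrono::milliseconds windowLength;
    Clock::time_point windowStart {};
    bool started = false;
    int framesInWindow = 0;
    double fps = 0.0;
};
```

In a JUCE app this would presumably be called from the `renderOpenGL()` callback, logging `lastFps()` whenever `frameRendered()` returns true. If the reported rate is far above the display refresh rate on the affected machines, the swap interval is probably not being honoured; `OpenGLContext::setSwapInterval (1)` is the JUCE call that requests v-sync, although the driver control panel can override it.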

We’ve found that using the OpenGL context can give us some improvements over the standard graphics pipeline for some of our applications.

The question of OpenGL’s overall performance isn’t really at hand here. What’s in question is why we are seeing CPU cores maxing out and large clock multipliers on machines with objectively decent graphics cards, but no maxing out at all on laptops with integrated graphics or on our standard dev machines with whatever lower-end cards IT has allowed us. Both setups produce the same performance, same frame rates etc.

I expect this lies at the graphics card driver level, so I’m not expecting profiling to show us much. I will attempt to get that done though. However, the machine we have in-house that can repro this is not a dev machine, nor is it from our department, so it’s going to take some politics and then setup to get to a position where we can profile the issue.

I’m totally pulling this out of my ass here, but perhaps it has something to do with the majority of the CPU time being spent rendering to a frame buffer and then moving that data to the GPU. Maybe integrated GPUs are more efficient in this copy step, and other GPUs have various bottlenecks to push through. I’m not sure how you’d go about testing that, but it might not take too many man-hours to rule it out.
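
As a first back-of-the-envelope check on the copy-step idea, you can estimate how much data a full frame buffer upload per frame would move. This is only a rough sketch: the function name is made up, and the window size, pixel format and frame rate are assumed example figures, not measurements from the affected machines.

```cpp
#include <cstdint>

// Bytes per second needed to upload one full frame buffer per rendered frame.
constexpr std::uint64_t uploadBytesPerSecond (std::uint64_t width,
                                              std::uint64_t height,
                                              std::uint64_t bytesPerPixel,
                                              std::uint64_t framesPerSecond)
{
    return width * height * bytesPerPixel * framesPerSecond;
}

// Assumed example figures: a 1920x1080 window, 4 bytes per pixel
// (8-bit RGBA) and 60 frames per second.
constexpr auto exampleRate = uploadBytesPerSecond (1920, 1080, 4, 60);

static_assert (exampleRate == 497664000ULL, "about 0.5 GB per second");
```

Plugging in the real window size and the measured frame rate gives a figure to compare against whatever transfer rate the affected machine can actually sustain. If the number is small by comparison, the copy alone is less likely to explain a pegged core.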


Further testing has shown that this is a more widespread ‘Nvidia with OpenGL’ issue. Using a non-JUCE OpenGL test application, ‘openglchecker’, we are seeing the same results again. The next step for us is to report it to Nvidia and then bask in the glory of having deflected another bug as ‘not our responsibility’. Shields up!