OpenGL deactivateCurrentContext Windows CPU

I actually tested this by commenting out the OpenGLContext::deactivateCurrentContext(); call after
context.swapBuffers(), and by adding a check to context.makeActive() that skips the makeCurrent if the context is already active.

Sadly, there isn’t much of a difference. The thing is, wglMakeCurrent() is somewhat shrouded in fog.
It’s not clear what exactly happens inside, since it triggers all kinds of driver-level work. It implicitly flushes the previously submitted commands (command buffers), so it’s probably more realistic to call glFlush() or glFinish() before measuring. Measuring in a debug build is also not a good idea. It’s complicated.

I mentioned this in BR: OpenGL is using a lot of CPU - #7 by parawave

If you test this with render code that actually does something, rather than an empty frame, the CPU used by the context switch mostly disappears into the total frame time. It’s also tied to V-Sync (the swap interval) and the frame time.

If someone wants to dive into this, I recommend looking at GLFW, SDL or SFML and how they set up their wgl calls. Perhaps there is some magic GL extension that solves this?

Anyway. All of this could improve things ‘a bit’. Probably. But in the end, if you really want to boost render performance, it will hardly change anything.

The real bottleneck sits in JUCE/juce_OpenGLGraphicsContext.cpp at master · juce-framework/JUCE · GitHub

L885 : struct EdgeTableRenderer and its use:

template <typename IteratorType>
void add (const IteratorType& et, PixelARGB colour)
{
    EdgeTableRenderer<ShaderQuadQueue> etr (*this, colour);
    et.iterate (etr);
}

This is used by path and image drawing, converting scanlines to pixel quads. The actual generation of the edge table is performed by the same code used by the SoftwareRenderer, namely RenderingHelpers::SavedStateBase.

The IteratorType can be of type:

using EdgeTableRegionType = typename ClipRegions<SavedStateType>::EdgeTableRegion;

using RectangleListRegionType = typename ClipRegions<SavedStateType>::RectangleListRegion;

Basically, a RectangleList fill ends up as a single quad. But for everything else (paths, transformed images, gradients, or as soon as clipping is involved), the expensive EdgeTables are used.

All of this is done on the CPU, single-threaded and without caching. I have to say, it’s a really cool and elegant solution. But it’s not suited for GL. No blame here; it’s obviously not a trivial thing. Even Skia does all kinds of hacky stuff with different implementations to squeeze minor boosts out of the worst-case scenarios.

I wonder if it’s possible to create some kind of intermediate representation: skip the scanline edge-table rendering on the CPU and do all of this in a shader. Perhaps a vertex or geometry shader that creates the edge table on the fly. That would give a massive boost.
