Various OpenGL Fixes (Windows)

Related to wglCreateContextAttribsARB

@t0m @reuk

I looked into a few more WGL extensions and discovered some interesting things.
Probably information overload. But hey, feel free to implement these one at a time :sweat_smile:
It looks like more than it actually is, I swear :joy:

If anyone has hints or ideas for other improvements, please share them here.


1. Context Release behavior
Reason: If the context is switched mid-frame, the command pipeline flushes and triggers
parts of the rendering while the MessageManager is still locked. This behavior can be turned off.

WGL_CONTEXT_RELEASE_BEHAVIOR_ARB can be used as an attribute.

WGL_CONTEXT_RELEASE_BEHAVIOR_FLUSH_ARB = default behavior. Flushes all commands on context switch.

WGL_CONTEXT_RELEASE_BEHAVIOR_NONE_ARB = Better, no implicit flush!

constexpr auto useFlushOnRelease = false; // Should be default!

const int attributes[] =
{
WGL_CONTEXT_MAJOR_VERSION_ARB, current.major,
WGL_CONTEXT_MINOR_VERSION_ARB, current.minor,
...
WGL_CONTEXT_RELEASE_BEHAVIOR_ARB, useFlushOnRelease ? WGL_CONTEXT_RELEASE_BEHAVIOR_FLUSH_ARB : WGL_CONTEXT_RELEASE_BEHAVIOR_NONE_ARB,
0
};
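One caveat before passing this attribute: as far as I know it is only valid when the driver advertises WGL_ARB_context_flush_control in the string returned by wglGetExtensionsStringARB. Below is a rough sketch of an exact-token lookup. The helper name hasExtension is hypothetical and this is not JUCE code. A plain substring search would give false positives for extensions that share a prefix.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns true if `name` appears as a complete,
// space-separated token in `extensionList`.
inline bool hasExtension (std::string_view extensionList, std::string_view name) noexcept
{
    if (name.empty())
        return false;

    std::size_t pos = 0;

    while (pos < extensionList.size())
    {
        const auto end = extensionList.find (' ', pos);
        const auto length = end == std::string_view::npos ? std::string_view::npos
                                                          : end - pos;

        // Compare whole tokens only, never prefixes
        if (extensionList.substr (pos, length) == name)
            return true;

        if (end == std::string_view::npos)
            break;

        pos = end + 1;
    }

    return false;
}
```

The release behavior attribute would then only be added to the list when hasExtension (extensions, "WGL_ARB_context_flush_control") returns true.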

2. No Error Context
WGL_CONTEXT_OPENGL_NO_ERROR_ARB can be used as an attribute.

Reason: Use this once it is certain that the code produces no GL errors, and definitely in release builds! The driver can then skip its error checking, which further improves context performance.

#if JUCE_DEBUG
constexpr auto noErrorChecking = false;
#else
constexpr auto noErrorChecking = true;
#endif

const int attributes[] =
{
WGL_CONTEXT_MAJOR_VERSION_ARB, current.major,
WGL_CONTEXT_MINOR_VERSION_ARB, current.minor,
...
WGL_CONTEXT_OPENGL_NO_ERROR_ARB, noErrorChecking ? GL_TRUE : GL_FALSE,
0
};
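Since both optional attributes come from separate extensions, it may be safer to build the list dynamically instead of using a fixed array. The sketch below assumes the caller already knows which extensions are present. All names are hypothetical, and the token values are as I recall them from the extension specs, so real code should take them from wglext.h.

```cpp
#include <vector>

// Token values as I recall them from the WGL extension specs (assumption).
constexpr int contextMajorVersion    = 0x2091; // WGL_CONTEXT_MAJOR_VERSION_ARB
constexpr int contextMinorVersion    = 0x2092; // WGL_CONTEXT_MINOR_VERSION_ARB
constexpr int contextReleaseBehavior = 0x2097; // WGL_CONTEXT_RELEASE_BEHAVIOR_ARB
constexpr int releaseBehaviorNone    = 0;      // WGL_CONTEXT_RELEASE_BEHAVIOR_NONE_ARB
constexpr int releaseBehaviorFlush   = 0x2098; // WGL_CONTEXT_RELEASE_BEHAVIOR_FLUSH_ARB
constexpr int contextNoError         = 0x31B3; // WGL_CONTEXT_OPENGL_NO_ERROR_ARB

struct ContextOptions
{
    int major = 3, minor = 2;
    bool flushOnRelease = false;
    bool noErrorChecking = false;
};

struct SupportedExtensions
{
    bool flushControl = false; // WGL_ARB_context_flush_control
    bool noError = false;      // WGL_ARB_create_context_no_error
};

// Builds a zero-terminated attribute list, adding optional attributes
// only when the matching extension is reported as supported.
inline std::vector<int> buildContextAttributes (const ContextOptions& options,
                                                const SupportedExtensions& supported)
{
    std::vector<int> attributes { contextMajorVersion, options.major,
                                  contextMinorVersion, options.minor };

    if (supported.flushControl)
    {
        attributes.push_back (contextReleaseBehavior);
        attributes.push_back (options.flushOnRelease ? releaseBehaviorFlush
                                                     : releaseBehaviorNone);
    }

    if (supported.noError && options.noErrorChecking)
    {
        attributes.push_back (contextNoError);
        attributes.push_back (1); // GL_TRUE
    }

    attributes.push_back (0); // terminator
    return attributes;
}
```

The result of attributes.data() can be passed wherever the fixed array was used before.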

3. Debug Bit
WGL_CONTEXT_DEBUG_BIT_ARB can be used as a flag with WGL_CONTEXT_FLAGS_ARB.
If this bit is set, the debug message functionality of GL 4.3 and above can be used.

constexpr auto getFlags = [](bool forward, bool debug)
{
int flags = 0;

if (forward)
flags |= WGL_CONTEXT_FORWARD_COMPATIBLE_BIT_ARB;

if (debug)
flags |= WGL_CONTEXT_DEBUG_BIT_ARB;

return flags;
};

const auto useForwardCompatibility = false;

#if JUCE_DEBUG
constexpr auto useDebugging = true;
#else
constexpr auto useDebugging = false;
#endif

const int attributes[] =
{
WGL_CONTEXT_MAJOR_VERSION_ARB, current.major,
WGL_CONTEXT_MINOR_VERSION_ARB, current.minor,
...
WGL_CONTEXT_FLAGS_ARB, getFlags(useForwardCompatibility, useDebugging),
...
0
};

These bits can be queried later with glGet.

GLint flags = 0;
glGetIntegerv(GL_CONTEXT_FLAGS, &flags); // pass the address, not the value

usesDebugging = (flags & GL_CONTEXT_FLAG_DEBUG_BIT) != 0;
noErrorChecking = (flags & GL_CONTEXT_FLAG_NO_ERROR_BIT) != 0;
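One thing that is easy to trip over: the bits returned by GL_CONTEXT_FLAGS do not have the same values as the WGL bits used when requesting the context. If I recall the headers correctly, the debug bit is 0x1 on the WGL side but 0x2 on the GL side, so the two sets of masks must not be mixed. Here is a small sketch of decoding the queried value, with hypothetical names and the bit values written out as an assumption.

```cpp
#include <cstdint>

// Bit values as I recall them from the GL headers (assumption);
// real code should use the GL_CONTEXT_FLAG_* macros.
constexpr std::uint32_t flagForwardCompatibleBit = 0x1; // GL_CONTEXT_FLAG_FORWARD_COMPATIBLE_BIT
constexpr std::uint32_t flagDebugBit             = 0x2; // GL_CONTEXT_FLAG_DEBUG_BIT
constexpr std::uint32_t flagNoErrorBit           = 0x8; // GL_CONTEXT_FLAG_NO_ERROR_BIT

struct ContextFlags
{
    bool forwardCompatible = false;
    bool debug = false;
    bool noError = false;
};

// Decodes the integer returned by glGetIntegerv (GL_CONTEXT_FLAGS, ...).
inline ContextFlags decodeContextFlags (std::uint32_t flags) noexcept
{
    ContextFlags result;
    result.forwardCompatible = (flags & flagForwardCompatibleBit) != 0;
    result.debug             = (flags & flagDebugBit) != 0;
    result.noError           = (flags & flagNoErrorBit) != 0;
    return result;
}
```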

Reason:
The debug bit is necessary to use the next thing:


4. Debug Messenger (GL >= 4.3)
It’s possible to attach a debug message callback to get much more precise error logging than JUCE_CHECK_OPENGL_ERROR. This is especially useful because some GL calls in the JUCE codebase do not use the macro, and it’s always confusing to find the function that actually triggered an error.

static void KHRONOS_APIENTRY glDebugOutput(GLenum source, GLenum type, unsigned int id, GLenum severity, GLsizei length, const char* message, const void* userParam)
{
	...
	// DBG(message...)
	...
	if (type == GL_DEBUG_TYPE_ERROR)
	{
		JUCE_BREAK_IN_DEBUGGER;
	}
}

glEnable(GL_DEBUG_OUTPUT);
glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS); 
glDebugMessageCallback(glDebugOutput, nullptr);

Reference: LearnOpenGL - Debugging

Example Usage:

if (Capabilities::Language::getVersion() >= Version(4, 3))
{
	const Capabilities capabilities;
	if (capabilities.usesDebugging())
	{
		Output::enable();
		Synchronous::enable();

		setDefaultMessageHandler();

		handler.callback = this;
		glDebugMessageCallback(glDebugOutput, &handler);
	}
}
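For the logging part of the callback, something along these lines could turn the raw enums into a readable line. This is only a sketch: the function names are hypothetical, only a few enum values are covered, and the numeric values are as I recall them from the GL 4.3 headers. Skipping notification severity is my own assumption about what is worth logging, since some drivers are quite chatty at that level.

```cpp
#include <string>

// Enum values as I recall them from the GL 4.3 headers (assumption).
constexpr unsigned int debugTypeError            = 0x824C; // GL_DEBUG_TYPE_ERROR
constexpr unsigned int debugTypePerformance      = 0x8250; // GL_DEBUG_TYPE_PERFORMANCE
constexpr unsigned int debugSeverityHigh         = 0x9146; // GL_DEBUG_SEVERITY_HIGH
constexpr unsigned int debugSeverityMedium       = 0x9147; // GL_DEBUG_SEVERITY_MEDIUM
constexpr unsigned int debugSeverityLow          = 0x9148; // GL_DEBUG_SEVERITY_LOW
constexpr unsigned int debugSeverityNotification = 0x826B; // GL_DEBUG_SEVERITY_NOTIFICATION

inline const char* severityName (unsigned int severity) noexcept
{
    switch (severity)
    {
        case debugSeverityHigh:         return "High";
        case debugSeverityMedium:       return "Medium";
        case debugSeverityLow:          return "Low";
        case debugSeverityNotification: return "Notification";
        default:                        return "Unknown";
    }
}

inline const char* typeName (unsigned int type) noexcept
{
    switch (type)
    {
        case debugTypeError:       return "Error";
        case debugTypePerformance: return "Performance";
        default:                   return "Other";
    }
}

// Notifications are skipped here; adjust to taste.
inline bool isWorthLogging (unsigned int severity) noexcept
{
    return severity != debugSeverityNotification;
}

// Builds one log line from the arguments the callback receives.
inline std::string formatDebugMessage (unsigned int type, unsigned int severity,
                                       unsigned int id, const std::string& message)
{
    return std::string ("[") + severityName (severity) + "] " + typeName (type)
         + " (" + std::to_string (id) + "): " + message;
}
```

Inside glDebugOutput the result could be passed to DBG before the break-in-debugger check.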

5. Core Profile 3.x Errors
Coincidentally, while using the debug messenger, I got this output:

***** API High : Error : Message ID (1280) :
***** GL_INVALID_ENUM error generated. Cannot enable <cap> in the current profile.

which was triggered by glEnable(GL_TEXTURE_2D);

Apparently this is an error in a core profile (GL 3.2 and above, where the features deprecated in GL 3.0 are gone).
Enabling GL_TEXTURE_2D was part of the fixed-function pipeline, which core profiles no longer provide.


6. Error Checking & Performance
This is very important! Figuring this out was not obvious. It builds upon all the previous points. Calling
glGetError essentially flushes the pipeline. Think about it: normally all GL calls are asynchronous, so how can OpenGL report errors immediately after a function call? Exactly: for some parts of the state, glGetError implicitly executes parts of the submitted commands. This is problematic for performance. Ideally the pipeline flushes only after the MessageManager has been unlocked, so we want to avoid these early flushes at all costs and never wait unnecessarily for the asynchronous GPU processing.

But JUCE_CHECK_OPENGL_ERROR is not built into release, you say? True. But there is still code that calls glGetError in release, namely clearGLError(), which is called in multiple places during rendering. Essentially this means that, instead of submitting as many commands as possible and moving the rendering work to AFTER the message manager is unlocked again, the pipeline is flushed all over the place, resulting in sluggish UIs under high render loads.

How to fix it:
Don’t rely on glGetError in release builds, and make the code independent of reported errors!
Especially since glGetError will report GL_NO_ERROR anyway if WGL_CONTEXT_OPENGL_NO_ERROR_ARB is activated.

Fortunately, as far as I know, there is only one part that relies on an error.

bool shouldUseCustomVAO() const
{
    #if JUCE_OPENGL_ES
    return false;
    #else
    clearGLError(); // <-- bad
    GLint mask = 0;
    glGetIntegerv (GL_CONTEXT_PROFILE_MASK, &mask);

    // The context isn't aware of the profile mask, so it pre-dates the core profile
    if (glGetError() == GL_INVALID_ENUM) // <-- functionality dependent on error? Oof!
        return false;

    // Also assumes a compatibility profile if the mask is completely empty for some reason
    return (mask & (GLint) GL_CONTEXT_CORE_PROFILE_BIT) != 0;
    #endif
}

can be replaced with something like

bool OpenGLContext::CachedImage::shouldUseCustomVAO() const
{
    #if JUCE_OPENGL_ES
    return false;
    #else

    // GL_CONTEXT_PROFILE_MASK only exists from GL 3.2 onward; older contexts pre-date the core profile
    if (getOpenGLVersion() >= Version(3, 2))
    {
        GLint mask = 0;
        glGetIntegerv (GL_CONTEXT_PROFILE_MASK, &mask);

        // Also assumes a compatibility profile if the mask is completely empty for some reason
        return (mask & (GLint) GL_CONTEXT_CORE_PROFILE_BIT) != 0;
    }

    return false;

    #endif
}
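The replacement depends on knowing the context version without provoking an error. One option that works on old contexts too is parsing the string from glGetString (GL_VERSION), since the integer queries GL_MAJOR_VERSION and GL_MINOR_VERSION only exist from GL 3.0. The sketch below assumes the usual "major.minor" layout, optionally preceded by a prefix such as "OpenGL ES ". The names are hypothetical and this is not how JUCE's getOpenGLVersion() is necessarily implemented.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

struct Version
{
    int major = 0, minor = 0;
};

// Parses "<major>.<minor>..." and skips any leading non-digit prefix.
// Returns std::nullopt if the string does not match that layout.
inline std::optional<Version> parseVersionString (std::string_view text)
{
    std::size_t i = 0;

    const auto isDigitAt = [&] (std::size_t index)
    {
        return index < text.size()
            && std::isdigit (static_cast<unsigned char> (text[index])) != 0;
    };

    while (i < text.size() && ! isDigitAt (i))
        ++i;

    const auto readNumber = [&] (int& out)
    {
        const auto start = i;
        out = 0;

        while (isDigitAt (i))
            out = out * 10 + (text[i++] - '0');

        return i > start; // at least one digit consumed
    };

    Version version;

    if (! readNumber (version.major))
        return std::nullopt;

    if (i >= text.size() || text[i] != '.')
        return std::nullopt;

    ++i;

    if (! readNumber (version.minor))
        return std::nullopt;

    return version;
}
```

The parsed value can be cached once after context creation, so the render path never needs to query it again.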

7. Version Helper?
By the way, just make the Version(major, minor) structure public. It’s too useful to hide :)

How about something like…

struct OpenGLCapabilities
{
static Version getVersion() noexcept;

ContextReleaseBehavior getContextReleaseBehavior() const noexcept;

bool usesCoreProfile() const noexcept;

bool usesDebugging() const noexcept;

bool usesForwardCompatibility() const noexcept;

bool usesErrorChecking() const noexcept;

bool supportsNonPowerOfTwoTextures() const noexcept;
...
};

… which queries all features once on context init, so it’s performant to access them everywhere in render code, for alternate render paths, depending on the available extensions.


8. Fences
This is not a must have. A bonus for more stable rendering. I discovered fences (and semaphores) while exploring Vulkan. These concepts are all over DX12, Vulkan and Metal for CPU<->GPU sync.

GL 3.2 introduces glFenceSync(..), the only sync object available in GL. Essentially you insert a fence anywhere in the command stream, and once the GPU has executed everything before it (commands run in order), the fence is signaled. It could be used like this:

bool OpenGLContext::CachedImage::renderFrame()
{
...
  while (! shouldExit())
  {
    doWorkWhileWaitingForLock (false);

    if (testFrameFence(context)) // <-- If there is an active frame fence, do other work
    {
      if (mmLock.retryLock())
        break;
    }
  }
...
addFrameFence(); // <-- Insert a fence here
glFlush();

context.swapBuffers();      
OpenGLContext::deactivateCurrentContext();

}

Which translates to: insert a fence at the end of the frame. Swapping buffers can then continue immediately. Later, when starting the next frame, instead of immediately pushing new commands, overloading the pipeline and locking the MM, the render thread can do other work that needs neither the context nor the MM lock.
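A possible shape for the two helpers used above is sketched here. To keep it testable without a GL context, the actual calls sit behind a small struct of callbacks. The comments say which GL function I would expect each one to wrap, and all type and member names are hypothetical. Treating a failed wait like a signaled fence is my own choice to avoid stalling forever.

```cpp
#include <functional>
#include <utility>

enum class WaitResult { alreadySignaled, timeoutExpired, conditionSatisfied, waitFailed };

// Hypothetical seam around the GL sync calls.
struct SyncApi
{
    std::function<void*()> createFence;        // glFenceSync (GL_SYNC_GPU_COMMANDS_COMPLETE, 0)
    std::function<WaitResult (void*)> poll;    // glClientWaitSync (sync, 0, 0), zero timeout
    std::function<void (void*)> destroyFence;  // glDeleteSync (sync)
};

class FrameFence
{
public:
    explicit FrameFence (SyncApi apiToUse) : api (std::move (apiToUse)) {}

    FrameFence (const FrameFence&) = delete;
    FrameFence& operator= (const FrameFence&) = delete;

    // Call after the frame's commands were submitted (addFrameFence above).
    void add()
    {
        release();
        fence = api.createFence();
    }

    // Returns true if the next frame may start (testFrameFence above):
    // either no fence is pending, or the GPU has passed it.
    bool test()
    {
        if (fence == nullptr)
            return true;

        if (api.poll (fence) == WaitResult::timeoutExpired)
            return false; // GPU still busy, do other work first

        release();
        return true;
    }

private:
    // Must run while the context is active in real code.
    void release()
    {
        if (fence != nullptr)
        {
            api.destroyFence (fence);
            fence = nullptr;
        }
    }

    SyncApi api;
    void* fence = nullptr;
};
```

Polling with a zero timeout never blocks, which is what allows the loop above to keep doing other work while the GPU catches up.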


Result:
This will make the OpenGL context, especially with multiple open plugin windows at 60 fps, more responsive.


Almost forgot it.

9. Use the latest context.
It’s understandable that we want the lowest specs for simple UIs. But why is the context request stuck on GL 3.2? The core profile is used. Proposal:

enum OpenGLVersion
    {
        defaultGLVersion = 0,
        openGL3_2,
        useLatest // <-- Add this. On Windows probably 4.6, on Mac 4.1
    };

If useLatest is used during context creation, we simply request versions from highest to lowest until a valid context is returned, e.g.

constexpr Version versions[]{ {4, 6}, {4, 5}, {4, 4}, {4, 3}, {4, 2}, {4, 1}, {4, 0}, {3, 3}, {3, 2}, {3, 1}, {2, 1} };
// on mac better start with 4.1, since there is no higher version available

for (auto& current : versions)
{
	const int attributes[] =
	{
		WGL_CONTEXT_MAJOR_VERSION_ARB, current.major,
		WGL_CONTEXT_MINOR_VERSION_ARB, current.minor,
		WGL_CONTEXT_PROFILE_MASK_ARB,  getProfileMask(useCoreProfile, useCompatibilityProfile),
		WGL_CONTEXT_FLAGS_ARB, getFlags(useForwardCompatibility, useDebugging),
		WGL_CONTEXT_OPENGL_NO_ERROR_ARB, getErrorChecking(useErrorChecking),
		WGL_CONTEXT_RELEASE_BEHAVIOR_ARB, getFlushBehavior(useFlushOnRelease),
		0
	};

	const auto c = createContextWithAttributes(...
	if (c != nullptr)
		return c;
}
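The same fallback loop can be written independently of WGL, which makes it easy to test and to reuse on other platforms. In this sketch the creation call is passed in as a callback. All names are hypothetical. Note that, as far as I understand the WGL_ARB_create_context spec, a driver may hand back a newer backward-compatible version than the one requested, so the version stored here is only the requested one and the real one should still be queried afterwards.

```cpp
#include <functional>
#include <vector>

struct Version
{
    int major = 0, minor = 0;
};

using ContextHandle = void*;
using CreateContextFn = std::function<ContextHandle (Version)>;

struct CreatedContext
{
    ContextHandle handle = nullptr;
    Version requestedVersion;
};

// Tries each candidate in order (highest first) and returns the first
// context that could be created. The handle stays nullptr if all fail.
inline CreatedContext createHighestAvailable (const std::vector<Version>& candidates,
                                              const CreateContextFn& create)
{
    for (const auto& version : candidates)
    {
        if (auto* handle = create (version))
        {
            CreatedContext result;
            result.handle = handle;
            result.requestedVersion = version;
            return result;
        }
    }

    return {};
}
```

On Windows the callback would wrap the attribute list and the createContextWithAttributes call from the loop above.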

This is impressive feedback and very interesting!
Do you know whether this has made its way into the official JUCE code, or would one have to implement all of it in a fork?