Direct2D Part Deux : 2 Direct 2 Furious

Quick update - I’ve learned a lot more about Direct2D and DXGI multithreading and realized I needed to rework how the message thread and the render thread work together. I’ve also spent quite a bit of time profiling (thumbs up for Intel VTune) and making performance improvements.

Windows 10 is running well, but I’m still seeing occasional stutters on Windows 11 that have stumped me. I suspect it’s because of this:

https://learn.microsoft.com/en-us/windows/win32/direct3darticles/dxgi-best-practices#multithreading-and-dxgi

I’ve got a couple of potential solutions. Stay tuned.

Matt


Quick update - I’ve been stuck on two multi-threading issues for the last several weeks. I think I’ve found solutions to both of them, but I’ve had to do some significant refactoring.

My plan was to have the render thread and the message thread run in parallel at full speed; the render thread would be drawing and presenting one frame while the message thread prepares the next. But - Direct2D factories have an internal lock that must be held while accessing the factory or presenting the swap chain. That essentially makes the whole thing single-threaded.

Direct2D multithreading

So - don’t access Direct2D or DXGI on the message thread and instead queue up a list of drawing commands on the message thread and push a frame’s worth of commands over to the render thread. That’s fine - but I kept seeing annoying spikes where most frames would render very quickly (3 ms or so) with spikes up to 12 ms or more. This has been driving me crazy.
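For anyone trying to picture the hand-off, here is a minimal sketch of the idea in portable C++. It is not the code in the branch: FrameQueue, DrawCommand, and renderFrames are hypothetical names, and the "drawing" just appends strings to a log instead of calling Direct2D. One thread records a frame's worth of commands and pushes it across; the other thread is the only one that executes them.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a recorded drawing call. A real renderer would
// capture paths, brushes, and so on; here a command just writes to a log.
using DrawCommand = std::function<void (std::vector<std::string>&)>;
using Frame = std::vector<DrawCommand>;

// Thread-safe queue of complete frames (hypothetical helper).
class FrameQueue
{
public:
    void push (Frame frame)
    {
        {
            std::lock_guard<std::mutex> lock (mutex);
            frames.push (std::move (frame));
        }
        condition.notify_one();
    }

    // Tells the consumer that no more frames will arrive.
    void close()
    {
        {
            std::lock_guard<std::mutex> lock (mutex);
            closed = true;
        }
        condition.notify_one();
    }

    // Blocks until a frame is available. Returns false once the queue has
    // been closed and every queued frame has been consumed.
    bool pop (Frame& out)
    {
        std::unique_lock<std::mutex> lock (mutex);
        condition.wait (lock, [this] { return closed || ! frames.empty(); });

        if (frames.empty())
            return false;

        out = std::move (frames.front());
        frames.pop();
        return true;
    }

private:
    std::mutex mutex;
    std::condition_variable condition;
    std::queue<Frame> frames;
    bool closed = false;
};

// The calling thread plays the message thread and only records commands.
// The spawned thread plays the render thread and is the only one that runs them.
std::vector<std::string> renderFrames (int numFrames)
{
    assert (numFrames >= 0);

    FrameQueue queue;
    std::vector<std::string> log; // touched only by the render thread until join()

    std::thread renderThread ([&queue, &log]
    {
        Frame frame;

        while (queue.pop (frame))
            for (auto& command : frame)
                command (log);
    });

    for (int i = 0; i < numFrames; ++i)
    {
        Frame frame;
        frame.push_back ([i] (std::vector<std::string>& out) { out.push_back ("clear " + std::to_string (i)); });
        frame.push_back ([i] (std::vector<std::string>& out) { out.push_back ("fill " + std::to_string (i)); });
        queue.push (std::move (frame));
    }

    queue.close();
    renderThread.join();
    return log;
}
```

Pushing whole frames rather than individual commands means the render thread never sees a half-recorded frame, and the recording thread never has to touch the graphics API or its lock.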

Turns out that the DXGI swap chain will sometimes make a synchronous call to the message thread of the window being painted, meaning that drawing and presenting will stall until the message thread responds. This is very annoying. The JUCE message thread has a lot going on and can get saturated, especially if it’s busy taking a few milliseconds to prepare the next frame, which in turn hiccups the render thread.

DXGI multithreading

The only real solution I can see is to have a separate message loop on a separate thread that is dedicated to Direct2D and will always respond immediately. That separate message loop is dedicated to a child window that covers the main window entirely and is just used for D2D rendering.

The good news is that all seems to work just fine and solves the two performance issues. I’m still putting it all back together.

Multithreading is aggravating.

Matt


Hi everyone-

It’s been quite a journey with plenty of blind alleys and false starts, but I believe I have something that is feature-complete and works well.

https://github.com/mattgonzalez/JUCE/tree/direct2d

Window resizing works, DPI scaling works, transparent windows work, VBlankAttachment works again, and I’ve made lots of internal optimizations and improvements.

I’ve also instrumented everything with ETW event tracing.

Just define JUCE_DIRECT2D=1 in your application and enable Direct2D in your main window constructor and you’re good to go; Direct2D mode is still off by default.

MainWindow (const juce::String& name, juce::Component* c, juce::JUCEApplication& a)
    : DocumentWindow (name, juce::Colours::black, juce::DocumentWindow::allButtons),
      app (a)
{
    setUsingNativeTitleBar (false);
    setOpaque (true);
    setAlpha (1.0f);
    setContentOwned (c, true);
    setDropShadowEnabled (false);

    setResizable (true, false);
    centreWithSize (getWidth(), getHeight());

#if JUCE_DIRECT2D
    if (auto* peer = getPeer())
    {
        //
        // Enable Direct2D
        //
        peer->setCurrentRenderingEngine (1);
    }
#endif

    setVisible (true);
}

I’ll post a technical writeup and put together some example apps.

Matt


As far as known issues go, I’m still seeing some hitching on my Windows 11 machine that I’d like to sort out.

You also may run into performance issues with multiple windows open at once from the same app. This will probably require some sort of central dispatcher that integrates the JUCE VBlank thread and waiting for the swap chains for each window.

Matt


This feels like the right time for me to give this a test. I have seen hitching on Windows using the stock rendering engine (in the built-in AudioVisualiserComponent, as used in the AudioLatencyDemo in the DemoRunner), so I want an alternative back end to dig deeper and try to understand why it is stuttering.