Direct2D Part 3

Here’s a Direct2D demo app:

Direct2D Demo

I took a couple of the JUCE demos and added controls to switch between Direct2D & software mode, as well as the window opaque state and alpha value.

Note also the painting stats in the corner.

The JUCE intro screen draws the DNA-helix-spiral-waveform thing with a Path. I added an option to draw the same thing just filling ellipses and drawing lines. Here I’m trying to demonstrate that you may need to take different approaches to get the best performance out of the renderer. Here’s my average paint time for a fullscreen window on a 3440x1440 monitor:

Software mode, without Path: 12.1 msec
Software mode, with Path: 9 msec
Direct2D mode, without Path: 3 msec
Direct2D mode, with Path: 6.5 msec

Enjoy-

Matt

So that makes me wonder about performance on macOS with and without Path. If the opposite is true there, as I suspect, then to get optimal performance we would have to split some paint functions between macOS (using paths) and Windows (using individual fill calls).

And individual calls have a drawback: overfill with any alpha value will look different between both versions.
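
To make the alpha concern concrete, here is a minimal sketch of the arithmetic, assuming standard source-over blending and a Path filled with non-zero winding. The function names are made up for illustration and are not part of JUCE or Direct2D.

```cpp
#include <cassert>

// Opacity after compositing a source with alpha srcAlpha over a destination
// with alpha dstAlpha, using the standard "source over" rule.
inline double sourceOver (double dstAlpha, double srcAlpha)
{
    assert (srcAlpha >= 0.0 && srcAlpha <= 1.0);
    return srcAlpha + dstAlpha * (1.0 - srcAlpha);
}

// Overlap opacity when both shapes live in one Path (non-zero winding assumed):
// the overlapping region is filled exactly once.
inline double overlapWithSinglePath (double alpha)
{
    return sourceOver (0.0, alpha);
}

// Overlap opacity when each shape gets its own fill call:
// the overlapping region is blended twice.
inline double overlapWithSeparateFills (double alpha)
{
    return sourceOver (sourceOver (0.0, alpha), alpha);
}
```

With an alpha of 0.5, the overlap ends up at 0.5 opacity for the single Path and 0.75 for the separate fills, so the two approaches only match when the shapes are fully opaque or do not overlap.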

Very nice.

Good question; I don’t think I’m qualified to speak to how the Mac renderer works. I looked briefly a while ago and the Apple docs recommend reusing GPU resources just like the DirectX docs.

Matt

Changes since June 24, 2023:

  • Moved all drawing back to the message thread
  • Present buffers using a waitable swap chain
  • Transparent windows work
  • Direct2D content is rendered to a child window to work around how DXGI interacts with the JUCE message thread
  • Fixed drawing/filling open-ended Path
  • Fixed underlining glyphs
  • Changed minimum Windows version to 8.1
  • Clipping fixes
  • Fixed DPI scaling
  • Instrumented the D2D renderer with ETW trace logging
  • Created a dedicated component peer just for Direct2D mode
  • Optimizations:
    • Reduced number of calls to Direct2D device context SetTransform
    • Improved drawing glyph runs
    • Improved saving & restoring state

Is this a hard requirement? Maybe it can be worked around? We have lots of customers unwilling to upgrade from Windows 7.

Thanks for the demo file.

To my surprise the results tend to differ from yours.

With Direct2D checked the paint duration is 51 ms and unchecked (which I assume corresponds to Software mode) the paint duration is about half of that, 23 ms. That’s the Graphics demo. The result for Juce Logo demo is about the same.

And there’s virtually no difference in the paint duration whether JUCE Paths are used or not, regardless of whether Direct2D is checked.

Tested on 11th Gen Intel(R) Core™ i7-11700K @ 3.60GHz, built-in graphics at 3840x2160 and 60Hz

Right now I’m relying on the waitable swap chain, which was first introduced with Windows 8.1. I didn’t realize there was still demand for Windows 7 support.
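
As a rough illustration of what gating on that requirement could look like, here is a small sketch. It only relies on the fact that Windows 8.1 reports itself as version 6.3; the `WindowsVersion` struct and both functions are hypothetical, and how the version is obtained (or what a fallback path would do) is left out.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical holder for an OS version; not a JUCE or Win32 type.
struct WindowsVersion { int major; int minor; };

// True if v is the same as or newer than minimum.
inline bool isAtLeast (WindowsVersion v, WindowsVersion minimum)
{
    assert (v.major >= 0 && v.minor >= 0);
    return std::tie (v.major, v.minor) >= std::tie (minimum.major, minimum.minor);
}

// Waitable swap chains arrived with Windows 8.1, which is version 6.3.
// Windows 7 is 6.1 and Windows 8 is 6.2, so both fail this check.
inline bool supportsWaitableSwapChain (WindowsVersion v)
{
    return isAtLeast (v, WindowsVersion { 6, 3 });
}
```

A renderer could use a check like this to decide at startup whether to take the waitable swap chain path or fall back to something else, such as the software renderer.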

Windows 7 originally launched with Direct2D 1.0. I haven’t tried D2D 1.0 in a while, but as I recall it didn’t perform well, especially if you had multiple windows running in a single process.

Windows 7 SP1 supports Direct2D 1.1. Direct2D 1.1 does at least support flip mode swap chains, which is essential for good performance.

So do you need Windows 7 support or Windows 7 SP1 support?

Matt

Hello-

Thanks for taking the time to try out the demo app.

I suspect that in your case your CPU is easily outpacing your GPU and the software renderer. What is the specific model of your built-in graphics adapter?

I’ll try and reproduce your results here; I only have a limited number of computers to test on, but I think I have a Windows 10 laptop with integrated Intel graphics.

Matt

Thanks. I think the latest updates for Windows 7 can be assumed. So whatever SP was released last.

It’s the GPU embedded with the CPU, Intel UHD Graphics 750. It’s a small desktop computer with a mini-ITX motherboard. The absence of an external graphics card makes it quite quiet. And I’m on Windows 11, with the latest updates.

The only number that gets smaller with Direct2D is the Time in the upper right corner of the Graphics Demo. It’s 0.3 ms with Direct2D and 1.2 ms without.

So for me (GTX 1060), it works just fine, except that initially, the window content gets drawn at 1/4 of the scale. So I have a full-size black rectangle, and in the top-left corner is the content fully animated, etc. When I resize the window, the content gets the correct scale, and also stays that way.

Now it crashes when I want to see the SVG demo, and I get only the checkerboard for the four image demos. Where do I need to put the files, and what are their names?

Thanks for trying the demo.

I reproduced the bug with the window sizing. I’ll get that sorted out.

As far as the SVG crash, I should have built the demo differently with the resources embedded in the executable.

I’ll take care of both of those issues and repost the demo.

Matt

Here’s an updated demo app:

Direct2D Demo 1.0.1

You’ll need some of the JUCE demo assets; I’m not sure if I’m allowed to distribute them and I figure everyone here has JUCE installed already, so just click the Examples button and select the path where you have your JUCE examples (e.g. c:\juce\examples).

Changes:

-Fixed the initial swap chain size when DPI scaling was > 100% at launch
-Fixed a case where Graphics::drawLine wasn’t handled by Direct2D
-Added widgets demo
-Added JUCE examples directory selector
-Fixed the bad SVG pointer access if no assets are available
-Changed the renderer control to a combo box for clarity
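
For anyone curious about the DPI item, here is a minimal sketch of the size arithmetic involved. It is not the actual fix, just an illustration under the assumption that the back buffer must be sized in physical pixels; `PixelSize`, `physicalSizeFor` and `coveredAreaFraction` are made-up names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical size type in physical pixels.
struct PixelSize { int width; int height; };

// Converts a window's logical size to the physical pixel size a back buffer
// would need. scaleFactor is 1.0 at 100% DPI scaling, 2.0 at 200%, and so on.
inline PixelSize physicalSizeFor (int logicalWidth, int logicalHeight, double scaleFactor)
{
    assert (scaleFactor > 0.0);
    return { static_cast<int> (std::lround (logicalWidth * scaleFactor)),
             static_cast<int> (std::lround (logicalHeight * scaleFactor)) };
}

// Fraction of the window area that is covered if the buffer is sized in
// logical units while the window is physically larger by scaleFactor.
inline double coveredAreaFraction (double scaleFactor)
{
    assert (scaleFactor > 0.0);
    return 1.0 / (scaleFactor * scaleFactor);
}
```

At 200% scaling an 800x600 logical window needs a 1600x1200 buffer, and a buffer sized with the unscaled numbers would cover only a quarter of the window area, which would look a lot like content squeezed into the top-left corner.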

Matt

I don’t think I have access to a computer with an Intel UHD 750, but I did try a laptop with an older Intel integrated GPU. I saw similar results.

That’s not entirely surprising; those GPUs do not have dedicated VRAM and are not exactly known for their performance. There’s probably still room for improvement though.

That being said - select the Graphics mode in the new demo, set your Examples directory, and pick either the RGB Tiled or ARGB Tiled modes. I see a dramatic visual difference between software and hardware modes.

Matt

Not the best test, but on my Windows 11 VM hosted on my Mac everything is consistently faster, typically around 2-4x faster than the software renderer, with the 4x improvements coming from the most taxing examples with lots of animated clipping and blending.

One question: in the Graphics Demo, is the Paint Duration supposed to match up with the JUCE meter in the top-right? For a couple of the demos (e.g. “Paths: Stroked”), I get the following:

Software renderer:
  - Paint duration: 9ms
  - JUCE time: 1.6ms
Direct2D renderer:
  - Paint duration: 14ms
  - JUCE time: 0.02ms

Which would indicate they’re better at direct aspects of painting?

Superb work!

Now almost everything works as expected. The notable exception is that the “animate alpha” option has no effect. It works fine in software mode, so I assume it’s a bug in the D2D layer, which probably doesn’t set the alpha/opacity before submitting the draw calls.

Hi Dave-

Thanks for checking it out.

It’s perhaps somewhat confusing - the paint duration is measuring the entire time to paint and present the frame for an entire window. It’s the time taken by ComponentPeer::handlePaint plus the Direct2D overhead.

The JUCE meter in the top-right corner is part of the Graphics demo and is just measuring the time spent painting the graphics demo component (the checkerboard area).
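
The relationship between the two numbers can be sketched with nested timers. This is only an illustration with hypothetical names (`TimingStats`, `ScopedTimer`, `paintFrame`), not how the demo or JUCE actually measures things: the outer timer stands in for the whole-window paint duration and the inner one for the demo component's own meter.

```cpp
#include <cassert>
#include <chrono>

// Running total and average of durations in milliseconds.
struct TimingStats
{
    double totalMs = 0.0;
    int count = 0;

    void add (double ms)      { assert (ms >= 0.0); totalMs += ms; ++count; }
    double averageMs() const  { return count > 0 ? totalMs / count : 0.0; }
};

// Adds its own lifetime to the given stats when it goes out of scope.
class ScopedTimer
{
    using Clock = std::chrono::steady_clock;

public:
    explicit ScopedTimer (TimingStats& s) : stats (s), start (Clock::now()) {}

    ~ScopedTimer()
    {
        const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
        stats.add (elapsed.count());
    }

    ScopedTimer (const ScopedTimer&) = delete;
    ScopedTimer& operator= (const ScopedTimer&) = delete;

private:
    TimingStats& stats;
    Clock::time_point start;
};

// Hypothetical frame: the outer timer covers everything done for the window,
// the inner timer covers only one component's paint callback.
template <typename PaintComponent>
void paintFrame (TimingStats& wholeWindow, TimingStats& oneComponent, PaintComponent&& paintComponent)
{
    ScopedTimer frameTimer (wholeWindow);

    {
        ScopedTimer componentTimer (oneComponent);
        paintComponent();
    }

    // Other components and the present call would run here, inside the outer timer only.
}
```

Because the inner scope is fully contained in the outer one, the whole-window number can never be smaller than the component number, and it can be much larger when most of the work happens outside the component's paint call.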

I considered removing the JUCE meter since I thought it might muddy the waters but elected to leave it in since I figure everyone’s used to it.

Matt

Good spot, thanks. There were two separate issues.

Keep 'em coming!

Matt

Replying to myself from the previous thread since this is the newer discussion.

I think I did what was needed to test DemoRunner with Direct2D, but I may have missed something, so I wanted to check here that I got it right:

  • Built Projucer from your direct2d branch of your JUCE fork
  • Used it to open DemoRunner from same branch
  • Added the following two lines to Extra Preprocessor Definitions section for VS2022 exporter:

JUCE_DIRECT2D=1
JUCE_WAIT_FOR_VBLANK=0

The AudioVisualiserComponent is still stuttering, and seemingly only on Windows, when all else is equal (same exact machine running Linux and macOS has smooth scrolling). So it could be an issue in the implementation of the component itself, but I’d like to be sure before I investigate further.