Vulkan Modules for JUCE | 0.5.0 beta now on Github!

Initially this was just an idea for a potential future module. Then it became an experimental project, leading to further discussion in the initial feature request topic in this forum. After some users expressed interest in it, I decided to make it open source. Now it’s here… unfortunately far from ‘production ready’. But see for yourself!


Important

At the current time only Windows is supported. Part of the reason for making this project open source is the hope that interested JUCE users will contribute and test the implementation on other platforms.

Although it should ‘just work’ on Windows, it hasn’t really been tested on many devices, hence the tag 0.5.0-beta. Any feedback is definitely appreciated. Let’s make it a 1.0.0 together!

Opinions, ideas, interested in contributing? Leave a message here!


The original feature request and rambling thread - FR: JUCE Vulkan


Nice! Not sure if you have already, but I suggest joining the Slack for KhronosDevs and posting this in the #vulkan channel. You might be able to pull in some interest there as well.


@parawave Thanks for your message. Sorry for my delayed answer.
For the macOS/iOS implementation it is necessary to include the Metal framework and to add a CAMetalLayer to the native view (JUCE does not currently use Metal).
The precompiled SPV shaders should be separated into different folders according to their native platform.
I see you are using images in your LowLevelGraphicsContext (the same as the JUCE OpenGL implementation). The correct way would be to implement the graphics primitives (see e.g. juce_mac_CoreGraphicsContext.mm or juce_win32_Direct2DGraphicsContext). I know this would require other changes in the JUCE framework, but otherwise you get high memory consumption and unnecessary CPU overhead.
We have started writing our own Vulkan implementation, in which we implement the graphics primitives mentioned above.
If you want, I can add the macOS/iOS, Linux and, if possible, Android native implementations for your module and compile the shaders.
Regards


By the way, I used the ‘glslc’ compiler to generate the SPV code and then the JUCE binary builder to embed it. One thing to note here: it is possible to embed the GLSL compiler in your application to perform this task at runtime. But it seems the SDK only provides the dynamic runtime versions of the compiler. Since many audio plugins use the static runtime, I avoided adding it, to ease the setup.

Anyway, I thought the generated SPV (bytecode?) is platform independent!? Isn’t this the whole purpose of this intermediate representation? So is it really necessary for Android & Linux? Of course, MoltenVK might be another thing, it requires a translation from SPV bytecode to the Metal shader code equivalent, right? Maybe MoltenVK does this under the hood?

About this. If we consider a call like this: g.drawImageAt(image, 0, 0);
the underlying juce::Image implementation can’t be known here. We always have to move it to GPU memory first. Of course it can be cached!

Currently the implementation looks at the reference counted Image/Bitmap PixelData and uploads it to a VulkanImage (backed by the general Vulkan memory buffer). It’s only uploaded if the PixelData changes. For repeated calls it is cached. So where is the high memory consumption and CPU overhead compared to another graphic primitive?
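To make the caching idea concrete, here is a minimal sketch of "upload only when the pixel data changes". It is not the module's code: PixelData, GpuTexture, TextureCache and the version counter are hypothetical stand-ins for the reference-counted pixel data and the VulkanImage, and the "upload" is just a counter. It assumes C++17.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins: the real module tracks the image's pixel data and a VulkanImage.
struct PixelData  { std::uint64_t version = 0; };          // bumped whenever the pixels change
struct GpuTexture { std::uint64_t uploadedVersion = 0; };  // version that currently lives on the GPU

class TextureCache
{
public:
    // Returns the cached texture, uploading only if the source is new or has changed.
    GpuTexture& get (const PixelData& source)
    {
        auto [it, inserted] = cache.try_emplace (&source);

        if (inserted || it->second.uploadedVersion != source.version)
        {
            it->second.uploadedVersion = source.version;   // stands in for the actual GPU upload
            ++uploadCount;
        }

        return it->second;
    }

    int uploadCount = 0;   // number of (simulated) uploads, for illustration only

private:
    std::unordered_map<const PixelData*, GpuTexture> cache;
};
```

Drawing the same unchanged image every frame hits the cache and costs no further upload; only a changed image triggers a new transfer. A real cache would also need to evict entries when the source image is destroyed.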

Maybe we are talking about different things. Note: the VulkanImageType in the module is optional, for directly using an image as render target. It is not used in the LowLevelGraphicsContext.

How would a platform-specific primitive help with this issue? Isn’t the whole purpose of the Vulkan API that you only use the memory buffer objects for image data? Even when using MoltenVK I expect that everything uses the same API calls and memory buffer objects. Not the Metal version of it.

The only part that is platform specific is the OS surface creation (on Windows from an HWND and HINSTANCE) provided to the swapchain.

Not sure about the shaders, but it would be nice to have the surface implementation for every platform. Thanks for the offer. I’m really unfamiliar with MacOS windowing and don’t have the experience with dynamic MacOS libs to get MoltenVK running here.

@parawave
The SPIR-V files are not platform independent; the files contained in your repository are compiled for DirectX (Windows) and are not usable on other operating systems. The following picture offers an overview of the targets:


Of course it is possible to embed the glslc compiler for the distribution of your app / plugin. The shaders only need to be compiled once, not every time the application is run. Compilation at runtime is of course an option, but you can also distribute your shaders with your application or embed them. During development the shaders only need to be recompiled when you change them.
For this reason you should either provide instructions or distribute a precompiled version of the shaders in your repository.
There is a Vulkan framework and SDK currently available for macOS:
Vulkan SDK for MacOS
MoltenVK is only mandatory for iOS or if you want to link statically for distribution on macOS.
About the geometric/graphic primitives:
You are doing the following steps (the same as in the JUCE OpenGL Context):

  • The Graphics class is initialized with an Image as target.
  • The whole output of the Graphics class is rendered to an Image (the image is in memory).
  • The image content is clipped (this is done by the CPU, as are other operations; see the Graphics class).
  • The image is transferred to Vulkan and rendered.

We take a different approach, inheriting from LowLevelGraphicsContext and creating a VulkanLowLevelContext class. We implement the basic geometric elements there (paths, rectangles, etc.), which leads to the following scheme:

  • Initialize the Graphics class with a VulkanLowLevelContext.
  • Store the vertex information of the primitives or the changes (affine transformations).
  • Transfer the necessary information to the shaders and render it.

This is a simplified scheme. Of course we are doing more operations. This procedure is very fast and uses less memory than the JUCE OpenGL context.
As you pointed out, it is not possible to implement the context this way as a separate module; it has to be initialized e.g. from inside juce_gui_basics/native/juce_win/android/linux Windowing.
Actually, JUCE does not have an implementation of the geometric primitives for every OS; for Linux and Android it renders images as described before. In large applications this leads to very high memory consumption (e.g. on Linux you use at least 3 times more memory than on macOS, and JUCE on Android is almost unusable for big applications). I hope I have made my point clear. As I said, some schemes are simplified for easy understanding.

The files are not compiled for DirectX, how do you come to this conclusion?
The picture you referenced just shows that you can use SPIRV-Cross to “cross compile” the SPV representation back to GLSL, HLSL … and then use it for your desired API (GL, DirectX, Metal). This is another use case of SPIR-V and doesn’t matter here.

If you take a look at the SPIR-V Ecosystem diagram:

Starting at the top middle. The shaders in the repository (.frag and .vert) are in GLSL!
They are compiled into .SPV using glslang, the GLSL Language Compiler (bin/glslc.exe in the Windows Vulkan SDK).

When constructing the graphics pipeline and creating the shader modules, the SPV code is loaded directly as uint32 words. Reading “Standard Portable Intermediate Representation” on Wikipedia and other references, it seems the intention of SPIR-V is to be used directly by the API. Portable!
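As a rough illustration of "loaded directly as uint32": a SPIR-V module is a stream of 32-bit words whose first word is the magic number 0x07230203, and Vulkan's shader module creation takes the code as a pointer to uint32_t plus a size in bytes. The helper below is a hypothetical sketch (toSpirvWords is not part of the module) that turns embedded bytes into such words and rejects anything that is obviously not SPIR-V. It assumes C++17.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// First word of every valid SPIR-V module, as defined by the SPIR-V specification.
constexpr std::uint32_t spirvMagic = 0x07230203u;

// Hypothetical helper: copies raw shader bytes (e.g. embedded binary data) into 32-bit words.
// Returns std::nullopt if the size is not a multiple of 4 or the magic number does not match.
inline std::optional<std::vector<std::uint32_t>> toSpirvWords (const void* data, std::size_t sizeInBytes)
{
    if (data == nullptr
        || sizeInBytes < sizeof (std::uint32_t)
        || sizeInBytes % sizeof (std::uint32_t) != 0)
        return std::nullopt;

    std::vector<std::uint32_t> words (sizeInBytes / sizeof (std::uint32_t));
    std::memcpy (words.data(), data, sizeInBytes);   // memcpy sidesteps alignment problems

    if (words.front() != spirvMagic)
        return std::nullopt;                         // wrong byte order, or not SPIR-V at all

    return words;
}
```

Nothing in this check depends on the operating system or GPU vendor; the same words are handed to whichever Vulkan driver is present.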

Also take a look here: MoltenVK | User Guide | About

[…] Metal uses a different shading language, the Metal Shading Language (MSL), than Vulkan, which uses SPIR-V. MoltenVK automatically converts your SPIR-V shaders to their MSL equivalents.

So am I overlooking something or what’s all this about?


Again: Have you mistaken the ImageType implementation for the context one? There are actually two methods to initialize Graphics.

  • One from a wrapped VulkanImage (VulkanImageType), which doesn’t use an OS surface.
  • Another using the VulkanContext and a swapchain image (framebuffer).

Because what you describe isn’t really happening.

  1. For each new frame (when an OS repaint happens) a swapchain image is acquired (not allocated). The allocation of this image depends on the window size and is managed by the OS.

  2. The graphics context (when the window is visible) creates and caches a VulkanImage (used as framebuffer on device local GPU Memory).

  3. The graphics class renders into this framebuffer. But only for invalidated regions.

  4. The whole framebuffer is rendered to the OS swapchain image/framebuffer using one fullscreen quad.

Aside from offscreen rendering, this is the standard method to directly render to an OS surface. Where is the high memory consumption and unnecessary CPU overhead?
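The four steps above can be modelled without any Vulkan calls. In this hypothetical sketch (Rect and Presenter are made-up names, and pixels are plain ints), the offscreen framebuffer persists between frames, only the invalidated rectangle is repainted, and the whole buffer is then copied to the acquired image, which is what the fullscreen quad does on the GPU.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Rect { int x, y, w, h; };

// Hypothetical CPU-side model of the cached framebuffer (step 2).
struct Presenter
{
    int width, height;
    std::vector<int> offscreen;   // persists between frames
    int pixelsRepainted = 0;      // counts the work done in step 3

    Presenter (int w, int h)
        : width (w), height (h), offscreen (static_cast<std::size_t> (w * h), 0) {}

    // Step 3: repaint only the invalidated region, clipped to the framebuffer bounds.
    void repaint (Rect dirty, int colour)
    {
        const int x1 = std::max (0, dirty.x);
        const int y1 = std::max (0, dirty.y);
        const int x2 = std::min (width,  dirty.x + dirty.w);
        const int y2 = std::min (height, dirty.y + dirty.h);

        for (int y = y1; y < y2; ++y)
            for (int x = x1; x < x2; ++x)
            {
                offscreen[static_cast<std::size_t> (y * width + x)] = colour;
                ++pixelsRepainted;
            }
    }

    // Step 4: the whole framebuffer goes to the acquired swapchain image.
    std::vector<int> present() const { return offscreen; }
};
```

The cost of step 3 scales with the invalidated area, while step 4 is a single full-size copy per frame regardless of what changed.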

I don’t think this is true. The Vulkan SDK on Mac is offering the regular headers and so on, true. But to actually implement the API, MoltenVK is used under the hood to translate the calls to the Metal equivalents →

MoltenVK

This SDK provides partial Vulkan support through the use of the MoltenVK library which is a “translation” or “porting” library that maps most of the Vulkan functionality to the underlying graphics support (via Metal) on macOS, iOS, and tvOS platforms.

Could it be that we are talking about different things here? I’m a bit confused about what you mean by primitives and about your assessment of what actually happens in the pw_vulkan and pw_vulkan_graphics modules.

Is there any inherent limitation? I tried a project that works alright both with the software and OpenGL renderers. It has a fair amount of transparency and 30 fps path visualizations, but nothing outrageous. I get this kind of glitch on the whole window:
[Image: glitch]
On debug, it blinks a number of times until it falls to the jassertfalse at line 132 of pw_RenderContext.cpp, with result == eTimeout. On release, it renders a few frames and then freezes, with all following calls falling to the return DrawStatus::hasFailed at line 113. Just in case, it’s a Zen+ cpu.

A spectacular glitch! Well, at least it’s not a black screen.

It was built to replicate all the things happening in juce_opengl, so there shouldn’t be any limitations. Looks like some kind of memory layout RGB glitch. Can’t tell much.

A few questions.

  1. Did you run the JUCE demos or a minimal project before, to be sure it’s generally working?

  2. The Vulkan SDK on debug outputs errors from the validation layer via DBG() macro. Is there no message? Normally it’s very specific to what causes errors in the API calls.

  3. Can you isolate some of the draw code, if it’s simple enough? The stuff that happens in void paint(Graphics& g)?

  4. Are you doing unusual draw calls, like beginTransparencyLayer or resetting the saved state? Or are you creating a temporary juce::Image that changes every frame?

Everything that could be somehow stateful and mess up the draw calls?
Yeah, without code this is hard. As always, minimal samples that reproduce the error are a good start.

Not a boring one at least :sweat_smile:

Actually, it fails with just this:

struct MyAudioProcessorEditor : juce::AudioProcessorEditor
{
    parawave::VulkanInstance instance;
    parawave::VulkanContext context;

    MyAudioProcessorEditor (MyAudioProcessor& p) : AudioProcessorEditor{ p }
    {
        context.setDefaultPhysicalDevice (instance);
        context.attachTo (*this);
        setSize (400, 400);
    }

    void paint(juce::Graphics& g) override
    {
        g.fillAll (juce::Colours::white);
    }

    JUCE_DECLARE_NON_COPYABLE_WITH_LEAK_DETECTOR (MyAudioProcessorEditor)
};

On debug, it stops at the assert, and there’s no other message.

Oh. That’s as minimal as it can get.

I see, so it’s this line:
const auto result = swapchain.acquireNextImage(swapchainImageIndex, imageAcquiredSemaphore); // TODO : aquire timeout ?

Notice the ominous comment :grimacing:

So what is happening here? The OS provides a managed image you can render to: a swapchain image. But it can’t just be used directly, it needs to be acquired first. If you resize the window, or other things happen, the result becomes vk::Result::eSuboptimalKHR or vk::Result::eErrorOutOfDateKHR. In your case the acquisition somehow didn’t complete in time, hence the eTimeout. I’ve never experienced this case. Are you using anything unusual? CPU graphics or a rare card?
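For reference, the outcomes of an acquire call can be sorted roughly as follows. This is only a sketch: AcquireResult is a hypothetical mirror of the relevant vk::Result values so that it compiles without the Vulkan headers, and FrameAction and actionFor are made-up names, not part of the module.

```cpp
// Hypothetical mirror of the vk::Result values relevant to image acquisition.
enum class AcquireResult { success, suboptimal, outOfDate, timeout, notReady, other };

// Hypothetical set of reactions a frame loop could choose from.
enum class FrameAction { render, renderThenRecreate, recreateSwapchain, skipFrame, fail };

constexpr FrameAction actionFor (AcquireResult result) noexcept
{
    switch (result)
    {
        case AcquireResult::success:    return FrameAction::render;
        case AcquireResult::suboptimal: return FrameAction::renderThenRecreate; // image was acquired and is usable
        case AcquireResult::outOfDate:  return FrameAction::recreateSwapchain;  // no usable image
        case AcquireResult::timeout:
        case AcquireResult::notReady:   return FrameAction::skipFrame;          // no image acquired, try again later
        case AcquireResult::other:      break;
    }

    return FrameAction::fail;
}
```

The point of the split is that a timeout is not an error code in Vulkan; it only means no image was handed out within the given time, so treating it as "skip this frame" instead of asserting is one possible design.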

The acquireNextImage call on this line actually has a third parameter with a default value.
vk::Result acquireNextImage(uint32_t& imageIndex, const VulkanSemaphore& signalSemaphore, juce::RelativeTime timeout = juce::RelativeTime::seconds(1.0)) const noexcept;
It’s set to 1 second. You could try to increase it and see if it still times out.

I guess it needs some extra handling. Maybe running the acquire in a loop with shorter timeouts until it finally gets the image :thinking:
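A minimal sketch of that idea, written against a generic callable so it stays independent of the module: tryOnce is a hypothetical wrapper around a single acquire attempt with the given timeout, and TryAcquire and acquireWithRetries are made-up names. Whether short slices actually help on the affected hardware is untested.

```cpp
#include <chrono>

enum class TryAcquire { acquired, timedOut, failed };

// Calls tryOnce (slice) repeatedly until it stops timing out or the total budget is used up.
// tryOnce is a hypothetical callable: TryAcquire (std::chrono::milliseconds timeout).
template <typename Fn>
TryAcquire acquireWithRetries (Fn&& tryOnce,
                               std::chrono::milliseconds slice,
                               std::chrono::milliseconds budget)
{
    for (auto waited = std::chrono::milliseconds::zero(); waited < budget; waited += slice)
    {
        const TryAcquire result = tryOnce (slice);

        if (result != TryAcquire::timedOut)
            return result;   // either acquired, or a real failure that retrying will not fix
    }

    return TryAcquire::timedOut;   // budget exhausted, the caller can skip this frame
}
```

Between attempts the caller could also check whether the window is still visible or the context is being detached, which a single long blocking wait does not allow.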

Thanks for the feedback. Just shows that it was a good call to name it “beta”.


On debug, it falls to that assert at 132. If I continue, then it keeps failing on imageCompletedFence.wait(). That also happens on release, where I can’t break at 131-133 (it seems the code is reordered by the compiler). If I extend the timeout to 10 seconds, it fails almost immediately. If I extend it to 20, it fails a little later, still less than 20 seconds. If I extend it to 60, it seems to go on “normally”, which in this case means alternating between these two frames:
[Images: a, b]
We have the green lines/dots, and the white alternating with black. When it fails, it can stop on either of them.

All this on a Ryzen 5 (Zen+) CPU with built-in (Radeon Vega) graphics, so it’s using RAM as video memory. The Vulkan driver (amdvlk64.dll) is loaded and unloaded a number of times. Some modules are loaded which seem a bit weird, like opengl32.dll.

I think the imageCompletedFence.wait() is a consequential failure.
The rough flow of this function is this:

  1. Acquire a swapchain image. (123)

  2. Render into an offscreen framebuffer. Leads to juce::Component paint(Graphics& g). (152)

  3. Render the offscreen framebuffer into an acquired swapchain image. (176)

  4. Present swapchain image. (190)

When you present the image, you command something like: “Present this image, and once you have completed the presentation, signal this fence!”
Then on the next frame you wait for this fence (104) until it’s signaled, before you start the acquisition of the next swapchain image and repeat the whole process.

Now, if the acquisition fails due to a timeout, it is never presented. Therefore the wait for this fence also fails.
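The chain of failures can be reproduced in a tiny single-threaded model. This is a sketch under an assumption: it supposes the fence is reset right after the wait and before the acquire, which I have not verified against the module's source. FrameLoopModel and its members are hypothetical and make no Vulkan calls.

```cpp
// Hypothetical model of the frame loop; the fence is just a bool.
struct FrameLoopModel
{
    enum class Status { presented, acquireTimedOut, fenceTimedOut };

    bool fenceSignaled = true;   // starts signaled so the very first frame does not block

    Status drawFrame (bool acquireSucceeds)
    {
        if (! fenceSignaled)
            return Status::fenceTimedOut;     // no pending work exists that could ever signal it

        fenceSignaled = false;                // assumed: reset before acquiring

        if (! acquireSucceeds)
            return Status::acquireTimedOut;   // early out: nothing submitted, fence stays unsignaled

        // render offscreen, draw into the swapchain image, present ...
        fenceSignaled = true;                 // stands in for the GPU signaling the fence on completion
        return Status::presented;
    }
};
```

If the real code follows this order, one way out is to reset the fence only once it is certain that work will be submitted, so that a failed acquisition leaves the fence signaled for the next frame.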


Built-in graphics are always worrisome. Just to be sure, running C:\VulkanSDK\1.2.170.0\Bin\vkcubepp.exe runs fine and presents the textured rotating cube?

The green dots are definitely strange. If something shows up as pure R, G or B in a pattern, it’s usually related to a wrong memory layout or pixel stride. But it could just be a consequence of the failed swapchain acquisition.

Not sure how to test it without integrated graphics. It runs here on Zen2 with dedicated graphics.

Anyone here experiencing similar problems?

The cube is alright:
[Image: cube]
The only error coming from drawFrame is the timeout, after which there’s no more drawing. The flickering and green dotting happen before though, and nothing seems to complain about it. No idea what else I could test, tell me if you think of anything.

I thought about how to approach this issue. Apparently the Vulkan validation layers offer VK_LAYER_LUNARG_device_simulation.

https://vulkan.lunarg.com/doc/view/1.1.114.0/windows/device_simulation_layer.html

Maybe it’s possible to reduce the capabilities to simulate the AMD Vega built-in CPU graphics. Or a wide range of other devices.

Although I tried to use only widely supported image formats and features in the library, it’s still possible that some of them are not available on Vega, or on other integrated graphics solutions.

Definitely need more testers.

Maybe you can find something useful here:
report.txt (21.9 KB)