Announcing "pluginval", an open-source cross-platform plugin validation tool


#1

Hi everyone, I’m excited to announce the release of a tool I’ve been working on for validating plugins. It’s useful for plugin and host developers, and I’d even encourage non-developer end-users to test their plugins with it so they can give plugin developers the most detailed bug reports possible.

You can find the project on our GitHub here:

Grab the release binaries or clone the repo and browse the source code. I’m actively encouraging everyone to contribute to this project in order to make it the most comprehensive and useful plugin testing tool there is. With this in mind, I’ve tried to make it super easy to write tests, using an approach very similar to the JUCE UnitTest framework (with a few tweaks to support being provided with a plugin instance).
If you’ve found a host doing something unusual that you want to test for, it’s dead simple to add a test for it and have it run automatically as part of the build process.

The app can be run in GUI mode or as a headless command line tool, so it’s designed for QA teams and end-users as well as automated CI tools. In that respect it’s similar to auval, but for all platforms and plugin formats.

There’s a more detailed explanation in the repo readme here:
pluginval README.md

Looking forward to hearing everyone’s thoughts and feedback!


#2

Hi,

thanks, this looks like a great tool!
I’ve built it under Windows (using install/windows_build.bat) and it compiles successfully.
However, when I launch the GUI and add VST2 and VST3 paths, VST3 plug-ins are not seen, and testing a VST2 plug-in doesn’t show anything in the console, just the LED going from red to green and back again.

Should the VST3 SDK be put somewhere for VST3s to be recognized, perhaps?

Also, launching via the command line fails with *** FAILED: VALIDATION CRASHED most of the time.
I got it working and validating the plug-in successfully once in probably 10 attempts (this plug-in works perfectly in all major hosts).
Perhaps a progress message of some kind would be nice!

May I suggest you announce your tool on the KVR audio development forum, and perhaps https://sdk.steinberg.net/index.php as well?

Thanks again,
Lorcan


#3

Nice I’ll be taking a look at this later as I’ve been working on the same thing!


#4

You’ll have to install the VST3 SDK, then modify the Projucer to find it, enable VST3 support in the module and rebuild. This is because I can’t ship the VST3 SDK due to licensing restrictions from Steinberg (unless they’ve relaxed this recently?).

Ok, I’ll take a look into that. I did try it on Windows, but I have to admit most of the development time has been on the Mac.

Yes, I will post it on there once the initial kinks are ironed out :wink:

Thanks for the feedback!


#5

Thanks. Actually, Steinberg announced dual commercial/GPLv3 licensing for VST3 some time ago: https://sdk.steinberg.net/viewtopic.php?f=4&t=282
I’ll test on macOS too.

Cheers,
Lorcan


#6

Does this mean I can simply include the source in my own repo as long as my code is GPLv3?

If so, why does the JUCE library not already include VST3 code? Is it due to the triple licensing of JUCE?

I’m happy to add the VST3 source but I’m not great with licensing and don’t want to run afoul of Steinberg.


#7

You don’t even need to include it directly; you can just use the whole git repo as a submodule:

You probably only need those, as you’re not interested in vstgui, docs, examples, etc.:




#8

This is awesome!

In the audio processing tests:

ut.expectEquals (countSubnormals (ab), 0, "Subnormals found in buffer");

Are subnormals really considered a failure for all plugins? I’m interested to see if most commercial plugins pass this test.

Early feature requests: pass in sample rate/buffer size combinations as command line arguments or in a config file for the tests, and optional failure criteria for the editor/processor test (like under a particular time to load).


#9

Yeah, I wasn’t completely sure about this. Maybe they would be better as a level 7/8 strictnessLevel test, but that would mean having to run the processing tests again simply to check for them.


The other potential way of dealing with this is to include a “warning” test, kind of like a hopefullyEquals (...) function. Then you could pass a flag to treat warnings as errors.

We did discuss this but it would either need changes to the UnitTest framework or additional complexity for the running of tests. We decided that at least initially using the numbered strictnessLevel technique we could achieve all of this.

But you’re right, subnormals shouldn’t really be level 3 failures. I was mostly interested to see if plugins emitted them and whether we needed to account for this as hosts…


#10

Awesome, thank you Dave!

Might be worth mentioning in the README that you can drag and drop a plugin onto the interface to add it to the list (like in Tracktion/Waveform), to avoid a rather time-consuming scan of maybe hundreds of plugins.


#11

This looks great, but I am wondering: does it rely on JUCE’s plug-in hosting classes? Isn’t it then really testing compliance with those classes? My feeling is that for VST3 and AU, in all their weirdness, I’d feel more comfortable testing with Steinberg’s and Apple’s existing tools. Not to say this isn’t valuable in addition.


#12

What plugin is this? I just tried with a few of our own and some NI ones and they seemed to work ok…

Could it be that you’re using a strictness level greater than 5?
There’s a particularly nasty test at level 7 which emulates some Pro Tools behaviour by loading the plugin on a background thread, opening the UI on the message thread and then setting the state on a background thread. This probably won’t go down well with most Windows plugins, but could be useful if you really want to make your plugins bulletproof.

I’ve tweaked the app’s GUI now to show the strictness level and make that adjustable. I also realised it was using level 10 in GUI mode so I’ve changed that.

You might want to pick a plugin, start at level 1 and then increase it until it fails? Hopefully that will narrow things down a bit.

There should be lots of logging going on already though. It should be fairly obvious if a test is failing/crashing from the log. If it’s not getting that far, it’s probably failing to create the child process for some reason. If so, let me know and I’ll see if I can improve that.


#13

Yes, I guess I can as I’m already including JUCE as a submodule. I’ll see what bits I need to build with it. Cheers!


#14

It seems music software would be more reliable if hosts were more consistent about sticking to standard conventions. Perhaps we need a host testing framework too?


#15

Added support for VST3 now: https://github.com/Tracktion/pluginval/commit/d0836ab4a935829c50e566237084f2b872f722ee


#16

I’d love one. Show me a spec and I’ll adhere to it. One of the biggest problems with audio plugins, from both the plugin and host sides, is that this is largely unspecified. Things have improved a bit in recent years with the VST3 and AU validation tools, but as far as I’m aware there is no concrete written spec about the order in which you can call things, or on which threads.

All hosts are littered with hacks and workarounds for specific plugins that behave slightly out of the ordinary. Sometimes this is due to bad coding, but most often it’s due to simply not having a reference for how something should behave.

Passing a validation is not the same as conforming to a spec; it simply means you pass the tests thrown at you. This may be fine if you’re the host that wrote the test, i.e. you’ll only ever do the same things as the validator app (which is possible with Steinberg and the VST3 validator), but how does another host know exactly what it can and can’t do?

This tool isn’t designed to replace existing validation tools or specs; indeed, it’s there to enforce them and provide a quick means of testing them or pushing the boundaries. My ultimate goal with this project is to create a set of tests (at level 10) so exhaustive and multi-threaded that it should be possible to write a near-bulletproof plugin.

Running this over your plugin from the start should help catch design problems before you’ve launched, and before a user tries your plugin in some obscure host you’ve never heard of, let alone tested in…


As you mention though, this tool is also designed to be a “host testing” framework. Maybe not in the traditional sense but with this I can scan my entire plugin list (some several hundred) and run the validation tests on all of these plugins. This can be done periodically on a CI machine too.

With this data, we can very quickly see what plugins have issues where.
If all plugins fail a specific test, it’s probably a good idea to fix that behaviour in the host and remove the test for plugins.

Alternatively, if only a small number of plugins fail a specific test, we can use these logs to inform the developers and maybe work around this small number in our hosts.

The main aim here is to speed up this data gathering process that’s usually only done out in the wild where it’s often too late to make changes.

I hope that helps explain things a little more and how I can see this being used!


#17

I think the key thing here is that both host and plugin developers need to contact each other when there are issues, to avoid these hacks where possible; all too often it’s easier to just add the hack and be done with it. That’s not a dig at you or anyone specifically, but at all of us in the industry in general.


#18

Yes, I completely agree. The main problem is that a lot of plugin developers are either too small to have the time to communicate with host devs and then fix things or they’re too big and making fixes to accommodate a single host is outside their dev cycle.

I’ve experienced both sides of the coin over many years spent writing a host and plugins. It’s a fantastic feeling when you help someone make their product better by reporting a bug that they are willing and able to fix, both plugin and host side. Unfortunately, I’ve also wasted a lot of time registering with sites, downloading products, installing, scanning, authorising, creating sessions to test specific behaviour, getting crash traces, contacting developers and explaining in detail exactly how and why the software is crashing, only to hear nothing back or see no action taken. That’s disheartening.

One of the main aims of this tool is to make this process less host/plugin specific and more general. Hopefully that way it can become a pseudo-standard and help host and plugin devs spot problems early. This is one of the reasons it needs contributions from both sides of the fence. We need to work together to make the whole audio software experience a good one for users.


#19

I couldn’t agree more; it’s pretty much the same reason I was writing a plugin validator too. I haven’t taken a good look at this yet, but one feature that would be useful (and more likely to be adopted by bigger players) is the ability to change the output format, so that, for example, it could output a JSON or XML file conforming to a unit test standard that can be parsed by CI tools.


#20

Yes, it does and that’s intentional. Essentially this acts in two ways:

  1. For plugin developers to ensure compatibility with JUCE based hosts (and also as a general testing/validation tool, used in addition to the format-specific tools)
  2. For JUCE based host developers (possibly including JUCE itself :wink:) to test compatibility with plugins from all formats and built with all frameworks

It might be easiest if I run over some problem areas/workflows which we will be putting into practice at Tracktion:


• For Host Developer QA Teams:

Testing plugin compatibility with your host can be an extremely tedious process; simply getting hold of, updating and registering plugins is extremely time consuming. At Tracktion, we have a contracted QA company (which I can highly recommend if anyone’s interested), so they already have a huge database of plugins installed. I saw an opportunity here for them to periodically run pluginval on their installed plugins, across multiple formats on multiple OSes, all essentially automated. They can then give us the reports and let us know if any plugins fail tests, which likely means there will be compatibility problems with Tracktion/Waveform.
We can then get them to do targeted testing with the actual DAW.

Originally, I had this plan to automate the DAW in this way but it seemed better to do it as an external tool and open source it so it can be used by everyone.


• For Host Developer CI:

Similarly to the above, if you have a CI setup, you can add a job to periodically run pluginval on any installed plugins and find compatibility problems early. This will most likely be used to spot regressions.
I’ve tried very hard to make pluginval simple to clone or download binaries for, precisely for these kinds of CI workflows.

The main advantage of having QA run tests like these is that they will already have to spend time updating plugins etc. The set on the build server is likely to be much reduced simply due to time constraints/expiring licences (iLok anyone?) etc. The larger the company, the more likely you are to have a dedicated person to manage this. Unfortunately at Tracktion we’re not quite there yet so we’ll use a combination.


• For Plugin Developer CI:

If you’re a plugin developer, you’ll likely know that testing with the dozens of hosts out there is astoundingly time consuming. pluginval aims to speed up that process by offering an automated way to check for silly mistakes, regressions and even things that are stricter than most hosts will require.

This should lead to a better overall level of software development and increased visibility in the space. Remember, with an open source tool we can add comments and log messages to explain to developers (particularly those new to the domain) why tests are failing and point them in the right direction.

I imagine fuzz testing here will come in extremely useful, particularly for things such as race conditions that can happen rarely and are very difficult to reproduce. Run some fuzzing for a few hours over your plugins and see what happens…

The other reason I like this CI-tooled approach is that developers (myself included) often respond better to machines telling them they’ve done something wrong. Getting a Jenkins email telling you you’ve broken the build due to some failing tests seems easier to take than a person telling you you’ve broken something (Jenkins then nagging you every time you commit reinforces the message). You can then tidy it up, push it, and no one else may even notice…

Of course, pluginval should be used in conjunction with existing tools, particularly on automated CI, as they don’t take any extra development time to run.


• For Plugin Developers:

Again, similarly to the above, but having an open source project that you can actually run with a debugger attached, to see where tests are failing, is super useful.

The other side to this which I’ve not discussed yet is essentially some kind of “recognised” or “compatible with” acknowledgement status. At Tracktion, we love to promote plugin developers making cool software and particularly if you take the time to make sure your plugins work with our DAW. We want to add a compatibility table to our own website in order to do this (our user base can then look at this to hopefully find interesting new plugins) but we know testing with hosts is time consuming. If you can send us a passing pluginval report, we’re 95% sure your plugin will be compatible with us and we can add you to this list.

Who knows, in the future we might even add a GitHub badge!


• For Host End Users:

For users of our host, we wanted to provide a tool that they can run on problematic plugins to generate a log file which they can then send to us and plugin developers in order to give us a head start on fixing the problem.

We could even automate the collection of these so we can find the most popular, problematic plugins.


• For Plugin End Users:

Again, similarly to the above, but end users can use this tool to test their own installed plugins and generate reports to send to the plugin devs. This can save a lot of time, and there are many users out there willing to do this provided the process is easy and quick enough.


I think that covers the main ways I see this being used but I’m sure there are many more which I’m excited to hear about!