[DSP module discussion] Let's talk about the new dsp::Processor classes

dsp_module

#1

Hello everybody !

Today I’m posting the last topic about the new DSP module classes. If you want to read the previous ones, you can find them here : https://musicalentropy.github.io/JUCE-DSP-module/

Let’s not forget the topic about the class dsp::LadderFilter developed just before the ADC workshops as well : New LadderFilter class

We are going to talk about the classes dsp::ProcessContext, dsp::ProcessorChain, dsp::ProcessorDuplicator, dsp::ProcessorWrapper, the structures dsp::ProcessSpec and dsp::ProcessorState, and all the classes in the DSP module that build on them :

  • A new take on the new filter classes dsp::IIR, dsp::StateVariableFilter, dsp::LadderFilter and dsp::FIR, which make extensive use of the internal structure dsp::ProcessorState
  • The API choices for the classes dsp::Convolution and dsp::Oversampling
  • All the more or less “processor example” classes provided in the folder juce_dsp/processors, which are the new classes dsp::Bias, dsp::Gain, dsp::Oscillator, dsp::Reverb and dsp::WaveShaper

Obviously all these classes rely heavily on the mighty dsp::AudioBlock, which is probably the class we have talked about the most since the release of the DSP module. To be clear about it again, I think it is awesome and I use it all the time nowadays. You can read why in that topic : [DSP module discussion] New AudioBlock class

Now, I have to confess I don’t use any of the dsp::Processor classes in my own work; I’ll tell you why later in the discussion. But I would like to get some feedback from you guys, and as in the other DSP module topics, I’ll provide some documentation here and some context about the development.

Problems solved by these classes

So, the new Processor classes are the JUCE team’s attempt to standardize audio classes other than AudioProcessor, in a way which is compatible with the concept of audio graphs. They were mostly the work of @fabian, @jules and @skot I think.

The idea was to give a new take on that graph concept, which as you might know was already covered in the past by AudioProcessorGraph and all the AudioSource classes. Why do this again ? According to @jules himself, these classes are not that good by the current JUCE team standards. Moreover, they wanted something that could be used with a single process call after all the necessary initialization, with easy multi-channel capabilities that became possible thanks to the development of the AudioBlock class.

I guess it was also an attempt to provide an elegant solution to that first most common programming mistake that we see on the forum as well :slight_smile: Indeed, classes that are supposed to handle AudioBlock objects by themselves, and that require the user to provide the number of channels in the prepare function, force the developer to ask the multi-channel question early in the coding process. Moreover, the ProcessorDuplicator class makes it possible to generate a class with multi-channel capabilities from a mono processor class, provided the mono class has been designed to be augmented in a given way that we will talk about later.

Documentation

How to use them ? The first thing to know is that you can find most of the examples you need in the JUCE examples folder, specifically with the demo application DSPDemo and the demo plug-in DSP module plug-in demo. You’ll notice the following things :

  • All the new classes are initialized with a prepare function call instead of the classic prepareToPlay. It takes a ProcessSpec object providing the sample rate and the maximum audio buffer size as usual, but also the number of channels that the audio class is supposed to handle, which is the new thing here, and which is very important.
  • A lot of processing classes have become more or less automatically templated. That means that it is now possible to use them with 64-bit samples (double) and not only 32-bit ones (float). Thanks to the addition of the SIMDRegister class, most of the DSP module processing classes are SIMD compatible as well with only a change in the template type argument !
  • To process any sample, you now need to use additional abstractions. The first one is the conversion of an array of samples or an AudioBuffer into an AudioBlock object. The second one is the use of a ProcessContext object, either a ProcessContextReplacing or a ProcessContextNonReplacing, which allows the processing graph to use either one audio buffer for the whole process, or two different ones to keep the input samples unchanged and store the output samples in a separate buffer.
  • By the way, the ProcessContext classes have a public property called isBypassed which can be used for… bypassing the processing chain, and they give access to the assigned input and output AudioBlock objects.
  • It is now possible to stack several compatible processing classes in a ProcessorChain object simply by using their names in the template definition of the object. So if you want to code an overdrive plug-in based on JUCE classes only, you can initialize a ProcessorChain object this way :
dsp::ProcessorChain<GainProcessor, BiasProcessor, DriveProcessor, DCFilter, GainProcessor> overdrive;

Obviously you need to do this before :

using GainProcessor   = dsp::Gain<float>;
using BiasProcessor   = dsp::Bias<float>;
using DriveProcessor  = dsp::WaveShaper<float>;
using DCFilter        = dsp::ProcessorDuplicator<dsp::IIR::Filter<float>, dsp::IIR::Coefficients<float>>; 
  • Using lambda functions, it is possible to initialize some simple waveshapers and oscillators in your headers, using the DSP module processor classes WaveShaper and Oscillator. Simplicity is their main purpose, and as some users reported on the forum, they are not meant to be used in complex commercial plug-ins where you might need faster and anti-aliased equivalents.
  • The new class dsp::IIR::Filter is a duplicate of the original IIRFilter class with new capabilities (the compatibility with the FilterDesign classes, the possibility to handle filters of any order and not just biquads, the functions to calculate the frequency response for both magnitude and phase). Other new classes such as StateVariableFilter and FIR have been designed the same way. But let’s face the main issue there : in order for that class to be compatible with templating, SIMDRegister, ProcessorDuplicator, and the whole ProcessorChain stuff at the same time, the new IIR filter class is a lot more complex to use than the previous one. I can tell @fabian and @jules had a hard time figuring out how to code these beasts the right way as well. And I personally think it is impossible for a beginner to come up alone with the kind of syntax needed to initialize its behaviour :
*lowPassFilter.state  = *dsp::IIR::Coefficients<float>::makeLowPass  (getSampleRate(), lowPassFilterFreqParam->get());

That’s why examples exist of course !
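
For readers puzzled by the two asterisks in that line, here is a minimal sketch of the idea, assuming the state is a shared, reference-counted coefficient object that every per-channel filter points at. All the names below (Coefficients, MonoFilter, Duplicator, makeOnePole, setCutoffForAllChannels) are hypothetical stand-ins built on std::shared_ptr, not the JUCE implementation. Dereferencing both sides copies the new values into the existing shared object, so every channel sees them at once while keeping its own sample memory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; none of this is JUCE code.

// Shared data: one set of coefficients for every channel.
// One-pole low-pass: y[n] = b0 * x[n] + a1 * y[n-1]
struct Coefficients
{
    float b0 = 1.0f, a1 = 0.0f;

    static std::shared_ptr<Coefficients> makeOnePole (float a)
    {
        auto c = std::make_shared<Coefficients>();
        c->b0 = 1.0f - a;
        c->a1 = a;
        return c;
    }
};

// Mono processor: owns its sample memory, only points at the coefficients.
struct MonoFilter
{
    explicit MonoFilter (std::shared_ptr<Coefficients> c) : state (std::move (c)) {}

    float processSample (float x)
    {
        previous = state->b0 * x + state->a1 * previous;
        return previous;
    }

    std::shared_ptr<Coefficients> state;  // shared between all channels
    float previous = 0.0f;                // private to this channel
};

// Duplicator: one mono processor per channel, all built on the same state.
struct Duplicator
{
    std::shared_ptr<Coefficients> state = std::make_shared<Coefficients>();
    std::vector<MonoFilter> channels;

    void prepare (std::size_t numChannels)
    {
        channels.assign (numChannels, MonoFilter (state));
    }
};

// Copies the new values INTO the shared object; the pointer itself is not
// replaced, so every channel picks up the change.
inline void setCutoffForAllChannels (Duplicator& d, float a)
{
    *d.state = *Coefficients::makeOnePole (a);
}
```

Assigning the pointer instead (d.state = ...) would leave the per-channel filters pointing at the old object in this sketch, which is one way to read why the original line dereferences both sides.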

  • Indeed, IIR::Filter was the perfect example of a mono processor class which might need to be augmented with multi-channel capabilities using the ProcessorDuplicator class, so that it could be treated as a whole as a ProcessorChain compatible class. For this to happen, it sounds logical that some data must be different for every channel (the memory for previous samples) while other data must be the same everywhere and assignable in one go (the coefficients of the filters and all the associated internal processing variables). The current implementation of the ProcessorDuplicator class therefore needs access to a given “State” class which provides the common data for all the mono processor instances. That means that your mono processing class must provide a “State” class as well, which is passed in the template arguments of ProcessorDuplicator.
  • One odd thing : the demo application uses the class ProcessorWrapper, but the “example” classes (dsp::Bias, dsp::Gain, dsp::Oscillator, dsp::Reverb and dsp::WaveShaper) don’t use it, and still they are compatible with ProcessorChain ! (it’s magical)
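
A plausible explanation for the “magic”, offered as a guess rather than a description of the JUCE source: a variadic template container only needs each processor to expose the right member functions, so no common base class or wrapper is required. The sketch below uses hypothetical stand-ins (Spec, Gain, Bias, HardClip, Chain, with std::vector<float> standing in for an audio block) and needs C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-ins, NOT the JUCE types.
struct Spec
{
    double sampleRate;
    std::size_t maxBlockSize;
    std::size_t numChannels;
};

using Block = std::vector<float>;

struct Gain
{
    float gain = 1.0f;
    void prepare (const Spec&) {}
    void process (Block& block) { for (auto& s : block) s *= gain; }
    void reset() {}
};

struct Bias
{
    float bias = 0.0f;
    void prepare (const Spec&) {}
    void process (Block& block) { for (auto& s : block) s += bias; }
    void reset() {}
};

// A user-written processor: no base class, just the same three functions.
struct HardClip
{
    void prepare (const Spec&) {}
    void process (Block& block) { for (auto& s : block) s = std::clamp (s, -1.0f, 1.0f); }
    void reset() {}
};

// Compile-time chain: calls each processor in declaration order.
template <typename... Processors>
class Chain
{
public:
    void prepare (const Spec& spec)
    {
        std::apply ([&spec] (auto&... p) { (p.prepare (spec), ...); }, processors);
    }

    void process (Block& block)
    {
        std::apply ([&block] (auto&... p) { (p.process (block), ...); }, processors);
    }

    void reset()
    {
        std::apply ([] (auto&... p) { (p.reset(), ...); }, processors);
    }

    // Index-based access, in the spirit of the get<N>() calls discussed in this thread.
    template <std::size_t Index>
    auto& get() { return std::get<Index> (processors); }

private:
    std::tuple<Processors...> processors;
};
```

With this design a missing prepare, process or reset shows up as a compile error at the point where the chain is instantiated, which is the trade-off of not having a base class that documents the contract.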

If you need any information about all this stuff, just looking into the source code, the class documentation on the JUCE website, and into these examples should give you all the answers to your questions, and some code to copy and paste as well. If there is anything else you don’t understand, feel free to post your questions here of course.
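
As a rough illustration of the replacing / non-replacing distinction described in the list above, here is a sketch with hypothetical stand-in types (Block, ContextReplacing, ContextNonReplacing, Gain). The accessor names getInputBlock and getOutputBlock and the isBypassed member are modelled on the description in this post; everything else is an assumption, and the real classes operate on dsp::AudioBlock rather than std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; not the JUCE classes.
using Block = std::vector<float>;

// One buffer for the whole process: input and output are the same storage.
struct ContextReplacing
{
    explicit ContextReplacing (Block& b) : block (b) {}

    const Block& getInputBlock() const  { return block; }
    Block&       getOutputBlock() const { return block; }

    bool isBypassed = false;

private:
    Block& block;
};

// Two buffers: the input stays untouched, results go to a separate output.
struct ContextNonReplacing
{
    ContextNonReplacing (const Block& in, Block& out) : input (in), output (out) {}

    const Block& getInputBlock() const  { return input; }
    Block&       getOutputBlock() const { return output; }

    bool isBypassed = false;

private:
    const Block& input;
    Block& output;
};

// A processor written once against "some context" works with both kinds.
struct Gain
{
    float gain = 1.0f;

    template <typename Context>
    void process (const Context& context) const
    {
        const Block& in = context.getInputBlock();
        Block& out = context.getOutputBlock();
        assert (in.size() == out.size());

        // When bypassed, pass the input through unchanged.
        for (std::size_t i = 0; i < in.size(); ++i)
            out[i] = context.isBypassed ? in[i] : in[i] * gain;
    }
};
```

The point of the extra abstraction is that the processor body does not care whether it is running in place or not; only the caller decides, by choosing which context to build.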

Let’s add some additional remarks :

  • If you want to create your own audio processing classes so they are compatible with ProcessorChain, the obvious choice is to derive from ProcessorWrapper, but you can also follow the “examples” processor classes included in the DSP module
  • I think you can learn a lot about templating and other topics simply by looking into the source code of these classes so don’t hesitate to do so !
  • It is possible to use the dsp::FastMathApproximations functions or anything based on dsp::LookupTable with the Oscillator and WaveShaper classes
  • The class dsp::Convolution is compatible with ProcessorChain, but that’s not the case for Oversampling, since it is used quite differently from the simple processing classes.

Why your feedback is going to be important

So I would like to know what you think about all of this. What’s your experience with the Processor classes ? Did it change your habits ? Have you been influenced by the way these classes have been written in the way you code now ?

I’ll write something very soon about the audio parameters in JUCE, since the JUCE team has recently stated that they are going to start working on them again : We are removing AudioProcessor-based parameter management . I think this upcoming development is going to interact with the way the audio processor classes are designed, and that it might imply some updates in the DSP module Processor classes. That’s why your feedback is important here, so that the JUCE team has more information to guide their take on audio parameters.

Tell me if you have any remarks, and @fabian and @jules, don’t hesitate to correct me if needed.


#2

Frankly, I haven’t found much use for them. I would like to be able to dynamically construct and manipulate DSP graphs but the DSP module design seems to be geared for compile time constructs. (Maybe I am wrong and the stuff can be used dynamically at runtime without things getting too contrived…?)

Even simple things seem to be quite difficult and verbose to do with the new DSP module classes. (Yes, I know the examples exist, but what if the examples don’t cover a particular use case in any obvious way?)

I would have liked to see development (such as multicore CPU support) on the AudioProcessorGraph or a new equivalent. (“Equivalent” because maybe a similar thing could be done with a bit lighter base class for the nodes than what AudioProcessor is.)


#3

I would have liked to see development (such as multicore CPU support) on the AudioProcessorGraph or a new equivalent.

This.


#4

A big advantage I see in the new templated classes is that the structures are created at compile time, rather than through a long dynamic sequence of calls in the constructor to set up a graph.
Most plugins will have a fixed processing chain anyway, so this increases performance for the user. It also means that the compiler helps to spot errors at an early stage.

I agree, that it is good, if you can create a processor graph on the fly, so I would appreciate to have some dynamic glue in the future, similar to the AudioProcessorGraph. But to be honest, it is probably a minority of projects that need that.

I haven’t started many projects recently, which is why I don’t have much experience with it yet, except for a few proofs of concept, but I will definitely use them for anything coming up.


#5

Omnisphere effects, Amplitube, Guitar Rig 3/4+, Superior Drummer, Kontakt, Ozone, Logic’s own Guitar Pedalboard etc. all claim otherwise. I think that having dynamic processing chains should be the default approach, as then the compiler can decide if the chain will never change and optimize accordingly.


#6

I never said that they don’t exist, but you have to agree that the majority is a set chain of filters and other processors. You still can parameterise each processor.
Also I started my sentence with “I agree, that it is good, if you can create a processor graph on the fly”, so I don’t see a contradiction here…

When the plugin has the option to change the processing chain at the user’s choice, the compiler is long out of the game…

I understand both options should be possible, just the new approach added the statically optimised version, which I was welcoming. I don’t see anything wrong with that…

Cheers


#7

Glad to see this topic come up! The DSP module came out just as I was starting a new project, so I’ve built my signal chain almost entirely using these new processor constructs.

Pros:
One thing I’ve been thoroughly enjoying is the ease of composition. I wrote my own ProcessorParallel class, like ProcessorChain but which runs each processor in parallel and sums the results into the output buffer. This allows me to define relatively complicated signal flow trees with ease:

public:
using Lows = ProcessorChain<Lowpass<float>, Bias<float>, Gain<float>, WaveShaper<float>>;
using Highs = ProcessorChain<Highpass<float>, Bias<float>, Gain<float>, WaveShaper<float>>;

ProcessorParallel<Lows, Highs> m_bandSplit;

And with one prepare and one process call I get all of that signal chain computed easily and efficiently.

I also find the Processor examples very easy to follow, I’ve written many of my own processor nodes that are all compatible with this chain. I also wrapped the dsp::Oversampling class with a ProcessorOversampling<> class which composes a child and runs it on the oversampled buffer: ProcessorOversampling<float, WaveShaper<float>>, for example. This way I can use the oversampling code in my compositional definitions.
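
To picture what such a wrapper might look like, here is a deliberately naive sketch. It is not the poster’s ProcessorOversampling class and it does not use dsp::Oversampling: the 2x resampling is done by sample repetition and averaging with no anti-aliasing filters, purely so that the “upsample, run the child, downsample” pattern stays visible. The names OversampledProcessor and HardClip, and the prepare/process signatures, are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical child processor with an assumed prepare/process interface.
struct HardClip
{
    void prepare (double /*sampleRate*/, std::size_t /*maxBlockSize*/) {}

    void process (std::vector<float>& block)
    {
        for (auto& s : block)
            s = s > 1.0f ? 1.0f : (s < -1.0f ? -1.0f : s);
    }
};

// Wraps a child so it can sit in a chain like any other processor,
// while the child itself runs on a buffer at twice the sample rate.
template <typename Child>
struct OversampledProcessor
{
    Child child;

    void prepare (double sampleRate, std::size_t maxBlockSize)
    {
        upsampled.assign (2 * maxBlockSize, 0.0f);            // allocate once, up front
        child.prepare (2.0 * sampleRate, 2 * maxBlockSize);   // child sees the doubled rate
    }

    void process (std::vector<float>& block)
    {
        const std::size_t n = block.size();
        assert (2 * n <= upsampled.capacity());  // prepare() must have reserved enough room
        upsampled.resize (2 * n);                // stays within capacity, so no allocation

        // Naive upsampling: repeat every sample twice.
        for (std::size_t i = 0; i < n; ++i)
            upsampled[2 * i] = upsampled[2 * i + 1] = block[i];

        child.process (upsampled);

        // Naive downsampling: average each pair of samples.
        for (std::size_t i = 0; i < n; ++i)
            block[i] = 0.5f * (upsampled[2 * i] + upsampled[2 * i + 1]);
    }

private:
    std::vector<float> upsampled;
};
```

A real wrapper would delegate the two resampling steps to a proper oversampling stage with filtering; only the composition pattern is the point here.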

Cons:
I’ll echo what’s been said above about runtime rearrangement of nodes, but aside from that I’ve found a couple other things that are really complicated by these Processor classes:

When I want to reach into my structure to update values based on processor parameters, the nested .get<0>().get<1>().get<0>()… is brittle and hard to reason about. If I swap my processor definition around at all, I have a lot of indices to go back and edit. I’ve considered adding an enum value to each of my processor templates: Gain<float, kLowGainNodeId> such that I can m_bandSplit.getById<kLowGainNodeId>() to resolve this problem, which I found fairly straightforward using boost::Hana, but I’m not sure that’s the best approach.
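
For what it’s worth, the id-based lookup can also be sketched with the standard library alone. This is a hedged illustration, not the poster’s Boost.Hana code: it assumes a flat (non-nested) chain in which every processor type exposes a static id member, and the names NodeId, Gain, Chain, indexOfId and getById are hypothetical. Nested chains would need a recursive search on top of this.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Hypothetical node ids, one per processor that needs to be reachable.
enum NodeId { kLowGainNodeId, kHighGainNodeId };

// Each processor type carries its id as a compile-time constant.
template <NodeId Id>
struct Gain
{
    static constexpr NodeId id = Id;
    float gain = 1.0f;
};

// Position of the first processor whose id matches, or sizeof...(Ps) if none does.
template <NodeId Id, typename... Ps>
constexpr std::size_t indexOfId()
{
    constexpr bool matches[] = { (Ps::id == Id)... };

    for (std::size_t i = 0; i < sizeof... (Ps); ++i)
        if (matches[i])
            return i;

    return sizeof... (Ps);
}

template <typename... Ps>
struct Chain
{
    std::tuple<Ps...> processors;

    // Resolved entirely at compile time: reordering the template arguments
    // does not require touching any call site.
    template <NodeId Id>
    auto& getById()
    {
        constexpr std::size_t index = indexOfId<Id, Ps...>();
        static_assert (index < sizeof... (Ps), "no processor with this id in the chain");
        return std::get<index> (processors);
    }
};
```

Asking for an id that is not in the chain fails to compile, so the brittleness of hand-maintained indices is traded for a constraint on how processor types are declared.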

Another thing that I’m not happy with is that this format is great for defining trees at compile time, not graphs. If I have a node that needs input from its parent and a grandparent, how do I get the appropriate buffers to that node? Every process call takes only one AudioBlock.
Surely I could stuff an arbitrary number of channels into one AudioBlock such that the first 2 are input data, the second 2 are modulator data, or whatever. But then it becomes really hard to write generic processors: The first N channels are input, the remaining M are what? How big is N? M? My Gain nodes are written such that they apply gain to an arbitrary number of channels, but now they have to know to avoid the Nth through Mth channels because some node deeper in the tree is expecting [N, M] to be modulator data?

I went down that road, it gets convoluted fast. I’ve since backed out and used a different approach with a different ProcessorParallel implementation which does not merge its channels, and provides a public method to get a reference to the AudioBlock containing the Nth through Mth channel so that a child can ask a grandparent for the right buffer, and my top-level Processor is responsible for establishing all these connections. It works… but it feels wrong. I don’t think it’s flexible enough to, for example, route LFO1 to modulate a node gain value and then route LFO2 to modulate LFO1->Gain amount.

So, in summary, I think these classes have been great for my project, and maybe I’m trying to make something out of them that they were not intended for, but if we can solve the following three problems I’ll be super excited:

  1. Querying for nodes in the tree easily.
  2. Runtime rearrangement of nodes.
  3. Establishing a sensible API for graph-like connections (as opposed to strictly tree-like connections).

Thanks @IvanC!


#8

Yes, the .get<0>() notation to reach into the ProcessorChain is very convoluted and unintuitive. So far, that is my biggest complaint.

It might be good to have more comprehensive examples also. Those provided are actually quite simplistic.


#9

I’d be curious to hear more on this. Sure, JUCE gets tricky and a little hairy the closer you get to working directly with audio hardware and individual samples, but it’s still many magnitudes better than dealing with Steinberg or Apple APIs directly. And I agree with what others have said re: dynamic composition.

So far my use of the DSP module has been building lots of tiny building blocks, then gluing them all together inside an AudioProcessor to be set up in a graph at runtime.

Question: how are you avoiding priority inversion with this? Also, I assume you’re reusing threads - are you using a ThreadPool per ProcessorParallel, submitting jobs to a global object, or what?


#10

The AudioProcessorGraph bottleneck isn’t the processing… it’s sorting through all the nodes… the challenge will be improving that aspect of the code no matter how you handle the processing.

And for the record… I agree with Xenakios… multi-core support would be great (since it’s another short-coming of AudioProcessorGraph)

Cheers,

Rail


#11

I’ve been hesitant to voice my opinions about the new dsp classes. I’m sure a lot of hard work went into it and I’d hate to offend. And while I’ve been programming graphics for many years, I’m relatively new to audio programming. I also wonder how much my assembly language/straight C roots weigh on my disappointment vs the overall direction that C++ is going.

So yeah, I’m a little disappointed with the new dsp classes. I was hoping for a bunch of smaller, more general use lightweight classes (like the older IIRFilter) instead of an interdependent collection of classes. To my eyes, it seems overly clever with the amount of templating. I guess I’m much more used to “close to the metal” programming. Meaning, I’m just fine with having many of my own dsp classes have a prepareToPlay(float sampleRate, int samplesPerBlock) and process(int numSamples, float* inBuffer, float* outBuffer) rather than passing around a ProcessSpec and AudioBlock. When passing pointers to buffers, they’re ready for use vs when passing AudioBlocks, you still have to get the pointers every time. That’s just a simple example.
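
The two styles do not have to exclude each other. As a sketch under stated assumptions, a pointer-based class like the one described can be kept as is and driven by a thin per-channel adapter that fetches each channel pointer once per block. Block here is a hypothetical minimal multi-channel view (a stand-in for something like dsp::AudioBlock), and OnePoleLowpass and PerChannelAdapter are made-up names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical multi-channel view: non-owning pointers plus a length.
struct Block
{
    std::vector<float*> channels;
    std::size_t numSamples = 0;

    float* getChannelPointer (std::size_t ch) const { return channels[ch]; }
    std::size_t getNumChannels() const              { return channels.size(); }
};

// "Close to the metal" class in the style described above.
struct OnePoleLowpass
{
    float a = 0.5f;   // feedback coefficient
    float z = 0.0f;   // previous output sample

    void prepareToPlay (float /*sampleRate*/, int /*samplesPerBlock*/) { z = 0.0f; }

    // Safe to call in place (inBuffer == outBuffer).
    void process (int numSamples, const float* inBuffer, float* outBuffer)
    {
        for (int i = 0; i < numSamples; ++i)
            outBuffer[i] = z = (1.0f - a) * inBuffer[i] + a * z;
    }
};

// Adapter: one instance per channel, pointers fetched once per block.
template <typename PointerDsp>
struct PerChannelAdapter
{
    std::vector<PointerDsp> instances;

    void prepare (float sampleRate, int samplesPerBlock, std::size_t numChannels)
    {
        instances.assign (numChannels, PointerDsp{});

        for (auto& dsp : instances)
            dsp.prepareToPlay (sampleRate, samplesPerBlock);
    }

    void process (const Block& block)
    {
        assert (block.getNumChannels() == instances.size());

        for (std::size_t ch = 0; ch < instances.size(); ++ch)
        {
            float* data = block.getChannelPointer (ch);
            instances[ch].process (static_cast<int> (block.numSamples), data, data);
        }
    }
};
```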

The other day I thought about using a polyphase filter in my own oversampling code and thought to look at the one inside the dsp module. After a few minutes, my head hurt and I gave up.

Also, for as much as users are told on this forum to read through the JUCE source code to understand what’s going on, why the proliferation of auto variables? To me, it just makes it harder to read and understand what’s going on.

I realize that uint32 may be more technically correct, but this looks overwrought and harder to read.

dsp::ProcessSpec spec {sampleRate, static_cast<uint32>(samplesPerBlock), static_cast<uint32>(numChannels)};	
    
oversampling->initProcessing (static_cast<size_t> (samplesPerBlock));

Wouldn’t a simple int suffice?

Personally, I’d much rather have self-contained convolution and filter classes. Again, please forgive my old fashioned-ness or if I’ve misunderstood the overall idea. I like to know and understand every line of code in my plugins and have them run very tight and efficient. I’ve just started playing with this module, but when I read about .get<0>().get<1>().get<0>()… above, it’s a pretty hard sell for me to keep checking it out.


#12

Ah, I’m realizing I could have been more clear there. My ProcessorParallel class does not actually run processors in separate threads, it’s perhaps just a poorly chosen name. The dsp::ProcessorChain class lets you run arbitrary processors in series, each working on the same AudioBlock after its previous sibling had a chance to run. My ProcessorParallel class allocates separate buffers for each of the N processors and uses a ProcessContextNonReplacing to have them each write their output to the appropriate separate buffer. Perhaps a better way to put it is that dsp::ProcessorChain lets me control the depth of my tree, and ProcessorParallel lets me control the breadth. I’ll upload the class so you can take a look at it (feedback welcome!): ProcessorParallel.h (4.2 KB)

My toplevel processor can use this to ask ProcessorParallel nodes for their internal AudioBlock, so that I can string modulator data through to other nodes that need it, for example.
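
For readers who don’t want to open the attachment, the description above can be sketched roughly as follows. This is not the attached ProcessorParallel.h: it is a reconstruction from the prose, with hypothetical names (Block, Scale, Parallel, getBranchOutput) and a simplified non-replacing process (input, output) signature, and it needs C++17. It keeps one private buffer per branch, lets each branch write into its own buffer, sums the branches into the output, and also exposes each branch buffer for routing elsewhere.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

using Block = std::vector<float>;   // hypothetical stand-in for an audio block

// Example branch: reads the input, writes to a separate output (non-replacing).
struct Scale
{
    float factor = 1.0f;

    void process (const Block& in, Block& out) const
    {
        for (std::size_t i = 0; i < in.size(); ++i)
            out[i] = in[i] * factor;
    }
};

template <typename... Processors>
class Parallel
{
public:
    // Allocate one buffer per branch up front, never in process().
    void prepare (std::size_t maxBlockSizeToUse)
    {
        maxBlockSize = maxBlockSizeToUse;

        for (auto& buffer : buffers)
            buffer.assign (maxBlockSize, 0.0f);
    }

    void process (const Block& input, Block& output)
    {
        assert (input.size() <= maxBlockSize);
        assert (output.size() == input.size());

        // Every branch sees the same input and fills its own buffer.
        processBranches (input, std::index_sequence_for<Processors...>{});

        // Merge step: sum the branch buffers into the output.
        for (std::size_t i = 0; i < input.size(); ++i)
        {
            float sum = 0.0f;

            for (const auto& buffer : buffers)
                sum += buffer[i];

            output[i] = sum;
        }
    }

    // Lets another node read one branch's result directly.
    // The returned buffer is maxBlockSize samples long.
    const Block& getBranchOutput (std::size_t branch) const { return buffers[branch]; }

    template <std::size_t Index>
    auto& get() { return std::get<Index> (processors); }

private:
    template <std::size_t... Is>
    void processBranches (const Block& input, std::index_sequence<Is...>)
    {
        (std::get<Is> (processors).process (input, buffers[Is]), ...);
    }

    std::tuple<Processors...> processors;
    std::array<Block, sizeof... (Processors)> buffers;
    std::size_t maxBlockSize = 0;
};
```

Despite the name, nothing here runs on separate threads; the branches are simply evaluated one after another on the same input.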


#13

Hey JUCE team, any updates/thoughts on this conversation?