Getting started with audio processing

StrangeMan · September 5, 2012, 10:18pm

Hello Community!

Once more, I would appreciate some hints and ideas from you.
One of my current projects is at the stage of defining its structure and interfaces. Since audio processing is completely new to me, I find it pretty difficult to define my interfaces correctly and efficiently. The web gives plenty of information on digital audio processing in general, but aside from the STK tutorial, nothing really seemed to help me. The huge number of JUCE classes related to audio processing is really overwhelming, and I find it hard to find an entry into this world of audio programming. That’s why I am asking here.

What I am trying to create is a structure of independent modules, each receiving and producing audio and MIDI data. Much like in a modular synthesizer, I’d like to create connections between those modules so the output (audio as well as MIDI) of one can be fed into the input(s) of other modules. The problems begin with the form of processing. From what I have read so far, two concepts are available, both having their pros and cons:
[list=1][*]Sample-based processing. This is generally slower due to higher overhead. Also, not all kinds of algorithms can be implemented this way (e.g. FFT). It can only be transformed into frame-based processing by introducing additional delay.[/*]
[*]Frame-based processing. Faster due to less overhead. Frame-based processing can easily be disassembled into sample-based processing using a simple loop - without introducing delay.[/*][/list]
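To illustrate the second point, here is a minimal sketch of how a block-based call can be built from a per-sample function with a simple loop. The names `HalfGain`, `tick` and `processBlock` are hypothetical and use only standard C++, not any JUCE or STK class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-sample processor: one sample in, one sample out.
struct HalfGain
{
    float tick (float input) const { return input * 0.5f; }
};

// Block-based wrapper: processes a whole buffer in place by looping over it.
// No extra delay is introduced, because every output sample depends only on
// input that is already present in the buffer.
template <typename TickProcessor>
void processBlock (TickProcessor& processor, std::vector<float>& buffer)
{
    for (std::size_t i = 0; i < buffer.size(); ++i)
        buffer[i] = processor.tick (buffer[i]);
}
```

The per-call overhead is paid once per block rather than once per sample, which is where the speed advantage of the frame-based form comes from.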
Now, when I think about the connections between my modules, I understand how this could be implemented using the sample-based approach: every output just calls all the connected inputs. Feedback loops could be resolved by introducing a one-sample delay. Even mixing audio together seems easy.
However, I have no idea how to realize those connections with the frame-based approach.
So this would be my first question to you: What is the way to go for me?
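As a sketch of the one-sample delay idea for feedback loops in the sample-based approach: the output of a module is stored and only fed back on the next call. The class name `FeedbackLoop` and its members are hypothetical, chosen for illustration only.

```cpp
// Hypothetical feedback path: the output is fed back into the input,
// delayed by exactly one sample so that the loop can be evaluated.
class FeedbackLoop
{
public:
    explicit FeedbackLoop (float feedbackGain) : gain (feedbackGain) {}

    float tick (float input)
    {
        const float output = input + gain * previousOutput;
        previousOutput = output;   // becomes available to the loop one sample later
        return output;
    }

private:
    float gain;
    float previousOutput = 0.0f;
};
```

With a feedback gain below 1, an impulse decays geometrically, which is the behaviour of a simple one-pole recursive filter.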

As a second question I’d like to ask you: Where do I start understanding the JUCE audio classes? Most likely there is a common logic that can be found in any of those classes, but I haven’t had much success finding it.

Thank you for your answers,
StrangeMan

PS: If you know of any good example code (preferably JUCE-based) or tutorials, please post links to them, too!

jrlanglois · September 5, 2012, 11:08pm

All in all, everything related to your questions can be answered by looking into the following classes, and how they work together:

[list]
[*]juce::AudioDeviceManager[/*]
[*]juce::AudioProcessorPlayer[/*]
[*]juce::AudioProcessorGraph[/*]
[*]juce::AudioProcessor[/*][/list]

As for your question about “processing concepts”: everything boils down to samples. JUCE starts from there, as the basis for dealing with sound in the digital realm, and you can implement your “frame-based processing” atop of such as your own custom system, if that’s what you require.

jrlanglois · September 5, 2012, 11:12pm

If you are looking into developing your own plugins, take a look at the Introjucer (located in the extras folder from the git repository).

It has a handy project setup for easily creating VSTs, RTAS, AAX and AudioUnit plugins.

StrangeMan · September 6, 2012, 7:08am

Hello!

Thank you for your answer. I’m already familiar with the Introjucer - a great tool for managing a codebase that should compile on multiple platforms. My current project is a plugin and I already use the Introjucer for it.

Thank you for the list of classes. I’ve already spent some time reading through the AudioProcessor class, as it really seems to be a base class. Most of it already seems clear to me, except for the processing part. The main point is that there is only this “processBlock”, and I can’t really imagine how to work with multiple coupled instances of “AudioProcessor”.
I’ll try to explain:

I:

Imagine I have the following configuration of two AudioProcessors A and B (one in, one out), as well as a C (two in, one out): An input source is fed into A and B. Both process their input and pass it to an individual input of C. C mixes the audio back together. Now, let’s say the sound source has a new block ready to be processed. It would e.g. pass it to A first. A processes it and passes it to C by calling the processBlock method of C. C knows that it has two connected inputs but has received data from the first only. It therefore writes the received audio into a buffer for later processing and returns from C::processBlock. Now we’re back in A::processBlock. There is nothing left to do here, so it returns, too. Now we’re back at the audio source, which calls B::processBlock. B processes its audio and calls C::processBlock. C now has all the data it needs and starts mixing together the sound from A (stored in the buffer) and B (just received).

(I hope that was clear)

Now, that’s fine so far. However, one problem exists: if they are all AudioProcessors, it would not work, simply because C would not wait for its second input to be ready. Also, what would happen if one of C’s inputs is not connected? C would need to know about the state of its inputs - AFAIK the AudioProcessor class does not have such intelligence. So the only solution is to set up a manager that controls multiple AudioProcessors and calls their processBlock when all data is available. The AudioProcessors would not know what they are connected to. From what I have seen so far, this is what the AudioProcessorGraph does.

So from what I see right now, it would be necessary to implement all of my “modules” (mentioned in the first post) as AudioProcessors (or as AudioProcessorGraphs, since they inherit from AudioProcessor) and set up one AudioProcessorGraph that manages the others and their connections globally. Is that right?
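The "manager" idea can be sketched in plain C++ as follows. This is not how juce::AudioProcessorGraph is implemented; `Graph`, `Module`, `addNode`, `connect` and `process` are hypothetical names, and the sketch assumes that nodes are added in a valid processing order, that there are no feedback connections, and that every module returns a block of the same length as its input. As a simplification, a node with several sources receives their sum instead of separate inputs.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Block = std::vector<float>;

// Hypothetical module: turns an input block into an output block of the same
// length. It knows nothing about what it is connected to.
using Module = std::function<Block (const Block& input)>;

class Graph
{
public:
    // Adds a module and returns its node index.
    std::size_t addNode (Module module)
    {
        modules.push_back (std::move (module));
        sources.emplace_back();
        return modules.size() - 1;
    }

    // Connects the output of `from` to the input of `to`.
    // Assumption: `from` was added before `to`, so the node order is already
    // a valid processing order.
    void connect (std::size_t from, std::size_t to) { sources[to].push_back (from); }

    // Processes one block. Node 0 receives the external input; every other
    // node receives the sum of its connected sources (silence if it has none).
    // Returns the output of the last node.
    Block process (const Block& externalInput)
    {
        std::vector<Block> outputs (modules.size());

        for (std::size_t node = 0; node < modules.size(); ++node)
        {
            Block input (externalInput.size(), 0.0f);

            if (node == 0)
                input = externalInput;

            for (std::size_t source : sources[node])
                for (std::size_t i = 0; i < input.size(); ++i)
                    input[i] += outputs[source][i];

            // All sources have already run, so the module is called exactly once.
            outputs[node] = modules[node] (input);
        }

        return outputs.empty() ? Block() : outputs.back();
    }

private:
    std::vector<Module> modules;
    std::vector<std::vector<std::size_t>> sources;
};
```

In the A/B/C scenario, C is called only after both A and B have produced their output for the current block, and an unconnected input simply contributes silence, so no module needs to track the state of its inputs.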

[quote=“jrlanglois”]and you can implement your “frame-based processing” atop of such as your own custom system, if that’s what you require.[/quote] I don’t understand that sentence. AudioProcessor already implements a frame-based processing, doesn’t it?

Also, you did not mention the AudioPlayHead so far. It seems to be there to supply additional information about the transport. But what if I ask it for the current ppq position? It returns (kind of) one absolute position. But the processBlock method does not represent one point in time - it’s more like a small time slice. So where is the ppq position located in that time slice? At the beginning, at the first sample of that block? That would mean that in an AudioProcessor subclass I could decide to break my processing down into sample-based processing and calculate the ppq position for each sample using the starting point, the current sample rate and the bpm. Again: is that right?
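The per-sample calculation described here can be sketched as below. It assumes that the reported position is measured in quarter notes, that it refers to the first sample of the block, and that the tempo does not change within the block. The function name `ppqAtSample` and its parameters are hypothetical.

```cpp
// Musical position (in quarter notes) of the sample at `sampleIndex` within a
// block, assuming `blockStartPpq` refers to the block's first sample and the
// tempo is constant for the duration of the block.
double ppqAtSample (double blockStartPpq, int sampleIndex, double bpm, double sampleRate)
{
    // bpm quarter notes per 60 seconds, sampleRate samples per second.
    const double quarterNotesPerSample = bpm / (60.0 * sampleRate);
    return blockStartPpq + sampleIndex * quarterNotesPerSample;
}
```

For example, at 120 bpm and 48000 Hz one quarter note spans 24000 samples, so sample 24000 of a block lies about one quarter note after the block start.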

Thank you,
StrangeMan

StrangeMan · September 6, 2012, 8:36am

I have some more questions for you. Some of the AudioProcessor methods need clarification for me…

First: programs
AFAIK programs seem to be what is commonly known as presets/patches - a set of configuration details that describe all of the options available in a plugin. [list=1][*]Does the host make separate calls to set the parameters, or is all information about the parameters contained inside the binary blob that is stored by the host? That is: does the AudioProcessor need to take care of saving its parameters into that blob or not?[/*]
[*]The number of programs must not change - how is that possible? The user might save a new preset from the current setup and, just like that, the number of programs has changed.[/*][/list]

Second: parameters
AFAIK parameters in the context of the AudioProcessor are those controls of a plugin that can be controlled from the outside, e.g. via parameter automation in a host.
[list=1][*]The number of parameters as well as their names is fixed and must not change during the lifetime of the AudioProcessor (so it is written in the description of getNumParameters()). I noticed that commercial VST plugins with a varying number of parameters (such as NI Reaktor, which I have used a lot, or NI Guitar Rig) simply tell the host that they have a huge number of parameters available. Their names are only numbers. When they internally create a new control that should be automatable, they assign a free one of these to the control. That seems like a dirty but working solution. But now I found this: [quote]/** The filter can call this when something (apart from a parameter value) has changed.

    It sends a hint to the host that something like the program, number of parameters,
    etc, has changed, and that it should update itself.
*/
void updateHostDisplay();[/quote] ?? The number of parameters has changed? I thought that was fixed?[/*][/list]
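The "huge fixed pool of numbered parameters" workaround described above could look roughly like this. It is only a sketch of the bookkeeping, not code from any of the plugins mentioned; `ParameterSlotPool`, `assignFreeSlot` and `releaseSlot` are hypothetical names, and the two getter names merely mirror the style of the AudioProcessor methods.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical fixed pool of host-visible parameter slots. The host always
// sees `numSlots` parameters; internal controls claim and release slots.
template <std::size_t numSlots>
class ParameterSlotPool
{
public:
    // The count reported to the host never changes.
    constexpr std::size_t getNumParameters() const { return numSlots; }

    // Slot names are just numbers.
    std::string getParameterName (std::size_t slot) const { return std::to_string (slot); }

    // Claims the first free slot for a new control; returns -1 if none is left.
    int assignFreeSlot()
    {
        for (std::size_t slot = 0; slot < numSlots; ++slot)
        {
            if (! inUse[slot])
            {
                inUse[slot] = true;
                return static_cast<int> (slot);
            }
        }

        return -1;
    }

    // Releases a slot when its control is removed; invalid indices are ignored.
    void releaseSlot (int slot)
    {
        if (slot >= 0 && static_cast<std::size_t> (slot) < numSlots)
            inUse[static_cast<std::size_t> (slot)] = false;
    }

private:
    std::array<bool, numSlots> inUse {};   // all slots start out free
};
```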

Third: AudioProcessorGraph
The AudioProcessorGraph is an AudioProcessor itself, and as such implements all the parameter/program stuff. Now, when the host asks an AudioProcessorGraph to save its current state as a binary blob, will it ask its internal AudioProcessors to do the same and then merge all the data together? Will it also save the connections? What will happen when the host asks the graph to load a binary blob? It can’t re-create the nodes that were present when that blob was saved. What will it do?
The same problem applies to parameters. When asked for its parameters, will it return the sum of all its AudioProcessors’ parameters? If so, what happens when I add a node (the number of parameters would change!)?

Sorry for creating such a long post again…

StrangeMan

jrlanglois · September 6, 2012, 2:41pm

With all these questions, I’m surprised you haven’t tried writing some small internal processors yourself, and creating instances of 'em in the Plugin Host! I’m not sure I quite understand your scenario; could you please demonstrate it with a diagram of processors with all the connections?

So you have various objects (“modules”) that are intended to be filters, or instruments. Yes, these would need to derive from juce::AudioProcessor. It’s the only way they can exist in a world of processors (aka In a graph of processors… aka Inside a juce::AudioProcessorGraph). Therefore, what you’re saying is correct.

No, not at all. Like I said previously; audio boils down to audio samples - all audio processing is done in samples. Frame-based audio only exists where there is video. (Like in your favourite video editor). To put what I said more simply: if you wanted code that deals with audio in the style of frames, you would have to write it yourself! I wouldn’t suggest it, though, since it’s pretty much nonsense (I repeat: everything audio in the digital world boils down to audio samples!). Such a thing is only important when it comes to writing audio to a video file, or reading a video file that has audio.

Yes, I know (I was trying to keep this simple at first…), and yes, exactly. A host should hand over all the timeline information to a processor via a form of this object. In JUCE, the PPQ position is the host’s current position in musical time, measured in quarter notes (instead of seconds… MIDI anyone?).

If I’m not mistaken, juce::AudioProcessor::processBlock is automatically and continuously polled if there are connections to the processor, and if the whole graph is connected to a device. Pretty difficult telling time in there since you still need a basis of some sort! Which means the only way your plugin can tell what time it is in the host’s timeline, is by having your plugin ask for it! [See here]

jrlanglois · September 6, 2012, 2:59pm

As for your later post, you should really look into how the Plugin Host saves and loads the filtergraph files, and what information is stored in one.

I haven’t personally gotten to dealing with presets, but I know that the processors are to deal with saving and recalling states internally. How else would a processor do that (I think it’s safe to assume that magic isn’t a valid option… :))?

That solution is pretty well the only solution. As for what you have pointed out; the comment Jules put on updateHostDisplay seems like a typo… but I could be wrong.

StrangeMan · September 6, 2012, 3:22pm

Hello jrlanglois,

Thank you for your answers. You asked for a more detailed description of my scenario: Well, it comes down to having multiple AudioProcessors, that can be connected to each other as well as to some input sources and some outputs in various ways. From your answers I think you got it just right.

[quote=“jrlanglois”]Like I said previously; audio boils down to audio samples - all audio processing is done in samples. Frame-based audio only exists where there is video.[/quote] Sorry, I guess we have a misunderstanding here. By frame-based processing I meant calling a method similar to “processBlock” that processes a bunch of samples at once. By sample-based processing I meant something like the “tick” methods found in STK classes.

[quote=“jrlanglois”]Pretty difficult telling time in there since you still need a basis of some sort! [/quote] I don’t want to get an absolute time (e.g. 6th of September 2012, 4pm); I’m interested in the ppq position for each sample. Let’s say an AudioProcessor’s processBlock is called with a buffer of 100 samples. I can ask the playhead for the position in the host’s timeline. The playhead gives me exactly one ppq position - but I have 100 samples, each having its individual position in the timeline.
Now let’s say I am trying to create a drum machine that aligns its drum samples to the beat. When a new measure begins, I have to start playing back the audio for a drum exactly at the beginning of that measure - at its first sample. That’s why it’s important to know what the playhead returns as the ppq position.
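For the drum machine case, the sample at which a new measure starts inside the current block could be located as in the following sketch. It assumes that the reported position is in quarter notes and refers to the first sample of the block, that bars are counted from position 0, and that tempo and time signature stay constant within the block. All names are hypothetical.

```cpp
#include <cmath>

// Returns the index of the first sample in the current block that falls on a
// bar line, or -1 if no bar starts inside this block.
// `quarterNotesPerBar` is e.g. 4.0 for 4/4 time.
int firstBarStartInBlock (double blockStartPpq, int numSamples,
                          double bpm, double sampleRate, double quarterNotesPerBar)
{
    const double samplesPerQuarterNote = 60.0 * sampleRate / bpm;

    // Position of the next bar line at or after the block start.
    const double nextBarPpq = std::ceil (blockStartPpq / quarterNotesPerBar) * quarterNotesPerBar;

    // Round up so playback never starts before the bar line.
    const int offset = static_cast<int> (std::ceil ((nextBarPpq - blockStartPpq) * samplesPerQuarterNote));
    return offset < numSamples ? offset : -1;
}
```

A drum voice would then begin writing its audio at the returned index and continue in subsequent blocks. Floating-point rounding can shift the result by one sample in edge cases, which a real implementation would have to decide how to handle.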

[quote=“jrlanglois”]I haven’t personally gotten to dealing with presets, but I know that the processors are to deal with saving and recalling states internally. How else would a processor do that (I think it’s safe to assume that magic isn’t a valid option… :))?[/quote] The processor is not the only entity that can edit its parameters. The host can set the parameters of a plugin as well. It would be possible that a preset is actually recalled by setting all those parameters to specific values. The binary blob could still serve other purposes in this scenario (such as storing values for all “non-exported parameters”). Sorry if that was unclear. I find this hard to explain, but maybe you got the idea.
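To make the "binary blob for non-exported values" idea concrete, here is a minimal sketch of packing internal state into an opaque byte block and restoring it. It does not use or describe the JUCE state API; `InternalState`, `saveState` and `loadState` are hypothetical, and the layout uses the machine's native byte order, so it is not portable between platforms as written.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical plugin state: values that are not exposed as host parameters.
struct InternalState
{
    float drive = 0.0f;
    int   voiceCount = 1;
};

// Packs the state into an opaque byte blob that a host could store.
std::vector<unsigned char> saveState (const InternalState& state)
{
    std::vector<unsigned char> blob (sizeof (float) + sizeof (int));
    std::memcpy (blob.data(), &state.drive, sizeof (float));
    std::memcpy (blob.data() + sizeof (float), &state.voiceCount, sizeof (int));
    return blob;
}

// Restores the state from a blob; returns false if the size does not match.
bool loadState (const std::vector<unsigned char>& blob, InternalState& state)
{
    if (blob.size() != sizeof (float) + sizeof (int))
        return false;

    std::memcpy (&state.drive, blob.data(), sizeof (float));
    std::memcpy (&state.voiceCount, blob.data() + sizeof (float), sizeof (int));
    return true;
}
```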

[quote=“jrlanglois”]As for your later post, you should really look into how the Plugin Host saves and loads the filtergraph files, and what information is stored in one.[/quote] Good idea, I forgot that the example for hosting could help me here.

Anyway, thank you for the answers that you gave me! Especially pointing me to the AudioProcessorGraph was great help.
StrangeMan

jrlanglois · September 6, 2012, 3:40pm

So to clarify by that logic, 1 frame is actually 1 run-through of processBlock. The “bunch of samples” would be all samples passed into processBlock, where the total number of samples is determined by the buffer size.

I haven’t looked into the STK stuff, so I don’t really know what you’re referring to in this case - sorry!

As I already explained, the host provides the time. And if I wasn’t clear: it’s not absolute, it’s based on the host’s timeline! See AudioPlayHead::CurrentPositionInfo. You can do all your logic related to its PPQ by calling the method I linked in the previous post. [See here]

StrangeMan · September 6, 2012, 3:51pm

Oh, how stupid. I did not read this carefully enough… Well, then it’s all clear.

The tick methods can be found in numerous STK classes and look like this: float tick (float input)
So basically they take one sample, process it and return the result.

jrlanglois · September 6, 2012, 4:00pm

Ah I see.

So say you were designing an effect plugin: an approach analogous to “STK ticks” in processBlock would be to literally iterate through every sample of the AudioSampleBuffer object passed in as a parameter, applying the desired effect to each one.

void MyCustomProcessor::processBlock (juce::AudioSampleBuffer& buffer, juce::MidiBuffer& /*midiMessages*/)
{
    // Scales every sample of every channel in place, so mono and stereo both work.
    // Note: getSampleData() is the accessor from 2012-era JUCE; newer versions
    // use getWritePointer() instead.
    const int numSamples = buffer.getNumSamples();

    for (int channel = 0; channel < buffer.getNumChannels(); ++channel)
    {
        float* samples = buffer.getSampleData (channel);

        for (int i = 0; i < numSamples; ++i)
            samples[i] *= 0.1f;    // Lowers the amplitude
    }
}

StrangeMan · September 6, 2012, 4:19pm

Sure, that’s what I meant in my first post: [quote=“I”]Frame-based processing can easily be disassembled into sample-based processing using a simple loop - without introducing delay.[/quote]
Alternatively, you can reverse the whole thing from tick to processBlock by introducing additional delay: with every call of tick, you fill a cell in a buffer until the buffer is full. Then you call processBlock of some other component and hand the buffer to it. However, that will delay the processing by the length of the buffer.
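The reverse direction described above can be sketched as an adapter that collects samples until a block is full. `TickToBlockAdapter` and `BlockProcessor` are hypothetical names, the block processor is any callable that modifies a buffer in place, and the block size is assumed to be greater than zero.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical adapter: offers a per-sample tick() interface on top of a
// block-based processor. Samples are collected until a full block is
// available, so the output lags the input by `blockSize` samples.
class TickToBlockAdapter
{
public:
    using BlockProcessor = std::function<void (std::vector<float>&)>;

    // Assumption: blockSize > 0.
    TickToBlockAdapter (std::size_t blockSize, BlockProcessor processor)
        : input (blockSize, 0.0f), output (blockSize, 0.0f), process (std::move (processor))
    {
    }

    float tick (float sample)
    {
        // Emit the result of the previous block (silence during the first one).
        const float result = output[position];
        input[position] = sample;

        if (++position == input.size())
        {
            output = input;
            process (output);   // the block is full: process it in one call
            position = 0;
        }

        return result;
    }

private:
    std::vector<float> input, output;
    BlockProcessor process;
    std::size_t position = 0;
};
```

With a block size of 2 and a processor that doubles every sample, feeding 1, 2, 3, 4 returns 0, 0, 2, 4: each result appears exactly one block length after its input.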

Thank you for your help jrlanglois!

StrangeMan
