I have found that the AudioProcessorGraph quickly slows down to an unusable point when doing graph manipulations with more than just a few nodes and connections. It happens in buildRenderingSequence and I think isAnInputTo is a big part of it. It appears that the manner in which the nodes are connected can have a major impact on it. I have verified this behaviour with the Plugin Host example and have attached a couple examples.
This first image shows a graph that takes just over 2 seconds to create or remove a single connection. It has 23 nodes and 42 connections.
In this second one, I have added just 2 more nodes (on the left) and the time required to add a single connection between them has jumped to over 8 seconds.
In my own application when using 100 nodes and 190 connections, arming a track is currently taking 25 seconds, so I am very interested to hear what you think about the situation.
Hmm, Iāve never really tried using that class with more than a few effects, so not hugely surprised that it might not scale linearly! The algorithm it uses to sort the nodes into the correct rendering order certainly looks like itād scale badly, and would need some extreme cleverness to convert it from its current naive implementation to a more efficient algorithm. Interesting problem!
Thereās only one weak spot in it, where it has to create a sorted list of all the nodes in the graph, and Iāve implemented it in a really simple way - Iām sure there are plenty of better ways to do that operation. Might involve some reading-up about graph traversal algorithms.
Thanks for taking a look at this Jules. Iām dead in the water until a fix or workaround is found so please keep me posted. Iāll try to look at it more myself but my current lack of expertise gets in the way of being more helpful.
@TheVinn, sorry I donāt think that would be so simple for me to do. You could fire up the Plugin Host example and load up it up like in the examples? Only takes a few minutes and you can save the configuration so it can be loaded up again.
I think Jules was giving advice on how someone else might tune the class rather than doing it himself Graeme.
Vincent, just load the plugin host and you can load any number of VSTs or other plugins and connect them up. You can save it then so you donāt have to do it every time.
The problem is known though. If you look at the graph class, you can see that the logic for sorting the list will be slow if the number of nodes is high. If there was a way to incrementally modify the graph state whenever you needed to add or remove a node, that would fix it but I dunno if thats possible.
Well canāt you just write a simple test program? I need something that doesnāt require user interaction that is repeatable.
Make a small Juce app that creates a bunch of nodes and connects them randomly (using juce::Random with a known seed so the results are repeatable). Hereās an example of a test app:
Vinn, I do appreciate the offer to help look at this but Iām not even sure how to load a plugin without specifying which one to load. So which plugins do you have? What platform are you on? Is it possible to load an AudioProcessor without an actual plugin instance?
[quote=āAnimaā]
I think Jules was giving advice on how someone else might tune the class rather than doing it himself Graeme. [/quote]
Thanks Anima. I trust that Jules is capable of articulating his own position on things and as such feel that my response was and still is appropriate. I did mention that I would try to look further into it, but Iād rather let the people who really know what theyāre doing take care of what they are best at. What good is it for me to start blasting a bunch of ignorant guesses around? You can be rest assured that I am following along, learning as much as I can as I go, and that I feel inclined to be as helpful as I can, whenever possible.
Okay, you said that the AudioProcessGraph operations are slow, so Iām assuming you are referring to the scaffolding code (i.e. the maintenance of connection) and not the actual sample processing.
If this is the case, then you can make a simple test program that creates an AudioProcessorGraph, and add some ādummyā nodes to it. A dummy node is a subclass of AudioProcessor that basically does nothing, and is not tied to any plugin. You can just provide simple implementations for the necessary pure virtual functions (for example, a processBlock() that does nothing).
The resulting test program would compile and work on any platform. It doesnāt actually process any audio data. With this program we could establish a benchmark of performance, identify where the hotspots are (vTune amplifier) and then speak in concrete terms on what needs to be improved.
Alright, I tweaked your example code so that it calls prepareToPlay, which builds the rendering sequence (I think this is the time consuming part). Here is the modified code, it is stand alone that runs the test a few times and shows the total time:
#include "juce.h"
class BlankProcessor : public AudioProcessor
{
public:
//==============================================================================
BlankProcessor() {}
~BlankProcessor() {}
//==============================================================================
void prepareToPlay (double sampleRate, int samplesPerBlock) {}
void releaseResources() {}
void processBlock (AudioSampleBuffer& buffer, MidiBuffer& midiMessages)
{
buffer.clear();
}
//==============================================================================
AudioProcessorEditor* createEditor() { return 0; }
bool hasEditor() const { return false; }
//==============================================================================
const String getName() const { return String::empty; }
int getNumParameters() { return 0; }
float getParameter (int index) { return 0.0f; }
void setParameter (int index, float newValue) {}
const String getParameterName (int index) { return String::empty; }
const String getParameterText (int index) { return String::empty; }
const String getInputChannelName (int channelIndex) const { return String::empty; }
const String getOutputChannelName (int channelIndex) const { return String::empty; }
bool isInputChannelStereoPair (int index) const { return true; }
bool isOutputChannelStereoPair (int index) const { return true; }
bool acceptsMidi() const { return true; }
bool producesMidi() const { return true; }
//==============================================================================
int getNumPrograms() { return 1; }
int getCurrentProgram() { return 0; }
void setCurrentProgram (int index) {}
const String getProgramName (int index) { return String::empty; }
void changeProgramName (int index, const String& newName) {}
//==============================================================================
void getStateInformation (MemoryBlock& destData) {}
void setStateInformation (const void* data, int sizeInBytes) {}
};
class AudioGraphTest
{
public:
AudioGraphTest()
: graph( new AudioProcessorGraph )
{
graph->setPlayConfigDetails( 2,
2,
44100, 1024 );
graph->prepareToPlay( 44100, 1024 );
}
~AudioGraphTest() {}
void populateGraph( int numNodes, int numConnections, int64 seed )
{
Random rand( seed );
int src = 0, dst = 0;
/** Generate nodes */
++numNodes;
for( int i = numNodes; --i >= 1; )
{
graph->addNode( new BlankProcessor, i );
}
/** Add some connections */
for( int i = numConnections; --i >= 0; )
{
src = rand.nextInt( numNodes );
dst = rand.nextInt( numNodes );
/** Check that we don't add a connection to the same node */
while( src == dst )
dst = rand.nextInt( numNodes );
graph->addConnection( src, 0, dst, 0 );
graph->addConnection( src, 1, dst, 1 );
}
/** Make sure all connections are valid */
graph->removeIllegalConnections();
}
void reset()
{
graph->clear();
}
void prepareToPlay ()
{
graph->prepareToPlay (44100, 1024);
}
private:
ScopedPointer<AudioProcessorGraph> graph;
};
//------------------------------------------------------------------------------
template <class T>
struct Stats
{
Stats (int count, const T* data)
{
n = count;
lo = data[0];
hi = data[0];
sum = 0;
for (int i = 0; i < count; ++i)
{
T v = data[i];
lo = std::min (lo, v);
hi = std::max (lo, v);
sum += v;
}
mean = double (sum) / count;
double totvarsquared = 0;
for (int i = 0; i < count; ++i)
{
T v = data[i];
const double var = v - mean;
totvarsquared += var * var;
}
dev = sqrt (totvarsquared / (n - 1));
}
int n;
double dev;
double mean;
T lo;
T hi;
T sum;
};
//------------------------------------------------------------------------------
static void print (const String& s)
{
std::wcout << s << std::endl;
}
static void run (void (*test)(), const char* name)
{
const int ntrial = 5;
int64 trial[ntrial];
int64 total;
Time startTime = Time::getCurrentTime();
for (int i = 0; i < ntrial; ++i)
{
Time startTime = Time::getCurrentTime();
test ();
RelativeTime elapsedTime = Time::getCurrentTime() - startTime;
trial [i] = elapsedTime.inMilliseconds ();
int64 ms = elapsedTime.inMilliseconds ();
String s;
s << name << " #" << String (i+1) << " " <<
String (ms) << " ms";
print (s);
}
RelativeTime elapsedTime = Time::getCurrentTime() - startTime;
total = elapsedTime.inMilliseconds ();
Stats<int64> stats (ntrial, trial);
String s;
int64 ms = elapsedTime.inMilliseconds ();
s << name << " " <<
String (stats.sum) << " total, " <<
String (stats.mean, 0) << " avg, " <<
String (stats.dev, 3) << " dev";
print (s);
}
//------------------------------------------------------------------------------
void test1 ()
{
AudioGraphTest agt;
agt.populateGraph (4096, 65536, 1);
agt.prepareToPlay ();
}
int main (int, char**)
{
ScopedJuceInitialiser_GUI jucelib;
run (test1, "AudioGraphTest");
return 0;
}
Honestly for the amount of nodes and connections I have (which is HUGE) this does not seem slow.
An obvious optimization would be a constant-time implementation for getNodeForId(), either with a hash map or by changing nodeIds to be small integers that can be used as array indexes (although this might not be possible for client-provided nodeIds). Such a change would almost double the speed.
I think you should check the state of the graph before you run the test Vinn, would be good to know how many connections and nodes are definitely there. Might need to tweak the āadding connections codeā if most of them are not getting added correctly.
Interesting with the getNodeForID method, definitely room for tweaking.
If you look at my first post you will see that I provided a configuration with 23 nodes and 43 connections and under this configuration it takes 8 seconds to to manipulate 1 connection.