Optimisations for RenderingOpsSequenceCalculator


#1

Hi Jules and everyone,  I've attached a version of the AudioProcessorGraph source file with a new method for sorting nodes and made modifications to the calculator code accordingly.  On my machine (i7-3770k) I can process around 5000 nodes with about 10,000 connections in roughly 35ms.  With my graph arrangement (basic DAW layout) that roughly equates to about 350 tracks with 6 inserts per.  It's much faster than I need it to be. 

I've attached the code for anyone that would like to give it a test.  If anyone has any feedback on the approach or C++'ness of the code it would be greatly appreciated.

Jules if you're interested in using this code in the library, or parts of it or whatever, please feel free.

Notables:

  • I'm not a performance guru.  Some information is cached and the sorting routine is reasonable.  No fancy tricks.
  • Tested fairly well with a number of branching configurations.  Latencies, detecting feedback loops (sorry still not supported), comparisons of stock calculator statistics.  Generally produces the same output as the stock graph.  It has a strict sorting routine so there are times when it produces very slightly leaner output.  It does a bit worse if you have a node with no connections as it adds them at the beginning vs typically the end with the stock graph.  That said, variances on either side appear to be negligible.
  • No changes to external apis or structures
  • New class GraphMap replaces the ConnectionLookupTable
  • New classes MapNode and MapNodeConnection tie Node and Connection information together
  • RenderingOpsSequenceCalculator::createRenderingOpsForNode is modified to work with MapNodes and MapNodeConnections and now has an internal struct called ChannelBufferInfo to facilitate the buffer reuse list.
  • Feedback loops are still not supported

That's about all I can think of at the moment.  Please feel free to give it a try, you should be able to just drop it into your code base with no changes required.

 

 


#2

Thanks for sharing! I'm crazy-busy right now so don't have time to sanity-check it, but will make a note to do so when I get chance!


#3

Maybe somebody wants to have a look on my Multithreaded version, its working in a special case productive environment (but it has no midi-support yet, and no feedback detection)

I don't give up that someday something like this will be in the went into the official juce repo :-) 

I thought a while about the Sequence Ordering, but finally it makes no sense (in most cases) for multithread situations, for feedback detection of

cause its needed.

Link:

http://www.juce.com/forum/topic/multithreaded-audioprocessorgraph-source-code

 


#4

Looks good! Had to fix up some VS warnings in there, and cleared up spurious white-space/trimmed line-ends.

 


#5

Thanks!  That's great to hear and thanks for addressing the warnings and tidying it up!

I noticed in this new version that there is an out of order initialization warning for ChannelBufferInfo so I've updated that.  I've also tweaked the MapNodeConnection ownership so that source and destination nodes both point to the same connection instance.  It doesn't seem to have much performance effect on my machine but no point in double allocating it.  As a result the GraphMap now owns all the MapNodeConnections and the MapNodes just use regular arrays.

The latest version is attached.

 


#6

You're very welcome Jules.  I'm glad to help out where I can.  If you have any questions about the code or would like to see some things changed please feel free to get in touch!


#7

A couple extra things to mention

  • There appears to be a potential bug in stock graph code regarding the totalLatency value. (Note it is *not* fixed in the this new version)  It considers the last node to be processed with zero outputs to be the determining total latency factor but this is not gauranteed to be the graph's audio output node.  If a graph has separate paths that run to different end points this value has the potential to be reported incorrectly.
  • Rail Jon Rogut helped me test my initial optimisation attempts ( thanks Rail btw! ) and he mentioned that when testing with large node numbers the duplicate processor check in addNode can slow things down.  As such it may be worth providing an option in addNode to bypass the duplicate check for people who have strict graph layouts.  Or perhaps a separate "addNodeUnchecked" method.  I don't have this issue myself as I run a modified version of the graph which has a number of changes including the removal of the duplicate check.

#8

Right,

I had suggested:

AudioProcessorGraph::Node* AudioProcessorGraph::addNode (AudioProcessor* const newProcessor, uint32 nodeId, bool checkForDuplicate /* = true */)
{
    if (newProcessor == nullptr || newProcessor == this)
    {
        jassertfalse;
        return nullptr;
    }

    if (checkForDuplicate)
        {
        for (int i = nodes.size(); --i >= 0;)
            {
            if (nodes.getUnchecked(i)->getProcessor() == newProcessor)
                {
                jassertfalse; // Cannot add the same object to the graph twice!
                return nullptr;
                }
            }
        }

with the third parameter which is set to default to true to keep backward compatibility. In my usage there’s no chance of a duplicate, so the check is wasted processing.

Thanks,

Rail


#9

I’m getting a similar assertion as in your very first version from the original thread…

                    //this seems to hit if a feedback loop was detected in the sorting routine
                    jassert(sourceBuffer != nullptr);

It thinks there’s a feedback loop when there isn’t.

Performance is great, the only bottleneck is disconnectNode() where it does a linear search.

Cheers,

Rail


#10

Thanks Rail, it turns it's an incorrect assertion so I've removed it.  I'll post the updated code to the end of the main thread.  disconnectNode is outside of the scope of the work I've done so I can't help much there.  Although based on the information you sent me I think it would be better to call "clear" on the graph in that scenario.


#11

Latest version of the code with a minor change to remove an incorrect assertion.  File is attached.


#12

It would be nice to see this in a fork to monitor progress and list bugs (using GitHub's Issues); it would be a perfect opportunity for those interested to provide direct feedback and patches. Doing so will also avoid any copy/pasta and attachments. :)


#13

Occurs right on Plugin Host's start-up, with default/original nodes. Index is always the same on start-up.

See attached image.


#14

hmm, I'm not seeing that here.  Maybe a silly question but has the member initialized by the time your breakpoint has hit?  (ie.  Is the value of midiBufferToUse_ negative?)


#15

The breakpoint is set at line 213. "midiBufferToUse_" is indeed negative.


#16

Ok, I've noticed that I missed wrapping an else condition in getFreeBuffer, although looking at the code it looks like it should still operate effectively.  Regardless, I'll fix that as it's ugly if nothing else.  (btw yes I agree with your other post regarding github, this attaching updates to posts is a bit hokey)

If that's not the problem, then it's pretty straightforward to debug.  Each call to createRenderOpsForNode ends with new'ing a ProcessBufferOp.  If you put a breakpoint at line 817 on startup with no connections it should always go into the first "if" condition to get the midiBufferToUse:

       /*line 817*/ if (midiSourceConnections.size() == 0)

        {

            // No midi inputs..

            midiBufferToUse = getFreeBuffer (true); // need to pick a buffer even if the processor doesn't use midi


            if (mapNode->getNode()->getProcessor()->acceptsMidi() || mapNode->getNode()->getProcessor()->producesMidi())

                renderingOps.add (new ClearMidiBufferOp (midiBufferToUse->index));

        }

 


#17

whoops, looks like *you* removed the else wrapper.  I suppose it was warning about no return value?  Anyway, I'll leave it as is!


#18

btw, I'll take back the "ugly" remark, your change is better than it was.  Although I'd rather have two methods for that, getFreeAudioBuffer and getFreeMidiBuffer... anyway, probably not important.


#19
  • Fix:  Added an additional check to ensure when adding destination nodes that the node isn't already in the "nodesToSort" list (renamed from startNodes to nodesToSort).  This bug can cause audio rendering issues and anyone using this code should definitely update.
  • Clean up: Removed an unecessary if
  • Clean up: Change hard coded feedback limit check to be const class member and upped limit to 50,000
  • Additional debugging Info: Added a some additional sanity asserts to the sort routine and additional comments

Latest version is attached.


#20

Hey Graeme,

This version seems to be working well here….

Adding Nodes and Connections is very fast here now.

I’ll have to figure out a way to improve the removeNode()/disconnectNode() methods 'cause now deleting a track is very slow in the original code.

Thanks!

Rail