FloatVectorOperations crash

AaronLeese · September 14, 2013, 7:07pm

Hey Jules -

I love the new FloatVectorOperations class. I've been adding them everywhere I can lately.

However, I've just noticed that in very particular circumstances they break.

Specifically - Win32 Release mode plugins.

Seriously - they operate fine on Mac, on 64-bit Windows, in debug mode, and in any standalone program ... but not in Release mode, 32-bit, Windows plugins.

You can replicate this just by doing an AudioSampleBuffer::addFrom() in the processBlock of the Juce plugin example (and the FloatVectorOperations::add inside there craps out). copyFrom works fine (FloatVectorOperations::copy seems ok).

It's just FloatVectorOperations::add and FloatVectorOperations::multiply that have issues, so really I think it must be something about the _mm_add_ps call in:


  JUCE_PERFORM_SSE_OP_SRC_DEST (dest[i] += src[i],
                                  _mm_add_ps (d, s),
                                  JUCE_LOAD_SRC_DEST, JUCE_INCREMENT_SRC_DEST)

Just for reference ... here is the code I'm using in the plugin example:


void JuceDemoPluginAudioProcessor::processBlock (AudioSampleBuffer& buffer, MidiBuffer& midiMessages)
{
    const int numSamples = buffer.getNumSamples();

    // Go through the incoming data, and apply our gain to it...
    for (int channel = 0; channel < getNumInputChannels(); ++channel)
        buffer.applyGain (channel, 0, numSamples, gain);

    // Scratch buffer used only to trigger addFrom() (its contents are never filled in)
    AudioSampleBuffer temp (buffer.getNumChannels(), numSamples);

    // CAUSES a CRASH .... in win32 release mode
    for (int chan = 0; chan < buffer.getNumChannels(); ++chan)
        buffer.addFrom (chan, 0, temp.getSampleData (chan), numSamples);
}

Ok - take a look when you get a minute.

jules · September 15, 2013, 6:06pm

Presumably this could only be because the host is changing some kind of CPU floating point mode setting that breaks some SSE operations... My knowledge of floating point modes is pretty shallow - anybody know what this might be?
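One way to test the "host changed the floating point mode" theory would be to read the SSE control/status register (MXCSR) at the top of processBlock and compare it with the power-on default of 0x1F80. The following is only a minimal diagnostic sketch: `SseModeInfo`, `decodeMxcsr` and `readCurrentMxcsr` are hypothetical names invented here and are not part of JUCE. The bit positions are the documented MXCSR layout.

```cpp
#include <cassert>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
 #include <xmmintrin.h>
 #define SKETCH_HAS_SSE 1
#else
 #define SKETCH_HAS_SSE 0
#endif

// Hypothetical struct (not a JUCE type): the MXCSR fields a host might change.
struct SseModeInfo
{
    bool flushToZero;          // FTZ, bit 15
    bool denormalsAreZero;     // DAZ, bit 6
    bool allExceptionsMasked;  // bits 7..12 all set
    unsigned roundingMode;     // bits 13..14 (0 = round to nearest)
};

// Decodes a raw MXCSR value into its mode fields.
inline SseModeInfo decodeMxcsr (unsigned csr)
{
    SseModeInfo info;
    info.flushToZero         = (csr & 0x8000u) != 0;
    info.denormalsAreZero    = (csr & 0x0040u) != 0;
    info.allExceptionsMasked = (csr & 0x1F80u) == 0x1F80u;
    info.roundingMode        = (csr >> 13) & 3u;
    return info;
}

// Reads the live register; returns false when SSE is not available.
inline bool readCurrentMxcsr (unsigned& csr)
{
#if SKETCH_HAS_SSE
    csr = _mm_getcsr();
    return true;
#else
    (void) csr;
    return false;
#endif
}
```

Logging the decoded value in a crashing host and in a standalone build would show whether the two environments really differ. An unmasked exception bit would be the most suspicious finding.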

jpo · September 16, 2013, 7:40am

probably not related , but I see that _mm_empty() is used. It is only to be used when you're mixing mmx instructions with fpu instructions -- Since you are requiring SSE2 instructions , mmx stuff is not used (it was only relevant before SSE2) and there is no point in using _mm_empty().

jules · September 16, 2013, 8:04am

Ah! I didn't know that - thanks, I'll remove it!

AaronLeese · October 23, 2013, 11:26pm

Jules -

I just checked in on this one and it still seems to be an issue.

This is no trivial matter, as I'm sure most plugins out there use addFrom() or clear() even if they don't use FloatVectorOperations::multiply directly.

It should be fairly easy to fix though (just putting a preprocessor switch in there that uses an iterator for any win32 plugin should do the trick, right?).

jules · October 28, 2013, 9:24am

I can't just treat the symptoms like that without actually knowing what's causing it.

Could it be because you're turning on some of the more extreme floating-point optimisation modes in the compiler? I know that if you start enabling the non-IEEE floating point compiler flags in MSVC it can lead to some pretty strange and unexpected bugs.

AaronLeese · October 28, 2013, 10:13pm

Doubtful. You can replicate the issue by adding an addFrom() to the juce plugin example (and compiling for win32, Release mode).

I'll go through and try it with some changes to the settings though. Since it happens in release and not debug, it certainly could be an optimization setting.

jules · October 31, 2013, 9:29am

Sorry, I've not looked at this yet..

But which host are you using? Have you tried in something like the juce demo host, where we can be sure that it's not mucking-around with the CPU mode flags?

AaronLeese · November 5, 2013, 6:59pm

Yup. Got it.

Looks like turning off whole program optimization (VS2010 -> Project Properties -> C/C++ -> Optimization ) does the trick.

It has something to do with the alignment requirements of SSE intrinsics.

Take a look at this:

http://stackoverflow.com/questions/12502071/sse-intrinsics-and-alignment

You may want to turn off the optimization on the plugin demo code too.

jules · November 6, 2013, 10:04am

Hmm.. I'm a little dubious about alignment being the issue - the way the SSE stuff is written, it'll be faster with aligned memory, but will work just fine with non-aligned data too. The only other thing that could be misaligned are the local __m128 variables, but that data-type is hard-wired to be 16-byte aligned, so even if there was a compiler bug that was messing it up, then there's nothing much we could do to the code to fix it.
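To illustrate the point about non-aligned data: a vector add can be written so that the sample buffers need no particular alignment, by using the unaligned load/store intrinsics. This is a minimal sketch under that assumption and not the actual JUCE macro expansion. `addUnaligned` is a hypothetical name, and it always takes the unaligned path instead of choosing a faster aligned one when it can.

```cpp
#include <cassert>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
 #include <xmmintrin.h>
 #define SKETCH_HAS_SSE 1
#else
 #define SKETCH_HAS_SSE 0
#endif

// Hypothetical helper, not JUCE's implementation: dest[i] += src[i].
inline void addUnaligned (float* dest, const float* src, int num)
{
    assert (num >= 0);
    int i = 0;

#if SKETCH_HAS_SSE
    // _mm_loadu_ps / _mm_storeu_ps accept any address, so neither buffer
    // has to be 16-byte aligned.
    for (; i + 4 <= num; i += 4)
    {
        const __m128 d = _mm_loadu_ps (dest + i);
        const __m128 s = _mm_loadu_ps (src + i);
        _mm_storeu_ps (dest + i, _mm_add_ps (d, s));
    }
#endif

    // Scalar tail (and the whole loop when SSE is unavailable).
    for (; i < num; ++i)
        dest[i] += src[i];
}
```

Note that the locals `d` and `s` are still `__m128` values. If the compiler spills them to the stack it will assume 16-byte aligned stack slots, which is the one place where alignment remains outside the code's control.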

Which version of VS are you using? It could actually be that this is a compiler bug which they've fixed.

AaronLeese · November 6, 2013, 3:23pm

VS 2010

I'll check for some updates.

And yeah, if it were alignment, you would also expect it to break on standalones (and not just the plugins).

It works fine with 64 bit plugins too.

Only an issue for 32 bit Release plugins with whole program optimization enabled .... very strange.

jules · November 6, 2013, 3:48pm

Hmm.

I wonder if the problem is because when a host makes a call into the plugin's process function, the *host* could sometimes do so with the stack in an unaligned state. Normally, there'll be code inside functions which checks the stack alignment and corrects it, but with WPO, the optimiser may have decided that these functions can only ever be called from places where the stack pointer is already aligned, so there's no need to bother checking it.. but obviously that wouldn't take into account the fact that it can also be called from an external app..

If that's the case, then it might be possible to fix by adding some kind of compiler-specific hack in the audio processing base method to force it to always fix the stack alignment before doing the plugin processing.

AaronLeese · November 8, 2013, 7:49pm

Woah ... deep shit man. Sounds possible though.

Happy to test it out if you have some tweaks you want to try.

jules · November 9, 2013, 1:03pm

TBH I have no idea how to force the compiler to do that kind of thing. Like you say, this is seriously deep compiler stuff, and I've never dug that deeply into how stack alignment works.
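For comparison only: GCC and Clang have a function attribute, `force_align_arg_pointer`, that makes a 32-bit x86 function realign the stack in its prologue no matter how the caller left it. The sketch below shows the shape of the "fix the alignment at the entry point" idea with that attribute. It is not an MSVC solution and has not been tried against the crash in this thread. `SKETCH_FORCE_STACK_ALIGN`, `hostEntryPoint` and `processInternal` are hypothetical names.

```cpp
#include <cassert>

// Only meaningful on 32-bit x86 with GCC/Clang; expands to nothing elsewhere.
#if (defined(__GNUC__) || defined(__clang__)) && defined(__i386__)
 #define SKETCH_FORCE_STACK_ALIGN __attribute__ ((force_align_arg_pointer))
#else
 #define SKETCH_FORCE_STACK_ALIGN
#endif

// Hypothetical stand-in for the plugin's real processing code, which may be
// compiled on the assumption that the stack is already 16-byte aligned.
inline void processInternal (float* samples, int numSamples, float gain)
{
    for (int i = 0; i < numSamples; ++i)
        samples[i] *= gain;
}

// Hypothetical entry point called by the host. The attribute asks the
// compiler to realign the stack here before anything else runs.
SKETCH_FORCE_STACK_ALIGN
void hostEntryPoint (float* samples, int numSamples, float gain)
{
    assert (samples != nullptr || numSamples == 0);
    processInternal (samples, numSamples, gain);
}
```

The design choice is to pay for the realignment once, at the boundary the host calls into, so that everything below it can keep assuming an aligned stack.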
