FloatVectorOperations

jules · February 25, 2013, 2:43pm

Just added a new class: FloatVectorOperations

I needed some accelerated primitives for Tracktion, so I wrote a class to wrap up some simple SSE2 ops, and used it to speed up various bits of audio buffer code. Testing + suggestions welcome!

Exponential_Audo · February 25, 2013, 2:53pm

Thanks Jules! I just noticed those had been checked in. I wonder if you might add a routine to fill a vector with a constant. Lots of times we’ve got to fight denormalization errors, and we’ll typically clear buffers by setting them to a small constant (inaudible) value. That often keeps plugins from pigging out on CPU time.

But it looks like a helpful class. I’m sure you have many suggestions for additions.

jrlanglois · February 25, 2013, 3:01pm

Too excited to try this out!

Just a note, it doesn’t build in 64-bit:

(Plus an x64 warning)

jules · February 25, 2013, 3:07pm

Good idea - I could use that myself, actually. What value do you use? I was thinking of trying alternating small +ve/-ve values, so it remains centred at 0.

Exponential_Audo · February 25, 2013, 3:33pm

[quote=“jules”]Good idea - I could use that myself, actually. What value do you use? I was thinking of trying alternating small +ve/-ve values, so it remains centred at 0.[/quote]
Much of the time I use 0.000000045f, but there’s nothing unique about it. I’d suggest simply making the value an argument to the function. I’d avoid alternating values because they do introduce a frequency component. It’s very rare that these constants would actually add up to enough DC to be problematic, but that depends on the algorithm. Any sort of IIR operation can easily do a sign flip if it matters.

EDIT: I’d also request a function to add a constant to a vector. Lots of times you don’t know if the vector has any values or not, so you’d do this as an occasional safety measure.

jules · February 25, 2013, 4:25pm

Thanks - fixed the 64-bit stuff, and added a fill function now…

OBO · February 25, 2013, 7:59pm

Nice addition, but what about ARM/NEON?

I have been looking for good, simple-to-include SIMD wrapper libraries, and this one is superb:

otristan · February 26, 2013, 11:34am

IMHO on OSX and IOS, you should implement those using vDSP when there is an equivalent

copy --> vDSP_mmov
copyWithMultiply --> vDSP_vsmul
add --> vDSP_vsadd and vDSP_vadd
multiply --> vDSP_vsmul and vDSP_vmul
clear --> vDSP_vclr

jules · February 26, 2013, 2:08pm

[quote=“otristan”]IMHO on OSX and IOS, you should implement those using vDSP when there is an equivalent

copy → vDSP_mmov
copyWithMultiply → vDSP_vsmul
add → vDSP_vsadd and vDSP_vadd
multiply → vDSP_vsmul and vDSP_vmul
clear → vDSP_vclr[/quote]

Yes, that was on my to-do list. Will be there soon…

IvanC · February 26, 2013, 4:11pm

Very good idea !

CPB · February 26, 2013, 5:11pm

Anyone else having problems building an RTAS plug-in on Windows since these were added? When building the JuceDemoPlugin with Visual Studio 2010 and the Pro Tools 8 SDK, I get the following linker errors:

juce_RTAS_Wrapper.obj : error LNK2001: unresolved external symbol "public: static void __stdcall juce::FloatVectorOperations::copy(float *,float const *,int)" (?copy@FloatVectorOperations@juce@@SGXPAMPBMH@Z)
juce_RTAS_Wrapper.obj : error LNK2001: unresolved external symbol "public: static void __stdcall juce::FloatVectorOperations::clear(float *,int)" (?clear@FloatVectorOperations@juce@@SGXPAMH@Z)

jules · February 26, 2013, 5:20pm

Try resaving your project with the introjucer - it may need to include the new source files.

CPB · February 26, 2013, 6:19pm

I did do that; no luck, I’m afraid. The juce_FloatVectorOperations.cpp is being included in juce_audio_basics.cpp, but the linker can’t seem to find it.

lkjb · February 26, 2013, 7:55pm

Regarding denormalization, you can get that for free when using SSE. If the FTZ (flush to zero) and DAZ (denormals are zero) bits are set in the SSE control register, the CPU won’t bother with denormals. One of those bits came with SSE2, not sure how this behaves with SSE.

I’m using the following code to avoid denormalisation in PitchedDelay:

#include <xmmintrin.h> // _mm_getcsr, _mm_setcsr

class ScopedSSECSR
{
public:
        ScopedSSECSR()
                : csr(_mm_getcsr())
        {
                // sets FTZ (0x8000) & DAZ (0x0040)
                _mm_setcsr(csr | 0x8040);
        }

        ~ScopedSSECSR()
        {
                // resets control register
                _mm_setcsr(csr);
        }
private:
        const unsigned int csr;
        JUCE_DECLARE_NON_COPYABLE_WITH_LEAK_DETECTOR(ScopedSSECSR);
};

...

void processBlock(AudioSampleBuffer& buffer, ...)
{
        ScopedSSECSR csr;
        // processing stuff
}
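To see the effect of the guard in isolation, here is a small self-contained sketch without the JUCE macro. `ScopedFlushDenormals` and `halveSmallestNormal` are hypothetical names made up for this illustration. It assumes an x86 build where float maths goes through SSE (for example any x86-64 target), a CPU that supports the DAZ bit, and no fast-math flags that would already enable flushing at startup.

```cpp
#include <cfloat>      // FLT_MIN
#include <xmmintrin.h> // _mm_getcsr, _mm_setcsr

// Illustrative stand-in for ScopedSSECSR above: saves MXCSR, sets the
// FTZ (0x8000) and DAZ (0x0040) bits, and restores the saved value on exit.
class ScopedFlushDenormals
{
public:
    ScopedFlushDenormals() : savedCsr (_mm_getcsr())
    {
        _mm_setcsr (savedCsr | 0x8040u);
    }

    ~ScopedFlushDenormals() { _mm_setcsr (savedCsr); }

    ScopedFlushDenormals (const ScopedFlushDenormals&) = delete;
    ScopedFlushDenormals& operator= (const ScopedFlushDenormals&) = delete;

private:
    const unsigned int savedCsr;
};

// Halves the smallest normal float, so the mathematically exact result
// is a denormal. The volatiles stop the compiler from folding the
// multiply at compile time, so it really runs under the current MXCSR.
inline float halveSmallestNormal()
{
    volatile float smallest = FLT_MIN;
    volatile float half = 0.5f;
    volatile float result = smallest * half;
    return result;
}
```

Calling `halveSmallestNormal()` while a `ScopedFlushDenormals` object is alive should give zero, whereas outside the guard it should give a tiny non-zero denormal.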

jules · February 26, 2013, 8:45pm

[quote=“lkjb”]Regarding denormalization, you can get that for free when using SSE. If the FTZ (flush to zero) and DAZ (denormals are zero) bits are set in the SSE control register, the CPU won’t bother with denormals. One of those bits came with SSE2, not sure how this behaves with SSE.

I’m using the following code to avoid denormalisation in PitchedDelay:

[code]
class ScopedSSECSR
{
public:
        ScopedSSECSR()
                : csr(_mm_getcsr())
        {
                // sets FTZ & DAZ
                _mm_setcsr(csr | 0x8040);
        }

        ~ScopedSSECSR()
        {
                // resets control register
                _mm_setcsr(csr);
        }
private:
        const unsigned int csr;
        JUCE_DECLARE_NON_COPYABLE_WITH_LEAK_DETECTOR(ScopedSSECSR);
};

...

void processBlock(AudioSampleBuffer& buffer, ...)
{
        ScopedSSECSR csr;
        // processing stuff
}
[/code][/quote]

That’s a nice trick. So any denormalised numbers in the input data always get flushed to zero?

Exponential_Audo · February 26, 2013, 8:57pm

[quote=“jules”][quote=“lkjb”]If the FTZ (flush to zero) and DAZ (denormals are zero) bits are set in the SSE control register, the CPU won’t bother with denormals. One of those bits came with SSE2, not sure how this behaves with SSE.
[/quote][/quote]

I was just scrounging around the Mac developer site and found this procedure recommended. It sounds like a dumb question, but do you know if that register is saved in the thread context? If I decided to turn it off for an extended series of operations, can I be certain it doesn’t get whacked if I’m swapped out of the CPU for a bit?

jules · February 26, 2013, 9:12pm

Good question. And if it does stay constant for a thread, then maybe a smart thing to do would be to just leave it permanently enabled for your audio thread?

jpo · February 26, 2013, 11:44pm

[quote] One of those bits came with SSE2, not sure how this behaves with SSE.
[/quote]
you get a crash when setting this bit.

Those bits are saved in the thread context, just like the x87 FPU state. As a consequence, they are also shared by all the plugins that happen to run on the same audio callback, which can have (not so) funny consequences when one of the plugins does nasty things with them.
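A rough way to check the per-thread behaviour yourself is sketched below. The helper names `flushToZeroEnabled` and `enableFlushToZeroOnWorker` are hypothetical, and the sketch assumes an x86 build with SSE and C++11 `std::thread`. It only touches the FTZ bit (0x8000), which is available on plain SSE.

```cpp
#include <thread>
#include <xmmintrin.h> // _mm_getcsr, _mm_setcsr

// Returns true if the calling thread's MXCSR has the FTZ bit (0x8000) set.
inline bool flushToZeroEnabled()
{
    return (_mm_getcsr() & 0x8000u) != 0;
}

// Sets FTZ inside a short-lived worker thread only, and returns what that
// worker observed after setting it. The calling thread's MXCSR is never
// written here.
inline bool enableFlushToZeroOnWorker()
{
    bool seenByWorker = false;

    std::thread worker ([&seenByWorker]
    {
        _mm_setcsr (_mm_getcsr() | 0x8000u);
        seenByWorker = flushToZeroEnabled();
    });

    worker.join();
    return seenByWorker;
}
```

If the register is part of the thread context, the worker should see the bit set while `flushToZeroEnabled()` on the calling thread reports the same value before and after the call. The flip side is that everything running on one thread, such as several plugins in the same audio callback, sees the same bits.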

Exponential_Audo · February 26, 2013, 11:58pm

So it sounds like you’re OK if you set it up when you enter your process and restore it when you leave.

Exponential_Audo · February 27, 2013, 4:14am

I’ve just tried to build a Windows RTAS plugin (an operation I’ve done many times). I had a link failure with missing symbols FloatVectorOperations::copy and FloatVectorOperations::clear. Is there something that should have been included in one of your modules?
