SIMD Register size

nammick · June 2, 2018, 8:36pm

Is there any way of checking programmatically whether the SIMD register size is 128 bits, or whether 256-bit AVX is supported?

jonathonracz · June 3, 2018, 1:03am

Use SystemStats::hasAVX() (or any similar methods) to check for CPU features at runtime.

https://docs.juce.com/master/classSystemStats.html#ac8b5ff1c9505f12bca684fce44f514b1
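
For later readers, here is a minimal sketch of the same kind of runtime check written without JUCE so that it stands alone. It assumes GCC or Clang on x86 and uses the compiler builtin `__builtin_cpu_supports`; the helper name `simdRegisterBits` is hypothetical and is not part of JUCE. Inside a JUCE project, `SystemStats::hasAVX()` and `SystemStats::hasSSE2()` answer the same question.

```cpp
// Sketch only: reports the widest float SIMD register the running CPU offers.
// Assumes GCC/Clang on x86; simdRegisterBits() is a hypothetical helper name.
int simdRegisterBits()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                          // harmless if already initialised
    if (__builtin_cpu_supports("avx"))  return 256; // 8 floats per register
    if (__builtin_cpu_supports("sse2")) return 128; // 4 floats per register
    return 0;                                      // neither reported
#else
    return 0;                                      // unknown on this compiler or CPU
#endif
}
```

A return value of 0 means "unknown" here, so callers should fall back to scalar code in that case.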

For compile-time checking, there’s no standard cross-compiler, cross-platform way of checking CPU features that I’m aware of, and I don’t think JUCE has one either.
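
Although nothing is standardised, the common compilers do predefine macros for the instruction sets they were told to target, so a sketch along these lines is possible. It assumes the usual `__AVX__`, `__SSE2__`, `_M_X64`, `_M_IX86_FP` and `__ARM_NEON` macros; the names `kSimdBits` and `kFloatsPerRegister` are hypothetical. Note that this reports what the binary was built for, not what the machine running it supports.

```cpp
// Sketch only: picks a float vector width from macros the compiler predefines
// for its target settings. kSimdBits and kFloatsPerRegister are hypothetical names.
#if defined(__AVX__)
constexpr int kSimdBits = 256;   // e.g. GCC/Clang with -mavx, MSVC with /arch:AVX
#elif defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
constexpr int kSimdBits = 128;   // SSE2 is the baseline on x86-64
#elif defined(__ARM_NEON)
constexpr int kSimdBits = 128;   // NEON as reported by GCC/Clang
#else
constexpr int kSimdBits = 0;     // no vector unit recognised by this sketch
#endif

constexpr int kFloatsPerRegister = kSimdBits / 32;
```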

nammick · June 3, 2018, 5:10am

Thanks, I was hoping for a method like that. I'm not bothered about compile time. I just wanted a way of dividing up work evenly for y filters across x cores…

tytel · June 3, 2018, 5:39pm

SIMD is not about number of cores. It’s a Single Instruction that runs on Multiple Data. Since it’s a single instruction it runs on a single core.

nammick · June 3, 2018, 8:33pm

Very aware of that… But I want to run essentially 64 × 512-tap FIR filters across 8 cores and parallelise the main multiply-and-sum of each FIR. If I tried this on the audio thread alone it would keel over, hence I'm trying to find a solution…

See my response below for my current benchmarks
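
As a rough illustration of dividing y filters evenly across x workers, here is a minimal sketch. `FilterRange` and `splitFilters` are hypothetical names, and the sketch only computes index ranges; how the workers are started and synchronised with the audio thread is left open.

```cpp
#include <cstddef>
#include <vector>

// Half-open range [begin, end) of filter indices assigned to one worker.
struct FilterRange { std::size_t begin; std::size_t end; };

// Splits numFilters as evenly as possible over numWorkers. The first
// (numFilters % numWorkers) workers receive one extra filter each.
std::vector<FilterRange> splitFilters(std::size_t numFilters, std::size_t numWorkers)
{
    std::vector<FilterRange> ranges;
    if (numWorkers == 0) return ranges;           // nothing to assign to

    const std::size_t base  = numFilters / numWorkers;
    const std::size_t extra = numFilters % numWorkers;

    std::size_t begin = 0;
    for (std::size_t w = 0; w < numWorkers; ++w)
    {
        const std::size_t count = base + (w < extra ? 1 : 0);
        ranges.push_back({ begin, begin + count });
        begin += count;
    }
    return ranges;
}
```

With 64 filters and 8 workers each range holds 8 filters, which also happens to be one AVX register of floats. `std::thread::hardware_concurrency()` can serve as a hint for the worker count, though it is allowed to return 0.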

nammick · June 3, 2018, 8:36pm

I have created this FIR class in which I interleave the tap coefficients… This works well on my MacBook Pro and i7 desktop. Will this optimisation always work if the computer has AVX? The example will optimise well if size (the number of FIR filters) is a multiple of 4…

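class SIMDFir {
public:

int numTaps;        // taps per filter
int numDomains = 0; // currently unused
int size;           // number of interleaved filters (floats per frame)
int width;          // numTaps * size: total floats in the interleaved tap array

float * taps;       // interleaved taps, not owned: taps[ t * size + i ]

SIMDFir(const int _size, float * _interleavedTaps, const int _numTaps) {
    
    numTaps = _numTaps;
    size = _size;
    taps = _interleavedTaps;
    width = numTaps * size;
}
~SIMDFir() {}

// interleavedIn must hold ( numSamples + numTaps - 1 ) frames of size floats,
// because each output frame reads numTaps input frames starting at frame s.
// interleavedOut must be zeroed by the caller: results are accumulated into it.
inline void process(const float * interleavedIn, float * interleavedOut, const int numSamples) {

    for( int s = 0; s < numSamples; s++ ) {
        
        int sampleOffset = ( size * s );
        
        const float * inSamples = &interleavedIn[ sampleOffset ];
        float * outSamples = &interleavedOut[ sampleOffset ];
            
        // t walks the taps one frame (size floats) at a time
        for( int t = 0; t < width; t = t + size ) {

            #pragma simd
            for( int i = 0; i < size; i++ ) {
                
                // hopefully get speed boost here as mem access should be unit-stride
                // and each frame stays 16-byte aligned if size is a multiple of 4
                // and the buffers themselves are aligned
                // pragma simd is an Intel compiler hint; other compilers may ignore it
                
                outSamples[ i ] += inSamples[ t + i ] * taps[ t + i ];
            }
        }
    }
}

};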
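
For anyone trying to feed the class above, this is a minimal sketch of the interleaved layout it appears to expect, together with a plain-loop version of the same arithmetic that can be used as a reference. `interleave` and `processInterleaved` are hypothetical helper names, and the layout (element k of filter f stored at index k * numFilters + f) is inferred from the loop in `process`.

```cpp
#include <cstddef>
#include <vector>

// Interleaves per-filter arrays so that element k of filter f lands at
// index k * numFilters + f. All inner vectors are assumed to be equally long.
std::vector<float> interleave(const std::vector<std::vector<float>>& perFilter)
{
    const std::size_t numFilters = perFilter.size();
    const std::size_t length = numFilters ? perFilter[0].size() : 0;

    std::vector<float> out(numFilters * length);
    for (std::size_t f = 0; f < numFilters; ++f)
        for (std::size_t k = 0; k < length; ++k)
            out[k * numFilters + f] = perFilter[f][k];
    return out;
}

// Same loop nest as SIMDFir::process, written with explicit indices:
//   out[s][f] += in[s + t][f] * taps[t][f]
// `in` must hold numSamples + numTaps - 1 frames; `out` must be zeroed first.
void processInterleaved(const float* in, const float* taps, float* out,
                        int numFilters, int numTaps, int numSamples)
{
    for (int s = 0; s < numSamples; ++s)
        for (int t = 0; t < numTaps; ++t)
            for (int f = 0; f < numFilters; ++f)   // unit-stride inner loop
                out[s * numFilters + f] +=
                    in[(s + t) * numFilters + f] * taps[t * numFilters + f];
}
```

Note that the loop reads forward from frame s, so tap 0 multiplies the current frame and the last tap multiplies the frame numTaps - 1 ahead of it.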

For a file with 4.5 seconds of audio
1 FIR filters takes 206107 ns
2 FIR filters takes 219708 ns
4 FIR filters takes 218369 ns
8 FIR filters takes 297672 ns

So it is clearly getting vectorised: running 8 filters takes only about 1.4 times as long as running 1, which is a big gain in throughput.
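
The post does not say how these timings were taken. For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of a timing helper, assuming `std::chrono::steady_clock` and taking the fastest of several runs to reduce noise. `fastestRunNs` is a hypothetical name.

```cpp
#include <chrono>
#include <cstdint>

// Runs work() `repeats` times and returns the fastest run in nanoseconds.
// Returns -1 if repeats is zero or negative.
template <typename Work>
std::int64_t fastestRunNs(Work&& work, int repeats)
{
    using Clock = std::chrono::steady_clock;

    std::int64_t best = -1;
    for (int r = 0; r < repeats; ++r)
    {
        const auto start = Clock::now();
        work();
        const std::int64_t ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(Clock::now() - start).count();

        if (best < 0 || ns < best)
            best = ns;
    }
    return best;
}
```

The output buffer should be re-zeroed inside `work` before each run if the code under test accumulates into it, otherwise later runs operate on different data.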

tytel · June 3, 2018, 8:47pm

If you compile with SSE2 code generation enabled, then it will work on machines with AVX. If you compile with AVX code generation, it won’t work unless the computer has AVX.

You can have SSE2 and AVX code in the same binary, but I believe there's a costly hit when execution switches back and forth between the two. I haven't done this myself, so I'm not sure.
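
One way to keep both code paths in a single binary is to build only selected functions with AVX enabled and choose between them once at runtime. The following is a minimal sketch of that idea, assuming GCC or Clang on x86 (it uses the `target` function attribute and `__builtin_cpu_supports`); with MSVC the usual approach is a separate source file compiled with `/arch:AVX` instead. All function names here are hypothetical. As far as I know, the switching penalty mentioned above arises when legacy SSE instructions run while the upper halves of the 256-bit registers are in use, and compilers normally emit `vzeroupper` at the end of AVX functions to avoid it.

```cpp
#include <cstddef>

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_HAS_AVX_PATH 1
// Only this function is compiled with AVX code generation enabled.
__attribute__((target("avx")))
static void applyGainAvx(float* data, std::size_t n, float gain)
{
    for (std::size_t i = 0; i < n; ++i)
        data[i] *= gain;
}
#endif

// Compiled with the translation unit's default settings (for example SSE2).
static void applyGainBaseline(float* data, std::size_t n, float gain)
{
    for (std::size_t i = 0; i < n; ++i)
        data[i] *= gain;
}

using GainFn = void (*)(float*, std::size_t, float);

// Picks an implementation based on what the running CPU reports.
// Intended to be called once, with the result cached by the caller.
GainFn chooseApplyGain()
{
#ifdef SKETCH_HAS_AVX_PATH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return applyGainAvx;
#endif
    return applyGainBaseline;
}
```

Dispatching once per block of audio, rather than per sample, keeps the cost of the indirect call negligible.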

Also, why not use the juce::dsp::SIMDRegister class? It will automatically use SSE2, AVX2, or NEON depending on your compile settings.

nammick · June 3, 2018, 8:51pm

I’m a stickler for knowing exactly what’s going on under the hood and I don’t quite get how that class works. I have been playing around with it for the last couple of days but can’t wield it well enough yet. It’s a shortcoming on my part.

On the flip side though I did want to learn how to code in a way that takes advantage of this kind of optimisation and it’s much clearer now that I have managed to refactor some of my own DSP classes.
