Someone else said the same thing, which is bizarre… Are you perhaps using an old MS compiler? The only possible thing I can think of is that the const-ness of the “num” parameter in the cpp file is triggering a compiler bug where it incorrectly differentiates const/non-const parameters. Does it work if you change the cpp like this?
void FloatVectorOperations::clear (float* dest, int num) noexcept
{
I’m using VS2008 (not that it gives me any joy; I’m stuck with it because of another tool I have to use). Removing the const didn’t accomplish anything. Could this be related to the fact that a number of the RTAS wrapper files must be declared stdcall instead of cdecl?
I also ran into a link issue with the call to the static method ‘AudioProcessor::setTypeOfNextNewPlugin’ in juce_PluginUtilities.cpp. It turned out to be caused by the ‘/Gz’ Visual C++ flag, which I was using to compile the RTAS wrapper code but not the rest of the code. Adding JUCE_CALLTYPE to the declaration of setTypeOfNextNewPlugin solved it. Maybe this is the same issue?
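To illustrate the kind of mismatch described above, here is a minimal sketch, not JUCE's actual code. DEMO_CALLTYPE and DemoProcessor are hypothetical names, and the assumption is that the fix works because a macro like JUCE_CALLTYPE pins an explicit calling convention on the declaration, so translation units built with and without /Gz agree on the symbol.

```cpp
#include <cassert>

// Hypothetical stand-in for a macro like JUCE_CALLTYPE (an assumption, not
// JUCE's definition): name the calling convention explicitly so that the
// compiler default (which /Gz switches to stdcall) no longer matters.
#if defined(_MSC_VER) && defined(_M_IX86)
 #define DEMO_CALLTYPE __stdcall
#else
 #define DEMO_CALLTYPE
#endif

// Hypothetical class standing in for AudioProcessor.
struct DemoProcessor
{
    enum WrapperType { wrapperUndefined = 0, wrapperRTAS = 1 };

    // Without DEMO_CALLTYPE, a file built with /Gz would look for a stdcall
    // symbol while a file built without it would emit a cdecl one.
    static void DEMO_CALLTYPE setTypeOfNextNewPlugin (WrapperType type) noexcept;
    static WrapperType DEMO_CALLTYPE getTypeOfNextNewPlugin() noexcept;
};

static DemoProcessor::WrapperType demoNextType = DemoProcessor::wrapperUndefined;

void DEMO_CALLTYPE DemoProcessor::setTypeOfNextNewPlugin (WrapperType type) noexcept
{
    assert (type == wrapperUndefined || type == wrapperRTAS);
    demoNextType = type;
}

DemoProcessor::WrapperType DEMO_CALLTYPE DemoProcessor::getTypeOfNextNewPlugin() noexcept
{
    return demoNextType;
}
```

On 64-bit builds and non-MSVC compilers the macro expands to nothing, so the sketch compiles everywhere; the mismatch itself only shows up on 32-bit MSVC builds that mix /Gz and non-/Gz files.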
I have a question: what is the best/most elegant way to declare a float vector with JUCE so that its contents are guaranteed to be aligned, and then use it with the FloatVectorOperations class? I have used aligned_malloc and the __m128 type a few times before, but I imagine there is another way to do it in JUCE if we want to use this new class, since the functions just take plain float pointers…
It should work across x86 vendors seeing that it’s just an instruction set; AMD have supported SSE for some time iirc. ARM is a different story, though: it has NEON rather than SSE.
Edit: Though I wouldn’t count on every manufacturer other than Intel supporting >SSE2!
I’m messing around with a profiling app trying to get a feel for the potential benefits of these functions, but I’m not sure if I should be setting a flag or something, because in my code, the following . . .
Only reports that I’m using Apple equipment, and not vDSP (unless I’m fundamentally misunderstanding something). I’ve also been messing about with posix_memalign, using wacky alignments to see if I get a drop in performance, which I would expect but do not see. Do I have to do anything special to enable the vector ops?
@Williamk (on POSIX)
posix_memalign ((void**) &heapData, 16, sizeof (float) * numel);
…where heapData is a float*
JUCE_USE_VDSP_FRAMEWORK is used internally. You can set it yourself if you need to override the default behaviour, but there’s no point in checking it in your own code.
If you use any mainstream new or malloc function to allocate a block whose size is a multiple of 8, then you can be pretty damn confident that what you get back will be aligned to at least 8 bytes. Any allocator that didn’t do that would be very unusual.
SSE aligned loads and stores require 16-byte alignment, AFAIK.
You’re safe on Mac, but not on Windows, where you need to use _aligned_malloc and _aligned_free.
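Putting the suggestions above together, a cross-platform aligned allocation could look roughly like this. It is only a sketch: allocateAlignedFloats, freeAlignedFloats and isAligned are hypothetical helper names, not part of JUCE, and the code assumes _aligned_malloc/_aligned_free on Windows and posix_memalign elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free
#if defined(_WIN32)
 #include <malloc.h>  // _aligned_malloc, _aligned_free
#endif

// Hypothetical helper: returns a block of numFloats floats aligned to
// 'alignment' bytes (16 for SSE aligned loads/stores), or nullptr on failure.
inline float* allocateAlignedFloats (std::size_t numFloats, std::size_t alignment = 16)
{
    // posix_memalign needs a power of two that is a multiple of sizeof (void*).
    assert (alignment >= sizeof (void*) && (alignment & (alignment - 1)) == 0);

   #if defined(_WIN32)
    return static_cast<float*> (_aligned_malloc (numFloats * sizeof (float), alignment));
   #else
    void* p = nullptr;

    if (posix_memalign (&p, alignment, numFloats * sizeof (float)) != 0)
        return nullptr;

    return static_cast<float*> (p);
   #endif
}

// Memory from _aligned_malloc must go back through _aligned_free;
// memory from posix_memalign is released with plain free.
inline void freeAlignedFloats (float* p) noexcept
{
   #if defined(_WIN32)
    _aligned_free (p);
   #else
    free (p);
   #endif
}

// Handy when profiling: confirms what alignment a pointer really has.
inline bool isAligned (const void* p, std::size_t alignment) noexcept
{
    return (reinterpret_cast<std::uintptr_t> (p) % alignment) == 0;
}
```

A pointer obtained this way can be passed to the FloatVectorOperations functions like any other float*; isAligned is a quick way to check whether a buffer from plain new or malloc happens to be 16-byte aligned on a given platform.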
Can I request a convertFloatToFixed() (_mm_cvtps_epi32) and maybe a floor() function?
The floor function isn't a requirement since it can be approximated by simply casting to int and back again once I have the float to fixed conversion at hand, but could be nice to see in there :)
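For reference, a scalar sketch of the cast-based floor mentioned above (truncateToInt and floorViaTruncation are hypothetical names, not proposed JUCE API). One thing to watch: a C++ cast truncates towards zero, which matches _mm_cvttps_epi32, whereas _mm_cvtps_epi32 rounds according to the current rounding mode (round-to-nearest by default). Truncation alone also gives the wrong answer for negative non-integers, so the sketch adds a correction step.

```cpp
#include <cassert>

// Hypothetical helper: float to int by truncation towards zero,
// the scalar equivalent of what _mm_cvttps_epi32 does per lane.
inline int truncateToInt (float x) noexcept
{
    // The cast is only well-defined when the value fits in an int.
    assert (x > -2147483648.0f && x < 2147483648.0f);
    return static_cast<int> (x);
}

// Hypothetical helper: floor built from the truncating conversion.
// Casting to int and back rounds towards zero, so for negative
// non-integers the result is one too high and must be corrected.
inline float floorViaTruncation (float x) noexcept
{
    const float truncated = static_cast<float> (truncateToInt (x));
    return (truncated > x) ? truncated - 1.0f : truncated;
}
```

If only non-negative inputs are expected, the correction can be dropped and the plain cast round trip is enough.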