Some assembly required


#1

What do people use for assembly? I’m doing some thick and chunky processing (multiband EQ and distortion for starters), and I know I could knock off some serious overhead per sample with assembly.

I’ve done assembly programming before, so user-friendly isn’t 100% critical, though what I did was on old-school processors.

I’m saving up for the Juce licensing, so free or cheap is a plus, though I don’t mind paying for something that’s excessively useful in general.

Also what are the disadvantages? I’m not even sure if/how float is handled in assembly…

me.isSickOfSeeingCPUMeterOverValue(200) == true;

Dave Heinemann


#2

You don’t have to buy anything - most C++ compilers will let you stick some inline asm in there.

I hate writing asm myself, it’s a real pain, and you normally end up with something less efficient than what the compiler would have created from your C version…


#3

I like this little function from musicDSP. Fastest float to int function ever!

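// MSVC-style inline asm, 32-bit x86 only
inline int truncate(float flt)
{
  int i;
  static const double half = 0.5;
  __asm
  {
     fld flt      // push flt onto the FPU stack
     fsub half    // subtract 0.5
     fistp i      // store as int (current rounding mode) and pop
  }
  return i;
}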
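
For anyone without a 32-bit MSVC build handy, here is a rough portable sketch of what the routine above computes, assuming the FPU is in its default round-to-nearest mode. The names subtractHalfThenRound and plainCast are made up for illustration, and std::nearbyint stands in for fistp; it is not the original asm.

```cpp
#include <cmath>

// Hypothetical stand-in for the asm routine: subtract 0.5, then round
// using the current rounding mode (round-to-nearest-even by default).
inline int subtractHalfThenRound(float x)
{
    return static_cast<int>(std::nearbyint(static_cast<double>(x) - 0.5));
}

// Reference behaviour: a plain cast truncates toward zero.
inline int plainCast(float x)
{
    return static_cast<int>(x);
}
```

For non-integer positive inputs such as 1.7f both functions give 1. They differ at some exact integers such as 1.0f or 3.0f (0 and 2 versus 1 and 3), and for negative inputs such as -1.5f (-2 versus -1), so the trick only matches a true truncation for a restricted range of inputs.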

#4

Is it faster than roundFloatToInt()?


#5

Yes… remember that we talked about this a while ago. I ran a test which came out:

Plain cast: 1421 ticks
Asm Truncation: 797 ticks
Juce Rounding (-0.5f to get a truncation rather than rounding): 953 ticks

Per 100 000 000 operations if I recall. Peanuts in the big scheme of things. (But if you use truncation to calculate interpolation values many times per sample it sure adds up.)
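
A test like this could be set up roughly as follows. This is only a sketch, not the harness that produced the numbers above: timeConversion is a hypothetical helper, it reports milliseconds from std::chrono rather than ticks, and the input sequence is arbitrary.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: calls `convert` `iterations` times on a changing
// float and returns the elapsed time in milliseconds.
template <typename Convert>
double timeConversion(Convert convert, std::int64_t iterations, long long& checksum)
{
    const auto start = std::chrono::steady_clock::now();

    long long sum = 0;
    float x = 0.25f;
    for (std::int64_t n = 0; n < iterations; ++n)
    {
        sum += convert(x);
        x += 0.37f;                      // vary the input between calls
        if (x > 1000.0f) x -= 1000.0f;   // keep it in a modest range
    }

    const auto stop = std::chrono::steady_clock::now();

    // Handing the sum back to the caller makes the results observable,
    // so the loop is less likely to be optimised away.
    checksum = sum;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Call it once per conversion routine with the same iteration count, e.g. timeConversion([](float v) { return static_cast<int>(v); }, 100000000, checksum), and compare the returned times. The checksums also show whether two routines actually agree on the results.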


#6

Rock: Sadly your code is probably the fastest one, but it suffers from the FIST bug. Please have a look here

Basically you have to add a FRNDINT instruction between fsub half and fistp.

If you have SSE2, you can do it with a single instruction, cvttsd2si.
Note that your code depends on the FPU state (and changing that state is slow), whereas cvttsd2si always truncates.
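
If you would rather not write the asm by hand, the same instruction is reachable through compiler intrinsics. A minimal sketch, assuming an x86 target with SSE2 and a compiler that ships emmintrin.h; truncateSse2 is just an illustrative name:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Truncate toward zero via cvttsd2si. The result does not depend on
// the x87 rounding mode.
inline int truncateSse2(float x)
{
    // Widen to double, place it in the low lane of an SSE register,
    // then convert with truncation.
    return _mm_cvttsd_si32(_mm_set_sd(static_cast<double>(x)));
}
```

Since the input here is a float, the single-precision variant _mm_cvtt_ss2si(_mm_set_ss(x)) from xmmintrin.h avoids the widening step. On x86-64 builds a plain cast typically compiles to this instruction family anyway, so it is worth checking the compiler output before hand-tuning.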


#7

I’m sure it does, but it works ok in practice in the phase/interpolation context.

I posted it just as an example of inlined assembly though, I have no idea what’s going on.


#8

I remembered reading something about the compiler not handling it itself, but I guess they failed to mention that the necessary tool is included with any compiler worth a free download…

me.isAlwaysMakingThingsMoreComplicated() == true;


#9