Resolving denormal floats once and for all

Hi,
Which of you has not struggled with denormalized floating point numbers? I hate them, for many reasons. When I started doing audio DSP I stumbled upon them at once, but what I didn’t realize right away was that not all platforms suffer from the phenomenon as much as x86 does. In fact, I am pretty sure the old PPC didn’t even slow down. How it is on ARM I don’t know. x64? SIMD? Various DSPs?

I have stumbled upon several implementations for un-denormalization, some of which are faster than others.

I have also stumbled upon programmers who have no idea when to expect denormalized numbers to appear, and thus try to “remove” them all the time, with an unnecessary performance penalty as a result.

As there is a disproportionate number of audio DSP programmers on this forum, I propose that those of us who are interested put our heads together and come up with a set of detection techniques and remedies. This need not be a huge “project”, but I’m pretty sure many out there don’t really master numerical stability. With a bit of luck jules may even include some code in juce if it’s good enough.

How about it?

There is a JUCE_UNDENORMALISE macro afaik. Maybe try using that.

There is a “simple” solution to that on modern processors (all processors with a 64-bit architecture),
BUT it forces you to use only SSE instructions for your computations.

Two flags of the MXCSR register can be set to deal with denormals / underflows:
DAZ (Denormals Are Zero) treats the inputs of a floating point instruction as zero if they are denormal;
FTZ (Flush To Zero) sets denormal results to zero.

and all problems should go away.

http://software.intel.com/en-us/articles/x87-and-sse-floating-point-assists-in-ia-32-flush-to-zero-ftz-and-denormals-are-zero-daz/
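
In practice, a minimal sketch of setting both flags through the SSE intrinsics headers might look like this (the function name is mine, and I’m assuming the DAZ macros live in the SSE3 header pmmintrin.h on your toolchain):

#include <xmmintrin.h>	// _MM_SET_FLUSH_ZERO_MODE / _MM_FLUSH_ZERO_ON
#include <pmmintrin.h>	// _MM_SET_DENORMALS_ZERO_MODE / _MM_DENORMALS_ZERO_ON

// Call this once at the start of every thread that does floating point DSP work.
static void enableDenormalFlushing(void)
{
	_MM_SET_FLUSH_ZERO_MODE( _MM_FLUSH_ZERO_ON );          // FTZ: denormal results are flushed to zero
	_MM_SET_DENORMALS_ZERO_MODE( _MM_DENORMALS_ZERO_ON );  // DAZ: denormal inputs are treated as zero
}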

Have you tried it yourself? By the looks of it, it might be optimized away by any decent compiler.

#define JUCE_UNDENORMALISE(x)   x += 1.0f; x -= 1.0f;

Also, a number as huge as 1.0 may very well destroy any value that is small enough.
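
To illustrate (a quick example of my own, not from the post above): with 32-bit floats, anything much smaller than float epsilon (about 1.2e-7) relative to 1.0 is rounded away entirely, denormal or not:

#include <stdio.h>

int main(void)
{
	float x = 1.0e-20f;		// a perfectly normal (non-denormal) float
	x += 1.0f; x -= 1.0f;	// what JUCE_UNDENORMALISE(x) expands to
	printf("%g\n", x);		// prints 0: the small value is gone
	return 0;
}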

[quote=“Chaotikmind”]There is a “simple” solution to that on modern processors (all processors with a 64-bit architecture),
BUT it forces you to use only SSE instructions for your computations.
[/quote]Can one really force the compiler to do that? If I set a compilation switch to use SSE, can I be sure it will use nothing but SSE?

The only obvious way I see to force the use of SSE is compiling for x64 (it might only work on Windows).
Edit: not completely sure of that, needs some verification.

[quote=“Chaotikmind”]The only obvious way I see to force the use of SSE is compiling for x64 (it might only work on Windows).
Edit: not completely sure of that, needs some verification.[/quote]
Yeah, I’m not 100% sure either. That’s the vague impression I’ve got.

It won’t help in 32-bit compiles though. I suspect that denormalized floats are a problem only in 32-bit Intel builds…

Hi there…

I had to deal with this problem a few months ago; here’s what I did.

At the top of the files where I suspected denormals caused problems, I added this:

#include <assert.h>		// assert also works in C
#include <xmmintrin.h>	// SSE instructions

#ifndef _WIN64 // x64 architecture has built-in SSE2.
	#if _M_IX86_FP != 1 && _M_IX86_FP != 2
		#error SSE instructions must be enabled.
		//	This code will have terrible performance unless you compile it with
		//	SSE or SSE2. Please go to project properties for this file, in
		//	"C/C++ -> Code Generation -> Enable Enhanced Instruction Set" and
		//	select SSE or SSE2.
	#endif
#endif // _WIN64

#ifdef _DEBUG
	#ifdef CHECK_DENORMAL
		#error Oops... This macro seems to be used elsewhere!
	#endif
		static void CheckDenormal()
		{
			assert( _MM_GET_FLUSH_ZERO_MODE(0) == _MM_FLUSH_ZERO_ON );
			// If you get an assertion here, it's because the processing
			// thread (this one) is not flushing denormal numbers to zero.
			// This will cause the code in this file to be very slow.
			// Make sure the processing thread calls the following macro 
			// before arriving here. 
			//		"_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);"
			//		(note: you'll need to #include <xmmintrin.h>)
		}

	#define CHECK_DENORMAL CheckDenormal()
#else // _DEBUG
	#define CHECK_DENORMAL // defined out of existence
#endif // _DEBUG

Then, in every method where I want to be sure that I don’t have denormals around, I add

CHECK_DENORMAL; // mind the final semicolon!

If the project is not set up properly, you will get compile errors with a nice comment telling you what to do.
Once you’ve managed to compile, if you actually have a denormal problem, you’ll get an assertion (in debug builds only, obviously), also with a nice comment telling you what to do.
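
For context, a hypothetical processing function using it could look like this (the function name and the trivial gain are mine, purely illustrative):

static void processBlock(float* samples, int numSamples)
{
	CHECK_DENORMAL;	// asserts in debug builds if this thread is not flushing denormals to zero

	for (int i = 0; i < numSamples; ++i)
		samples[i] *= 0.5f;	// placeholder for the real DSP work
}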

A few more remarks:
[list]
[*]The above code is Windows C code, not cross-platform C++, so you may need to tweak it a bit until it works…
[*]The above code only affects the FTZ flag (flush to zero). Similar things can be done with the DAZ flag (denormals are zero), but I don’t know the exact syntax. Google can probably help you with that, though (see the sketch just after this list).
[*]If you’re compiling for Intel, using IPP, you may want to look up “ippSetFlushToZero” and “ippSetDenormAreZeros” in Google. (I’ve never managed to really make these work myself, so if you do, please let me know.)
[*]Keep in mind that the denormal settings of one thread are independent of the denormal settings of another thread. In other words: setting FTZ and DAZ from a constructor is pointless if you call your processing method from a different thread.
[*]The above code relies on your target platform supporting SSE or SSE2 instructions. For x86 it’s optional; for x64, SSE2 is built in (you can’t turn it off, but why would you want to?). Note that SSE instructions provide you with the FTZ and DAZ flags, but your code is responsible for setting them to the values you want. By default, denormals are taken into account.
[/list]
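
Regarding the DAZ syntax mentioned above, here is a sketch of the analogous check (my addition, not part of the original code, and assuming the SSE3 header pmmintrin.h is available on your toolchain):

#include <assert.h>
#include <pmmintrin.h>	// _MM_GET_DENORMALS_ZERO_MODE / _MM_DENORMALS_ZERO_ON

static void CheckDenormalsAreZero(void)
{
	assert( _MM_GET_DENORMALS_ZERO_MODE() == _MM_DENORMALS_ZERO_ON );
	// If this fires, the current thread is not treating denormal *inputs* as zero.
	// Enable it with:
	//		"_MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);"
}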

Finally, if I may add my two cents:

Not quite sure. If I’m not mistaken, a compiler is allowed to optimize things away only if it can prove that the result (the value of x, in this case) is the same, which is only true when denormals cannot occur. I’d expect a decent compiler to optimize it away in only two cases:
[list]
[*]if the target platform has no support for denormals
[*]if the compiler has been told explicitly to use some approximate arithmetic (aka fast floating point) which can be expected to treat denormals as zero
[/list]
In these two cases, I want my compiler to optimize this away, so I think the fearless leader has been quite clever with this macro, after all.

But… to be fair, I haven’t tested it myself.

Hope it helps…
Cheers

Val
