Not at the moment, but it’d be fairly easy to add. (It’d mean that people would have to add another method to their juce plugins to take advantage of it though).

Just wondering if support for doubles has been added? I couldn’t find anything to suggest it has, but I’m probably looking in the wrong places.

There is no point adding support for incoming numbers as double precision, and anyone that tells you otherwise is just trying to sell you something you don’t need. A floating point number is more precise than any analog to digital converter already. You may internally need to use a double precision number for summing and integrating state variables, but the input and output as float is way more than needed already.

I’m sorry to jump in 8 months later, but I disagree with Andrew.

The fact that float has a bigger dynamic range than any ADC is irrelevant as distortion/noise is added with internal calculations, and as a float has dynamic range of 25 bits (150 dB), that distortion and noise are not as low as one might think.

If your plugin runs in double precision, then when converting from double (64-bit) to float (32-bit), a proper plugin should add dither to prevent any truncation distortion. That dither can accumulate, or worse - be heavily boosted (some guitar amp plugins have such heavy limiting that any dither noise underneath would actually become highly noticeable during quiet passages).

In addition to that, there is virtually no processing overhead with double on modern 64 bit machines (higher memory consumption though).

So there is some sense in having an audio path that is double throughout, and there is some sense implementing this VST feature.

Izhaki, I’m not sure where to begin.

So let’s start with the basics. Floating point numbers have a dynamic range from 2^128 down to 2^-127, and 20 * log10 (2^128 / 2^-127) is roughly 1500 dB [1] (sorry, I transposed the 1 and 5 previously to make it 5100 dB, which is incorrect), not the 150 dB figure you have stated. I am guessing what you meant to say was precision, not dynamic range. So let’s agree that a 32-bit float has somewhere from 24 to 25 bits of precision [2] (it depends on how you count the sign bit and the leading 1, which can never be zero so doesn’t really count as a full bit).

Ok, now let’s turn to a real world signal being recorded. Even with the best acoustic environment, the best mics, and the best converters, you still have to deal with Johnson-Nyquist noise (unless you want to record at near absolute zero kelvin), which is at the level of -124 dB at 100 kHz [3], or around 21 bits. So already our single precision float is going great guns: no problem thus far, as you have 3 or 4 bits of natural dither from actual thermal hiss. Our ears are also victims of physics, so the same thermal issues apply. In other words, there is a limit to the dynamic range of human hearing, and a single precision float handles that too.

Now processing audio is a different matter. As I already pointed out, internally to an algorithm you may need to use double precision numbers, and if you really want to you could dither back down to single precision (although it is non-trivial), but when passing information between devices single precision is fine.

Next up, processing overhead. With an Intel CPU you can do four single precision floating point operations at once, versus two double precision, so already your processing power is halved if you use doubles. Then, if all your numbers are doubles, you halve how many numbers you can store in the innermost cache. With Analog Devices and Texas Instruments chips you have a similar situation: double precision computations take twice as long as single precision ones.

It is much easier to just search and replace “float” with “double” in your code and be done with it, but this is wasteful and lazy. If you can’t figure out where a float is needed and where a double is needed, you are best off sticking with double; even horrible DSP deformities like a direct form 1 biquad work ok with double precision numbers (well, you still can’t modulate them). But please don’t brag about being lazy by putting something on your web page saying “all 64-bit signal path”, since that says to me “I’m not smart enough to work out how to do things properly so I’m just going to waste your cpu and memory and try and convince you it’s in your benefit”.

[1] http://en.wikipedia.org/wiki/Dynamic_range

[2] http://en.wikipedia.org/wiki/Single-precision_floating-point_format#IEEE_754_single-precision_binary_floating-point_format:_binary32

[3] http://en.wikipedia.org/wiki/Johnson–Nyquist_noise#Noise_power_in_decibels

Hi Andrew,

So I’ve been into these sorts of debates loads. As someone who is teaching this and even has a small related publication, I hope I know a thing or two and would actually love you to prove me wrong. So I’m happy to keep this discussion going until we are both satisfied.

You are correct about a precision of 25 bits (23 Mantissa + 1 sign + 1 implied bit). Now I think we disagree on the definition of dynamic range, which is clearly a thing to not agree about.

If you compare floating point to fixed point you can say that floating point has a larger dynamic range (much higher and lower values can be represented). Indeed, articles on float vs fixed often use ‘dynamic range’ for the exponent and ‘precision’ for the mantissa. But if we talk about floating point numbers representing audio, then we have to take into account that although we can represent very big and small values, the audio itself is confined to a much smaller precision, within a huge range. This precision, in audio terms, is equal to the actual dynamic range of the system.

It is true that the range of numbers provided by floating point can be more than 150 dB apart. If we speak in strict dBFS terms then yes - one sample can be at 0 dBFS while another is at -700 dBFS (0 dBFS stands for the biggest number float can represent). But when you mix these two samples together your result is limited to 25 ‘meaningful’ bits (ie, bits that carry a calculation based sequence of 0s and 1s) - the 8 bit exponent just scales them into a range of numbers.

Consider a delay line where each echo is 6 dB lower than the previous one (imagined in binary - the mantissa will stay the same, but the exponent will go down by 1 with each echo). The wet signal will actually dive more than 150 dB in level within the delay. But when mixed with the dry signal, you cannot have the result extending 25 bits below the peak. The peak of the dry signal sets the top, and anything 25 bits below it will have to either be rounded or lost. In technical terms, if you mix/sum these two binary samples:

1 0000 0000 0000 0000 0000 0000 (all of these 25 bits are the mantissa)

0. 0000 0000 0000 0000 0000 0000 0000 0000 1 (you can get such a number with the exponent)

There are simply not enough bits in the mantissa to represent the result and you’ll end up with the first number instead - your low level signal is lost. The two samples ARE more than 150 dB apart, but their sum isn’t!

You can give a similar example with a fader - if 1.0f is divided by 3, the result is limited to 25-bit precision. This practically means rounding errors (distortion) at -150 dB.
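The fader case can be checked numerically. What follows is only a minimal sketch, assuming the gain is applied in double and the result is then stored as a 32-bit float; the helper name `floatRoundingErrorDb` is made up for illustration and is not part of any library.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: relative error, in dB, of storing sample * gain in a
// 32-bit float instead of keeping it in a 64-bit double.
double floatRoundingErrorDb(double sample, double gain)
{
    assert(sample != 0.0 && gain != 0.0);
    const double exact = sample * gain;               // reference value in double
    const float rounded = static_cast<float>(exact);  // what a float signal path stores
    const double relative = std::fabs(static_cast<double>(rounded) - exact) / std::fabs(exact);
    // If the product happens to be exactly representable, relative is 0 and
    // the result is minus infinity (no rounding error at all).
    return 20.0 * std::log10(relative);
}
```

Calling `floatRoundingErrorDb(1.0, 1.0 / 3.0)` gives about -150.5 dB, in line with the figure above. Other gain values land elsewhere within a few dB of that, depending on where the product falls between two neighbouring float values.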

Although we can be a meter away from a 130 dBSPL source and we will clearly hear it (some would say too clearly), as 120 dBSPL is the threshold of pain we say our ears have 120 dB of dynamic range. In the same way I believe there’s no point in talking about huge dynamic range if any processing of a sample results in a precision limit of 25 bits (150 dB).

Another argument would be the mathematical view - there is nothing in Nyquist’s theorem that relates to dynamic range (and some people say that scientifically speaking digital audio has unlimited dynamic range). But we clearly get (rounding/truncating/quantisation) distortion at some level and that level defines the noise level in our signal-to-noise ratio which is then assigned to the dynamic range of the system. This level is -150 dB from peak signal in float (regardless of how high or low that peak is, ie -300 dBFS or -1000 dBFS).

(And by the way, I can say with next to certainty that your 5100 dB calculation is wrong; on the most basic level it could work for a 1 bit mantissa, but you have 25 meaningful bits in addition to the 8 bit exponent. I also think your calculation would work with a decimal floating point, but the floating point is binary - but I might be speaking rubbish. I did this calculation with Nika Aldrich something like 7 years ago, and if I remember correctly two different calculations yielded something around 1600 dB; I’ll have to dig in my email box to see if I still have it and what exactly was going on there.)

[quote]Ok, now let’s turn to a real world signal being recorded. Even with the best acoustic environment, the best mics, and the best converters, you still have to deal with Johnson-Nyquist noise (unless you want to record at near absolute zero kelvin), which is at the level of -124 dB at 100 kHz [3], or around 21 bits. So already our single precision float is going great guns: no problem thus far, as you have 3 or 4 bits of natural dither from actual thermal hiss. Our ears are also victims of physics, so the same thermal issues apply. In other words, there is a limit to the dynamic range of human hearing, and a single precision float handles that too.

Now processing audio is a different matter. As I already pointed out, internally to an algorithm you may need to use double precision numbers, and if you really want to you could dither back down to single precision (although it is non-trivial), but when passing information between devices single precision is fine.[/quote]

Yes the best Mic’s SNR is 110 dB, our ears are 120 dB dynamic range, the best ADC is 122-124 dB dynamic range… But whether you are using Pro Tools or Logic or any other system that uses float to represent samples, nearly every mathematical function on a sample value (let it be gain or mixing of samples) generates rounding distortion at -150 dB. And you are processing all over the shop if you’re mixing, for example.

So this is nothing to care about if all you do is apply gain as the resultant distortion will end up below the 24 or 16 integer bits the audio is going to end up at. But it is the accumulation of this distortion, and the distortion of a distortion of a distortion that can get into ‘the audible range’.

Now how exactly is getting 32 bits, going double precision inside a plugin, then bit-reducing back to 32 bit any better than having 64 bit throughout? When going from 64 to 32 you get either rounding distortion or dither noise, and both will be around -150 dB. On a double precision system your noise floor is at -325 dB (54 bits), and that’s where your rounding distortion or dither will be added.

I do completely agree that “when passing information between devices single precision is fine”, but it is as fine as 44.1 kHz 20 bit integer audio - if no processing is being done, that’s all you’ll need. There is no point whatsoever in using floats if all you do is pass information. So we really are talking about the processing side of float, not its passing qualities.

OK, I’m neither a great expert nor an authority on that, but to my knowledge the 64 bit Intel chips within our DAWs place float numbers into double registers anyway - there’s virtually no difference whether you’re using float or double (only true for 64 bit machines). Am I wrong?

OK, I promise you I’m not going to do this. But from a pure digital audio point of view, it is more correct to go double precision and then dither back to 32 bit than to leave the rounding distortion you get from single precision signal path. I don’t think you should refer to people who do this as not smart enough, particularly as some of them work for respectable companies like NEVE.

Having said all that, I do see where you are coming from, and to be frank, I would probably give others the same advice as yours - just stick to single precision. I’m not even getting into the business of whether, if you are already going double precision, you might just as well oversample in your plugin. Some people take the practical view (you can’t hear rounding distortion or low level aliasing, and a 3 dB boost on an EQ will have much more effect on your audio), others take the digital-audio-purist view (do whatever is needed to ensure the best audio quality).

But really, back to what this thread has started with - if VST provides a double precision signal path, why not support it? It just nulls our discussion altogether - you don’t have to go double and then single, you just get the double to begin with.

Doubles are superior to floats during audio processing for all the reasons that 16-bits per color component image layers are superior to 8-bits per color component image layers.

For digital audio workstation-like software that applies a multitude of effects and filters to multiple channels of audio data, performing intermediate calculations using double-precision samples and converting down to single precision at the very end reduces distortion and quantization errors introduced in the audio pipeline.

While it is true that floats are probably all you need for one or two stereo streams with a small number of filters applied for real-time performance, having the ability to process audio data in double-precision floats is still helpful in a variety of cases.

I agree 100% with what TheVinn said.

And now, just to clarify on the dynamic range of a 32-bit floating point system (I remembered it not being straightforward 7 years ago, and indeed it isn’t):

The dynamic range depends on whether or not one takes into account denormalized values. Given that with relation to audio denormalized values involve lower precision that brings up the rounding distortion level, we exclude them when talking about dynamic range.

As such, the dynamic range of a 32-bit floating point number (binary32/IEEE 754) can be calculated in two ways. The first uses the exponent (which can shift the binary point by 254 binary digits, as 2 exponent values are reserved), so the calculation is 254 * 6.02 = 1529 dB. The second way is to calculate the difference between the highest and lowest normalized values, which gives 1529 dB as well.

To this you add 6dB of the sign bit to get 1535 dB of dynamic range.

Note to self: 1.0f or 0 dBFS Integer or 0 dBr is thus -770.6 dBFS float; you need 2^128 (340282366920938463463374607431768211456) signals at 0 dBr to clip a floating point system.
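Both calculations are easy to reproduce. This is a rough sketch rather than anything authoritative: `bitsToDb` and `normalisedRangeDb` are names invented here, and the second one simply reads the largest and smallest normalised values from `std::numeric_limits`, so denormals are excluded as described above.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// dB equivalent of a number of binary digits (about 6.02 dB per bit).
double bitsToDb(int bits)
{
    assert(bits >= 0);
    return 20.0 * std::log10(2.0) * bits;
}

// Ratio, in dB, between the largest and the smallest normalised positive
// value of a floating point type T (denormals excluded).
template <typename T>
double normalisedRangeDb()
{
    const double logMax = std::log10(static_cast<double>(std::numeric_limits<T>::max()));
    const double logMin = std::log10(static_cast<double>(std::numeric_limits<T>::min()));
    return 20.0 * (logMax - logMin);
}
```

`bitsToDb(254)` and `normalisedRangeDb<float>()` both come out at about 1529.2 dB, while `bitsToDb(25)` gives the 150.5 dB precision figure used earlier in the thread.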

Izhaki: you’re getting dynamic range and precision mixed up.

TheVinn and Izhaki: you’ve both obviously got some bias on this topic, so let’s reduce it back down to practical basics which can easily be put to the test. Why don’t you try the following: generate 1000 channels of random double precision numbers, and scale each channel by a random value to simulate a 1000 channel mixer. Now use an all double precision signal path for one summation, but then go down to single at the end for final output. Then, for the approach I recommend, use only a double precision accumulator, but have both the input and output as single. Now calculate the difference between the two single precision final outputs, and report this error. Here is some pseudocode:

```
double ddsum = 0
double dfsum = 0
loop 1000 times to simulate a 1000 channel mix:
    double drandval = double precision random number scaled by a fixed but random channel mixer fader amount
    float frandval = (float) drandval
    ddsum += drandval
    dfsum += frandval
float fddsum = (float) ddsum
float fdfsum = (float) dfsum
double error = fabs (fddsum - fdfsum)
```

Now repeat this millions of times to simulate the length of a song in samples and report the maximum error you get, but divide it by the summed value so you get something relative to a maximum signal of 1. Most of the time the difference will be zero; the maximum I ever get is around 1.2e-7 (-138 dB), which is completely inaudible. Please go ahead and try any other signals you want: mix sines, white noise and real recordings in there and try to get a bigger error. You can’t. Have more channels, have fewer channels, whatever, just try it out and let me know how you go.
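For anyone who wants to run the experiment rather than argue about it, here is one possible C++ rendering of the pseudocode. It is a sketch under stated assumptions, not the code used for the numbers above: the function name `maxRelativeMixError`, the seed parameter, and the choice of uniform random samples in [0, 1) are all assumptions (non-negative samples are used so that the sum never cancels to near zero before the division).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: largest relative difference between a mix summed from
// double inputs and a mix summed from float inputs, both accumulated in
// double and both converted to float at the output.
double maxRelativeMixError(std::size_t channels, std::size_t samples, unsigned seed)
{
    assert(channels > 0 && samples > 0);
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);  // assumed signal: non-negative noise

    // One fixed but random fader gain per channel.
    std::vector<double> gains(channels);
    for (double& gain : gains)
        gain = uniform(rng);

    double worst = 0.0;
    for (std::size_t n = 0; n < samples; ++n)
    {
        double ddsum = 0.0;  // double inputs, double accumulator
        double dfsum = 0.0;  // float inputs, double accumulator
        for (std::size_t c = 0; c < channels; ++c)
        {
            const double drandval = gains[c] * uniform(rng);
            const float frandval = static_cast<float>(drandval);
            ddsum += drandval;
            dfsum += frandval;
        }
        const float fddsum = static_cast<float>(ddsum);
        const float fdfsum = static_cast<float>(dfsum);
        if (fddsum != 0.0f)
        {
            const double error = std::fabs(static_cast<double>(fddsum) - fdfsum);
            worst = std::max(worst, error / std::fabs(fddsum));
        }
    }
    return worst;
}
```

With these assumptions the two float outputs should never differ by more than one float step, so something like `maxRelativeMixError(1000, 2000, 1u)` should stay at or below roughly 1.2e-7. Swapping in signed signals or real recordings means deciding how to normalise when the sum gets close to zero.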

If you’re doing things like REAKTOR Core modules, then it makes sense to have 64-bit input/output signals.

For a VST plugin, nope.

Two questions:

- How does 1.2e-7 translate to -138 dB?
- What do you compare 1.2e-7 to? Because if you compare it to 1.2e-38 (which is roughly the smallest change in sample level in float), then it is a massive error.

Thanks

1.2e-7 relative to 1, the normalised summation value. In my tests the sum came out as say 65.3, and the error 7.8e-6

A false assumption. I have never actually used double precision samples, all of my work is real-time / live performance. If you want to follow this KVR thread, it might prove to be enlightening:

TheVinn: Why are you commenting on something you have little knowledge of then? Also the thread you pointed out contains nothing of interest and nothing relevant that hasn’t already been discussed in more detail here already [* at the time of posting, now the thread has expanded beyond 5 posts, which was its length when this comment was made]. Perhaps you should direct those people here instead.

For feedback loops between modules you also only need single precision numbers. Only the internal results of additions (accumulations) and subtractions need to be double precision; the input and output of modules, even with feedback, can be single precision. For example, you can realise a one pole low pass filter with the following code:

```
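f32 in0, g;                  // input sample and filter coefficient (single precision)
f64 v1 = 0;                  // filter state, kept in double precision between samples
v1 += g * ((f64)in0 - v1);   // feedback uses the double precision state directly
f32 out0 = (f32)v1;          // output is rounded back to single precision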
```

But you have just as good a final output by having a single precision “external” feedback loop instead of just using v1 internally:

```
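f32 in0, in1, g;                    // in1 carries the previous out0, fed back as single precision
f64 v1 = 0;                         // accumulator, kept in double precision between samples
v1 += g * ((f64)in0 - (f64)in1);    // the subtraction and accumulation are done in double
f32 out0 = (f32)v1;                 // output is rounded back to single precision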
```

So in a modular environment you would plug out0 back into the negative input in1 of the module, so all the buffers being passed around are f32, but internally you compute with double where needed. Again, go ahead and try it out and calculate the error between the full f64 version and the one that passes audio between modules as f32. You can use g = 1 - exp (-2 * pi * cutoff / samplerate), and you may need to handle denormal values.
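One way to run that comparison is sketched below. It is only an illustration under assumptions that are not in the post: the function name `maxOnePoleDifference` is invented, the test input is a 100 Hz sine, and denormal handling is left out because the input never decays.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical comparison: largest difference between the float output of a
// one pole low pass whose feedback stays in double, and one whose feedback
// is passed around as a float (out0 plugged back into in1).
double maxOnePoleDifference(double cutoff, double sampleRate, std::size_t numSamples)
{
    assert(cutoff > 0.0 && sampleRate > 2.0 * cutoff);
    const double pi = 3.14159265358979323846;
    const float g = static_cast<float>(1.0 - std::exp(-2.0 * pi * cutoff / sampleRate));

    double vInternal = 0.0;  // state of the version with double feedback
    double vExternal = 0.0;  // accumulator of the version with float feedback
    float in1 = 0.0f;        // previous float output, fed back externally
    double worst = 0.0;

    for (std::size_t n = 0; n < numSamples; ++n)
    {
        // Assumed test input: a 100 Hz sine, rounded to float like any f32 buffer.
        const double phase = 2.0 * pi * 100.0 * static_cast<double>(n) / sampleRate;
        const float in0 = static_cast<float>(std::sin(phase));

        vInternal += g * (static_cast<double>(in0) - vInternal);
        const float outInternal = static_cast<float>(vInternal);

        vExternal += g * (static_cast<double>(in0) - static_cast<double>(in1));
        const float outExternal = static_cast<float>(vExternal);
        in1 = outExternal;

        worst = std::max(worst, std::fabs(static_cast<double>(outInternal) - outExternal));
    }
    return worst;
}
```

For a call such as `maxOnePoleDifference(1000.0, 44100.0, 44100)` the difference should stay around the size of a float rounding step for a signal near full scale. Other inputs, cutoffs and decaying tails are worth trying before drawing conclusions.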

Andrew,

I just need to say something before I start. You say me and TheVinn are biased. I don’t think it is about being biased, I think it’s about what we believe is defending text book science, mathematical proofs and real life measurements (made by many great people over more than 60 years).

I cannot stress enough that if you are right in what you are saying then you should go out and make sure the digital audio community knows who you are, as you have just shown that the work and efforts of many, many people were and are completely for no good reason. You should submit papers to AES explaining your arguments. And I promise you that if you are right, you’ll have a place in the hall of fame of digital audio.

So I’m seriously interested in your discovery, but I can tell that you are very likely not seeing the whole picture. I have a feeling you are convincing yourself that your findings are right, while overlooking some critical steps.

```
double ddsum = 0
double dfsum = 0
loop 1000 times to simulate a 1000 channel mix:
    double drandval = double precision random number scaled by a fixed but random channel mixer fader amount
    float frandval = (float) drandval
    ddsum += drandval
    dfsum += frandval
float fddsum = (float) ddsum
float fdfsum = (float) dfsum
double error = fabs (fddsum - fdfsum)
```

Could you please explain the last line? It is clear you are finding the level difference between the sum of the float summation and double made float summation.

But for starters, you should really have 20log10(fddsum/fdfsum) to find the dB level difference between them. As I mentioned earlier, 0 > 1 is 6 dB, but 60000 > 60001 is far less than that. If you do find the dB difference between the measurements I’m sure you’ll be happy to see that these are extremely small, well below 1 dB (anyway, this should be the result).

My principal question is: what does the comparison between these two sums prove? That there’s a tiny difference between them? If this is the case, then you probably know that on a specific sample a sine wave and a square wave can have exactly the same sample level. On the same basis, I can show you identical or very close sample levels when comparing between Beethoven’s Violin Concerto and pink noise.

My big point is that I think you’re ignoring Fourier in simply comparing only the level of the two paths.

I have to make use of a key example in digital audio here. Could you please have a look at these graphs, and tell me if it is clear to you that the difference in sample value between each of the samples in the top and middle graphs is going to be very small? Never above -96dB? The last 8 of 24 bits? Never more than 0.000015258 if looked at from a float point of view (2^8/2^24)?

Now can you see how much damage these tiny level changes do to the overall signal? Can you see the 40 dB boost of some harmonics? Can you imagine how the second (truncated) and third (dithered) signals will have a different THD measurement?

Am I pointing you in the right direction?

And on floating point and dynamic range:

```
#include <iostream>
#include <math.h>
int main (int argc, const char * argv[])
{
    std::cout.precision(32);
    float iSample1 = 1.0f;
    float iSample2 = 1.0f / pow(2, 32); // shift the floating point 32 binary places to the left.
    float iSum = iSample1 + iSample2;
    std::cout << "S1 : " << std::fixed << iSample1 << std::endl;
    std::cout << "S2 : " << std::fixed << iSample2 << std::endl;
    std::cout << "Sum: " << std::fixed << iSum << std::endl;
    return 0;
}
```

Will output a Sum that is identical to S1: the contribution of S2 is lost entirely.

So does this clarify why the distance in dB between S1 and S2 cannot be considered as dynamic range in an audio system?

20*log10 (fddsum/fdfsum) = 1e-6 dB

Yes, this makes sense. Did you read the rest of my post?

Funny. If I do that it looks quite different (see attached images).

This is the Freemat/Matlab code I used, so maybe you can tell me if I did something wrong:

```
clear
close all
clc
SR = 44100;
quant24 = 2^24;
quant16 = 2^16;
t=(0:SR-1)/SR;
x = floor(quant24 * sin(2*pi*100*t))/quant24;
fx = fft(x);
fx = 20*log10(2*abs(fx(1:SR/2))/SR);
y = floor(quant16*x)/quant16;
fy = fft(y);
fy = 20*log10(2*abs(fy(1:SR/2))/SR);
f = (0:length(fx)-1);
f(1) = 1e-10;
figure;
semilogx(f,fx);
grid on;
title('100 Hz @ 24 Bit / 44.1 kHz')
xlabel('Frequency [Hz]');
ylabel('Magnitude [dB]');
xlim([2 22e3]);
figure;
semilogx(f, fy);
title('100 Hz @ 16 Bit / 44.1 kHz (truncated from 24 bit)')
xlabel('Frequency [Hz]');
ylabel('Magnitude [dB]');
grid on;
xlim([2 22e3]);
err16 = 20*log10(max(abs(x - y)));
disp(' ');
disp(['Max quantisation noise 16 bit: ' num2str(err16) ' dB']);
```

The code also outputs the maximum error which is surprisingly at about -96 dB.

Chris

Thing is though, we are talking about floating point numbers, not fixed point numbers, so even those more accurate graphs are irrelevant! You need to put a quantizer in the mantissa, which leads to a completely different result, as precision in effect increases by one bit with each halving of amplitude as compared to fixed point.
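To make the difference concrete, here is a small sketch of the two kinds of quantizer. Both function names are made up for this example, truncation (floor) is assumed to match the Matlab code above, and the mantissa quantizer keeps a chosen number of bits after `std::frexp` has split off the exponent.

```cpp
#include <cassert>
#include <cmath>

// Fixed point style quantizer: the step size is 2^-bits regardless of level.
double quantiseFixed(double x, int bits)
{
    assert(bits > 0);
    const double steps = std::ldexp(1.0, bits);  // 2^bits
    return std::floor(x * steps) / steps;
}

// Floating point style quantizer: only the mantissa is truncated to 'bits'
// binary digits, so the step size scales with the exponent of x.
double quantiseMantissa(double x, int bits)
{
    assert(bits > 0);
    int exponent = 0;
    // x = mantissa * 2^exponent, with 0.5 <= |mantissa| < 1 for non-zero x.
    const double mantissa = std::frexp(x, &exponent);
    const double steps = std::ldexp(1.0, bits);
    return std::ldexp(std::floor(mantissa * steps) / steps, exponent);
}
```

Taking a sample of 0.3 and the same sample 60 dB quieter (0.3 / 1024), the fixed point quantizer at 16 bits leaves a relative error of about 1% on the quiet one, while the mantissa quantizer gives the same relative error at both levels, a little over 1e-5.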