RMS detection issue - Will Pirkle - Designing Audio Effect Plug-Ins in C++

Has anybody here studied the book in the title?
I found a piece of code in it that doesn’t make sense to me. Could anyone help?

There is a method to detect the envelope, something like this:

float CEnvelopeDetector::detect(float fInput)
{
	switch (m_uDetectMode) // mode selector; member name assumed, not shown in the post
	{
	case 0: // PEAK mode
		fInput = fabs(fInput);
		break;
	case 1: // MS mode
		fInput = fabs(fInput) * fabs(fInput);
		break;
	case 2: // RMS mode
		fInput = pow((float)fabs(fInput) * (float)fabs(fInput), (float)0.5);
		fInput = (float)fabs(fInput);
		break;
	}
	// ... some more code
}

And I can’t understand the RMS mode. It’s actually something like (x*x)^0.5, which is exactly the same as |x|. Am I right or not?
So what is the difference (in that method) between the PEAK and RMS modes?

Thanks in advance for any help.

The RMS of a single sample is meaningless or, as you found out, it is just the magnitude at that point.

RMS is defined for a series of samples as
rms = sqrt ((1 / num) * sum (x[i] * x[i])) | i = 0..num-1

See Wikipedia: Root mean square
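
As a minimal sketch of the definition above (the function name blockRms and the choice to return 0 for an empty block are my own assumptions, not something from the book):

```cpp
#include <cmath>
#include <vector>

// Root mean square of a block of samples: sqrt((1/N) * sum(x[i]^2)).
// Returns 0 for an empty block (a convention chosen for this sketch).
double blockRms (const std::vector<double>& samples)
{
    if (samples.empty())
        return 0.0;

    double sumOfSquares = 0.0;
    for (double x : samples)
        sumOfSquares += x * x;

    return std::sqrt (sumOfSquares / static_cast<double> (samples.size()));
}
```

With a block of one sample this collapses to the magnitude of that sample, which is why the RMS case in the snippet does the same thing as the PEAK case.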

That’s right, exactly. That’s why I can’t understand that code. But the book looks like a serious book, and I found out that the code is used in some projects on GitHub. That’s why I thought there must be some sense in it, but I can’t find it.

If that snippet you posted is from the book, I would share the reservations I have read here on the forum towards that book. But I haven’t read it, so I shouldn’t allow myself to judge.

JUCE has a built-in function to compute the RMS value of an audio buffer, called AudioBuffer::getRMSLevel().
But you will probably have to combine more than one of those values to get a smooth result.

Be aware that if you combine multiple RMS values, you do the same thing as within a block: don’t average them linearly, but sum their squares, divide by the number of RMS values, and take the square root (this assumes the blocks all have the same length).
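
A small sketch of that combination step, assuming all blocks have the same length. The name combineRms is hypothetical; the per-block values could come from something like getRMSLevel(), but here they are just plain numbers:

```cpp
#include <cmath>
#include <vector>

// Combines RMS values measured over equally long blocks into one RMS value
// for the whole span: sqrt(mean(rms[i]^2)). Not a linear average.
double combineRms (const std::vector<double>& blockRmsValues)
{
    if (blockRmsValues.empty())
        return 0.0;

    double sumOfSquares = 0.0;
    for (double r : blockRmsValues)
        sumOfSquares += r * r;

    return std::sqrt (sumOfSquares / static_cast<double> (blockRmsValues.size()));
}
```

For two blocks with RMS 3 and 4 this gives sqrt(12.5), roughly 3.54, where a linear average would give 3.5.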


Hello, thanks Daniel,
I am trying to program my own limiter (I am experimenting).
I always thought a limiter works on peak detection, with 0ms attack, 0ms release, and a hard knee. But I found out that with those settings I get a really unpleasant sound.
So I wonder how to fix it. I think that maybe I shouldn’t use really “hard” peak detection (separately for each sample), and that I might get better results if I use the root mean square over several samples? Or maybe I should introduce a soft knee, or a longer attack time?

I compare my results with other plugins like Waves L1 or FabFilter Pro-L. I am really fascinated by how they made those plugins, and I wonder how to achieve a similar effect.

The problem with limiters with 0ms attack, 0ms release and a hard knee is that they basically clip the signal, pushing it towards a square wave (while retaining the sign), which acts as a distortion effect, very very unpleasant :) On the other hand, when setting the attack to higher values, fast peaks will find their way through the effect, and hence the limiter doesn’t do its job.
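
To illustrate the point about distortion, here is a rough sketch (function names are mine) of a limiter whose gain is recomputed from every single sample, next to a plain hard clipper. As far as I can tell the two produce the same output:

```cpp
#include <algorithm>
#include <cmath>

// Peak limiter with 0 ms attack, 0 ms release and a hard knee:
// the gain is derived from each sample's own magnitude, with no smoothing.
double instantLimit (double x, double threshold)
{
    const double magnitude = std::abs (x);
    const double gain = magnitude > threshold ? threshold / magnitude : 1.0;
    return x * gain;
}

// Plain hard clipper for comparison.
double hardClip (double x, double threshold)
{
    return std::max (-threshold, std::min (threshold, x));
}
```

Anything above the threshold is scaled back exactly onto it, so the tops of the waveform get flattened, which is what you hear as distortion.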

It helps to implement a look-ahead and fade in the gain reduction. The output signal will be delayed, but there won’t be any distortion (the release can be set to a normal value). However, look-ahead is not trivial to implement, and 90% of the internet will tell you to simply delay your analysis signal, which is not really look-ahead, as peaks will still come through.

Here’s a nice presentation about dynamic processing in general, which I found very helpful: http://c4dm.eecs.qmul.ac.uk/audioengineering/compressors/documents/Reiss-Tutorialondynamicrangecompression.pdf
Edit: I followed the magnitude approach when calculating the analysis signal (envelope), so no RMS, as the ballistics already take care of the time behavior.
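
One possible way to sketch the look-ahead idea offline: compute the gain each sample would need on its own, hold the minimum over the look-ahead window, smooth that with a moving average of the same length (the fade-in), and apply it to the delayed input. This is only an illustration under my own assumptions (the function name, the sliding-minimum plus moving-average combination, and the slow O(N * L) loops are all choices made for readability), not how any particular commercial limiter works:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Offline look-ahead limiter sketch. The output is delayed by `lookAhead` samples.
std::vector<double> lookAheadLimit (const std::vector<double>& input,
                                    double threshold,
                                    std::size_t lookAhead)
{
    const std::size_t n = input.size();
    const std::size_t window = lookAhead + 1;

    // 1. Gain each sample would need on its own to stay at or below the threshold.
    std::vector<double> required (n, 1.0);
    for (std::size_t i = 0; i < n; ++i)
    {
        const double magnitude = std::abs (input[i]);
        if (magnitude > threshold)
            required[i] = threshold / magnitude;
    }

    // 2. Sliding minimum over the most recent `window` required gains.
    std::vector<double> held (n, 1.0);
    for (std::size_t i = 0; i < n; ++i)
    {
        const std::size_t first = (i + 1 >= window) ? i + 1 - window : 0;
        for (std::size_t k = first; k <= i; ++k)
            held[i] = std::min (held[i], required[k]);
    }

    // 3. Moving average of the held gain (fades the reduction in and out),
    //    applied to the input delayed by `lookAhead` samples.
    //    Positions before the start of the signal count as gain 1 / input 0.
    std::vector<double> output (n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
    {
        double sum = 0.0;
        for (std::size_t k = 0; k < window; ++k)
            sum += (i >= k) ? held[i - k] : 1.0;

        const double smoothedGain = sum / static_cast<double> (window);
        const double delayed = (i >= lookAhead) ? input[i - lookAhead] : 0.0;
        output[i] = delayed * smoothedGain;
    }

    return output;
}
```

Because every held value inside the averaging window already includes the delayed sample’s own required gain, the smoothed gain should never be larger than what that sample needs, so the peak lands on the threshold instead of slipping through. Simply delaying the analysis signal does not give you that property.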

Great, thanks, I will study it for sure.
But back to your suggestion about look-ahead. As you said, the signal will be delayed (can I call it latency?). But when I experimented with the mentioned plugins (like Waves L1) I didn’t notice any latency. So I think they achieved that effect in some other way. Or maybe I’m misunderstanding something?

What do you mean by “didn’t notice any latency”? Did you measure it, or try to make it audible through live audio in some way?
According to this chart, the Waves L plugins definitely introduce some latency:

  • L1: 64 Samples
  • L2: 64 Samples
  • L3: 3840 Samples (!)

However, a latency of 64 samples might not really be noticeable, even with live audio signals, if all other latencies in your signal path are set quite low.

I also chose 5ms for my look-ahead limiter; it’s enough to get rid of the distortion artifacts. And even if you don’t report the latency to the DAW, it’s not that noticeable.

OK, I just measured Waves L1. It introduces latency. I was wrong. Sorry, I was sure it works without latency.

OK, but now the question is how many samples I should use for RMS detection, let’s say for a compressor?
Should I use the whole buffer, or maybe just a few samples, or a whole second?

I can take just 2 samples and calculate the root mean square, but I don’t think I can call that an RMS level. So how many samples make the detection RMS?

And on the other hand, is PEAK detection always for just one sample? Or maybe a little bit more?

I don’t know what other people do in this circumstance, but personally, I would use much more than a few samples. I would use at least 0.1 seconds, because a waveform has quiet sections around the zero crossings, and these will cause your RMS detection to fluctuate wildly.
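
A quick experiment that shows the fluctuation, as a sketch. The helper names, the 100 Hz test tone and the 48 kHz sample rate are my own choices for illustration:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Generates a unit-amplitude sine wave.
std::vector<double> makeSine (double frequency, double sampleRate, std::size_t numSamples)
{
    const double twoPi = 2.0 * std::acos (-1.0);
    std::vector<double> out (numSamples);
    for (std::size_t i = 0; i < numSamples; ++i)
        out[i] = std::sin (twoPi * frequency * static_cast<double> (i) / sampleRate);
    return out;
}

// Returns {lowest, highest} RMS measured over every full window of
// `window` consecutive samples, using a running sum of squares.
std::pair<double, double> slidingRmsRange (const std::vector<double>& x, std::size_t window)
{
    if (window == 0 || x.size() < window)
        return { 0.0, 0.0 };

    double sumOfSquares = 0.0;
    double lowest = 0.0, highest = 0.0;
    bool haveValue = false;

    for (std::size_t i = 0; i < x.size(); ++i)
    {
        sumOfSquares += x[i] * x[i];
        if (i >= window)
            sumOfSquares -= x[i - window] * x[i - window]; // drop the sample leaving the window

        if (i + 1 >= window)
        {
            const double rms = std::sqrt (std::max (0.0, sumOfSquares) / static_cast<double> (window));
            lowest  = haveValue ? std::min (lowest, rms)  : rms;
            highest = haveValue ? std::max (highest, rms) : rms;
            haveValue = true;
        }
    }

    return { lowest, highest };
}
```

For a 100 Hz sine at 48 kHz, a 1 ms window (48 samples) swings between a low value near the zero crossings and almost 1 near the peaks, while a 100 ms window (4800 samples, ten full periods) stays close to 0.707 throughout.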

Maybe that helps a little:

I started with 100ms when creating a compressor. However, once you have a fixed detection window length, the attack and release times of the ballistics (the compressor’s time behavior) are off. That means that a setting of 0ms isn’t zero anymore, but something between 0 and 100ms (depending on the RMS calculation technique, e.g. recursive RMS).
As I already said above, I use sample-wise magnitude calculation (0ms analysis) and calculate the ballistics in the decibel domain.

All of those decisions will make your compressor sound and behave differently; that’s why there are so many compressors out there! Just try different values and use the one you think sounds best! You should just be aware that a brick-wall limiter needs 0ms RMS, otherwise peaks will slip through.
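
A rough sketch of what sample-wise magnitude detection with ballistics in the decibel domain could look like. The class name, the branching one-pole smoother, the exp(-1 / (time * sampleRate)) coefficient formula and the -120 dB floor are all assumptions on my side, not a description of the poster’s actual code:

```cpp
#include <algorithm>
#include <cmath>

// Envelope follower: per-sample magnitude converted to dB, then smoothed
// with separate attack and release one-pole coefficients.
class DecibelEnvelope
{
public:
    DecibelEnvelope (double sampleRate, double attackMs, double releaseMs)
        : attackCoeff (coefficientFor (attackMs, sampleRate)),
          releaseCoeff (coefficientFor (releaseMs, sampleRate))
    {
    }

    // Feeds one input sample and returns the smoothed level in dB.
    double process (double x)
    {
        // Floor the magnitude at 1e-6 (-120 dB) so log10 never sees zero.
        const double levelDb = 20.0 * std::log10 (std::max (std::abs (x), 1.0e-6));

        // Rising level uses the attack coefficient, falling level the release one.
        const double coeff = levelDb > envelopeDb ? attackCoeff : releaseCoeff;
        envelopeDb = coeff * envelopeDb + (1.0 - coeff) * levelDb;
        return envelopeDb;
    }

private:
    // A time of 0 ms gives a coefficient of 0, i.e. the envelope follows instantly.
    static double coefficientFor (double timeMs, double sampleRate)
    {
        return timeMs <= 0.0 ? 0.0 : std::exp (-1000.0 / (timeMs * sampleRate));
    }

    double attackCoeff;
    double releaseCoeff;
    double envelopeDb = -120.0;
};
```

With both times at 0ms the envelope is exactly the per-sample level, so nothing slips through; any non-zero attack lets short peaks pass partly undetected, which is the trade-off described above.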

The book is bad, and what this snippet shows is that the author doesn’t even know the C math library, let alone the C++ one.

Change book, get another one, burn this one.

…the author doesn’t even know the C math library, let alone the C++ one

Hey, thanks for the answer, but what’s wrong with that snippet from a C/C++ math library point of view?

I am asking for my own education, because I’m quite new to programming, and to me that code looks quite normal (from a C++ point of view) apart from the doubts I described at the beginning.

Do you mean the Hungarian notation? Or that the cases shouldn’t be plain numbers but some defined values? Or what?

I believe @Matthieu_Brucher is talking about this:

fabs (fInput);

which is used for the absolute value of a double in C. It should be

fabsf (fInput);

which is the appropriate function for a float (in C). In modern C++ you’d be more likely to see this:

#include <cmath> // not math.h

/// .... 
std::abs (input); // same as fabsf() for a float, but has overloads for all the built-in numeric types

I strongly disagree with his opinion of “burn it and buy another”, since that book is pretty great as an intro to applied audio DSP and is not a C++ textbook (which it tells you in the preface). The only other books out there that come close to it are Zölzer’s Digital Audio Signal Processing and the text he edited called Digital Audio Effects (DAFX). And those require a bit more of a math background than the Pirkle texts.

TL;DR the book is fine - the code isn’t.


I’m also talking about

pow((float)fabs(fInput) * (float)fabs(fInput), (float)0.5);

Not knowing that sqrt is faster than pow… Or if he knows and he’s not giving the information, it’s even worse.

Especially when fabs(fInput) is even faster for that equation :)

Wow, that code really is atrocious. No wonder beginners struggle with C++ if that’s the kind of example they’re learning from…

Here’s how we’d have written it :)

float CEnvelopeDetector::detect (float level)
{
    auto absLevel = std::abs (level);

    if (mode == Mode::peak)       return absLevel;
    if (mode == Mode::meanSquare) return absLevel * absLevel; // MS = mean square
    if (mode == Mode::RMS)        return std::sqrt (absLevel * absLevel);

    return level;
}

Hello, many thanks for all your support. But I still can’t see the logical difference between this:
return std::sqrt (absLevel * absLevel);
and this:
return absLevel;

Maybe there would be a difference due to float accuracy, but I don’t think that’s the case.
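
For what it’s worth, here is a small sketch to compare the two expressions directly (the function names are made up for this test). The only differences I would expect are at the extremes of the float range, where the intermediate square underflows to zero or overflows to infinity, so nothing that matters at normal audio levels:

```cpp
#include <cmath>

// The "peak" path: plain absolute value.
float viaAbs (float x)
{
    return std::abs (x);
}

// The "RMS" path from the snippet: square, then square root.
float viaSqrt (float x)
{
    const float squared = x * x; // can underflow to 0 or overflow to infinity
    return std::sqrt (squared);
}
```

For example, viaSqrt (-0.5f) and viaAbs (-0.5f) both give 0.5f, while for 1.0e-30f the square underflows and viaSqrt returns 0, and for 1.0e30f it overflows and viaSqrt returns infinity. So the RMS branch is the PEAK branch with extra work and slightly worse behavior at the edges.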