Inverse FFT - how to synchronize the phase in each buffer block


#1

Is there any way to synchronise phase of output of inverse DFT in each buffer?
I mean: when I have input (to inverse DFT) source like vector: {s0=0 ,s1=0 , s2=0, s3=0 ..... s439=0. s440 = 1, s441=0.... zeros to the end}. Vector size is equal sample rate, and buffer size of FFT is also equal to sample rate. Then I should get simply sinusoid with freq 440 Hz.

And I get, but the end of the buffer is different than beginning of next buffer.

So when I send the output of inverse DFT to the speaker it sounds nasty with click on each buffer beginning.

Of course I know the „windowing functions” but it just gives me in/decreasing level on the begining and the end of buffer. It sounds better but with tremolo effect.

I’ve also heard about overlaping windowing, I haven’t tried that but I am pretty sure it also won’t give me clean signal. Probably it will give me some phasing/chorusing effect that will appear periodically on each begining and end of buffer. So I wonder how to repair that?

Could anyone help me? Is that possible at all to synchronise the end of the buffer with the beginning of next buffer?


#2

Actually, overlapping windows work well most of the time if you do not apply too much crazy modifications in the frequency domain. Just give it a try.

For further googling the terms overlap add (OLA) processing or short-time fourier transform (STFT) processing may lead you to better explanations. Sometimes this is also known as phase vocoder processing…


#3

You have to rotate the phase of that frequency bin to match the sinusoid’s phase at the end of the buffer.

The equation for your phase offset after one block (hopsize) should be the following
dPhi = frequency / sampleRate * hopsize * 2*pi


#4

Hello, thanks.
you cytated my befor thread :slight_smile:
But to the point. I suppose phase rotation could help, but not sure if it’s the solution. How to perform it? What I should multyply by dPhi = frequency / sampleRate * hopsize * 2*pi?
The input of my inverse FFT, or the output? Or Should I add it to something?
And will it work also for more complex signal? For example if I make real time forward FFT on some *.wave pop song, then cut some frequencies, and then back to inverse FFT?

I know for such things there are better to use some filters, but I am trying to do that just for fun and to understand how to use properly Fourier transform at all?

Actually also not sure what is “hopsize”?


#5

I need also try what seebk suggested.


#6

Have a look at phase vocoder processing, e.g. time stretching with a phase vocoder. this also requires adaption of the phase. BTW: Hop size is the amount of samples between two overlapping blocks.


#7

The right method actually depends on exactly what manipulation you are applying to your signal in the frequency domain. There is no one fits all answer.

If you are simply filtering out frequencies, then overlap add or overlap save techniques are the usual choice. This means if you want to work on a blocksize of e.g. 256 samples, then you usually append another 256 samples whith zeros (so called zero padding) and transform the resulting sample block with a size of 512 samples. Then you manipulate the 512 frequency bins, do the ifft and use the first 256 resulting samples as output samples. The other half is stored and will be added to the next block (which will again produce some samples to store and add in the following block and so on). This will result in a smooth result if you only filter frequencies.


#8

Thanks I need to thing about that.


#9

Also remember that you do not only need to overlap only two blocks. Depending on the desired quality you could use N overlaps of buffers shifted by 1/N of the FFT block size.


#10

This may also mean that you have a problem with what FFT does.
I hope you are not doing something like filtering in FFT space, but really just additive synthesis.


#11

Well, just to point this out, filtering in the frequency domain actually CAN be done and IS done, but of course doing stuff like just setting a single bin to zero to filter it out or anything like this will NOT produce any good result. However with carefully computed (short) impulse responses, a convolution in the frequency domain for applying filtering is absolutely possible and doesn’t required partitioned impulse responses as for an example used by the JUCE dsp::Convolution class. In this case of non-partitioned impulse responses, techniques with a single block of overlap as I described above are definetively one of the possible ways to go.

It would be quite helpful for a helpful answer if @pajczur would describe what exact processing he is applying. Until then we could just describe solutions for possible scenarios here.


#12

Any direct spectral filtering is bound to have artifacts. If you take n samples, you take the spectrum, you get n samples (not talking about zero-padding here, that’s another can of worms). If you multiply the spectrum with another array of size m, that means that you are doing a circular convolution in time, instead of doing a convolution of size n + m - 1. There is NO way to get around math.


#13

Zero padding was exactly what I mentioned above. If no zero padding is applied you’re absolutely right. I just didn’t want to confuse the thread author with words like circular and linear convolution so I gave a quick explanation on how it could be practically done.

Again I think a few pure DSP math tutorials that just talk about such things without actually implementing anything but explain the pure math behind this would be a great enhancement to the JUCE tutorials collection :wink:


#14

I think we need to speak these terms because LOTS of people are doing the wrong thing, even in tutorials. And that’s very bad.
With zero padding, you have an original signal of n samples, and then say you zero pad it to double the samples. You still get 2n samples in the spectrum. And then, you multiply it by something (which is also 2n). You should get 4n -1 samples, which you never get.

Then a FIR filter sone int he spectral domain, that’s something different, that’s just an implementation detail. It’s not filtering directly in the spectral domain, it’s a faster convolution, nothing more.

So there is a difference that has to be said, between designing a filter in the spectral domain and designing it int he time domain (FIR). The former DOESN’T work. Never. The latter of course works.


#15

OK, It’s my thread but I see there are a lot of talk behind me :slight_smile:
So I need to clean up some things:
All of what you are talking about is very interesting for me. But to be honest I am stupid (still) and most of those things I don’t understand. It’s a shame but I don’t even have an idea what is FIR, and other strange filters shortcuts. Of course I am going to learn all of that. But as a beginner I thought it’s good to first understand the most difficult process, and then all the rest like filter will be very easy to understand. And to be honest I’ve tried to understand Fourier transform from about year. I created my own implementation of DFT, FFT radix-2, FFT mixed radix with option to change the divider of radix. Even I created some strange FFT algorithm (that I think is my own invention, but probably I am wrong) that allows me to choose range of frequencies that I want to calculate, without need to calculate the freq out of range, but preparing (like reindexing) works so slow that, even though FFT is very fast, but changing frequency range is so slow that it’s impossible to change freq range in the real time.

OK but even I made all of that I am still even not sure what is the purpose to use Fourier transform :slight_smile: I think the only reasonable purpose is to make frequency analyser. Also I can imagine it could be handy in designing some filters curve, but I just imagine that, not sure how to do that. Maybe anyone could enlighten me, for what the hell we need all that FFT?

Ok, next thing what I am exactly do now with my FFT algorithms? Why I want to synchronise the phase between neighbour buffers? I feel stupid to say that, but only for testing my FFT algorithms, and YES I want to make real time filtering by sending signal to forward FFT, then set some freq bins to zero, and then back to inverse FFT. I know it will not give me good sounding results, but I just do it for testing, for fun. For example to see (hear) the difference between radix 2 FFT and mixed Radix.
For example if I set FFT buffer size equal to any standard sample rate, for example 44100Hz, then for making radix 2 FFT I need to cut some part of that buffer, or make it bigger and missed samples fill with zero padding. But with mixed radix FFT I can use all possible samples from 44100 so I can calculate perfectly each one freq bin, and it’s impressing for me, but it only looks nice on freq analyser graph. But I want to hear how zerro padding or cutting the buffer impact on the sound.
And finally what I want to do works, but I have problem with connecting each buffer block together. That’s why with FFT buff size 41000 I hear signal perfectly only in one second periods. But each buffer block not fit to the one before, so it’s disgusting. It happens also if I use mixed radix FFT which use and calculate each possible sample. Even I send to FFT simply sinusoid signal, there are some artifacts/cutting on the connections of each buffer.

Maybe somebody just could tell me something that would satisfy me with understading Fourier Transform that I could leave that topic and try to learn something new, like filters or other interesting things. My main purpose is just to learn programming and understanding audio process that I could send my CV to some nice job offer as a programmer in some around DSP industry because it’s something that I like very much.


#16

So there is a difference that has to be said, between designing a filter in the spectral domain and designing it int he time domain (FIR). The former DOESN’T work. Never.

Do you mean “implementing” rather than “designing?”


#17

Well, stop doing that, working in the spectrum domain directly is not something you can do if you don’t have the proper background. So read about signal processing, and then you will understand why setting bins to zeros just doesn’t work (otherwise, why would be struggle with filter design, really?)


#18

No, you can implement your filter in spectral domain with a convolution if you have an impulse that has m elements, you have a signal that has n, and if you do a Fourier transform with at least n + m -1, you don’t get aliasing. It was designed in time domain (because you have a finite impulse), but implemented in the spectral domain (which is perfectly fine as long as you know the time characteristics of your “impulse” or your spectral changes).


#19

“Stop doing that.” Thanks. Really, no ironic. Sometimes such simply words are good. But still maybe could yu tell me what is the purpose to use Fourier :slight_smile:


#20

Why? Sorry to say but that’s about the most useless thing you could have spent your time on. Implementing the FFT transform by itself is not going to be useful for any actual DSP work. There are already countless highly tuned implementations available that you could have just directly used. The FFT transform itself is not interesting at all, but rather what you can do with the data once you have it.