Implementing Pitch Alignment / Mapping like Pitchmap / Chroma

Hola peeps, for the last week I’ve been trying to break down how plugins like Zynaptiq’s Pitchmap and Xynth’s Chroma work.

While researching this for a pitch shifter, I managed to write my own phase vocoder in MATLAB, but I’ve kinda got stuck in the rabbit hole of pitch alignment/mapping/whatever. The part I’m stuck on is splitting up my input and shifting only the relevant harmonics / frequencies of the signal without being too inefficient.

I’ve looked into using just an FFT and my pitch shifter, but I’m not sure my (theoretical) method would work. Someone suggested using a Constant Q Transform for this, but I’m not sure how to implement it or how it would affect resolution/quality and efficiency. I’m hoping someone with a bit more DSP know-how can help me break down this problem and understand the best way to implement it. Thanks in advance!

Below is a little flowchart of my current idea for this implementation :)

Chroma can’t be using an FFT for its processing, because it has a zero-latency mode and even the HQ mode has low latency. Pitchmap, however, is surely a spectral process: you can clearly hear the pre-ringing. What I know about Chroma is that it uses notch filters and then pitch shifters somewhere in the process, and I suspect there are resonators involved when the Color parameter is > 50%. Also, it is based on some Snap Heap patch, so you might be lucky asking around in the Kilohearts (khs) community Discord for details.

I’ll have to have a look around those Discords then, thanks for the little hint!

Hello, I hope you are doing well.
I am also trying to replicate it with pfft~ (in Max/MSP) and I am struggling with the shifting part.
I don’t know if it’s better to do it with pitch shifting or frequency shifting (I don’t have much experience with FFT pitch shifting). I tried frequency shifting but it’s not working well.
The first part, letting bins pass through if they are in key, is working though.
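
For anyone following along, here is a rough sketch of that "in key" bin gate in Python/NumPy rather than Max. It is only one possible way to do it: the function name, the 50-cent tolerance and the A4 = 440 Hz tuning are all my own assumptions, not anything taken from Pitchmap or Chroma.

```python
import numpy as np

def in_key_bin_mask(n_fft, sample_rate, scale_pitch_classes,
                    tolerance_cents=50.0, a4_hz=440.0):
    """Boolean mask over rfft bins: True where the bin centre frequency
    lies within tolerance_cents of a note belonging to the scale.
    (Hypothetical helper; names and defaults are assumptions.)"""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    mask = np.zeros(freqs.shape, dtype=bool)
    valid = freqs > 0                      # skip DC, log2(0) is undefined
    # Fractional MIDI note number of every bin centre
    midi = 69.0 + 12.0 * np.log2(freqs[valid] / a4_hz)
    nearest = np.round(midi)
    cents_off = 100.0 * np.abs(midi - nearest)
    pitch_class = nearest.astype(int) % 12
    in_scale = np.isin(pitch_class, list(scale_pitch_classes))
    mask[valid] = in_scale & (cents_off <= tolerance_cents)
    return mask

# C major as pitch classes (C = 0)
C_MAJOR = (0, 2, 4, 5, 7, 9, 11)
mask = in_key_bin_mask(4096, 44100, C_MAJOR)
```

Per frame you would multiply the spectrum by `mask` to keep the in-key bins and send the rest to whatever shifter you use. One caveat: FFT bins are linearly spaced (about 10.8 Hz apart at these settings), so in the low end a single bin is wider than a semitone and the mask gets very coarse there.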

Please keep me updated if you find anything that works!

I haven’t looked into this in a while, but looking back at it now, I think that splitting the signal up with a series of filters (low-pass or band-pass) to separate frequency ranges, and then shifting each range individually, could work well. The resolution would be determined by the bandwidth of these filters (or by the difference in cutoff frequency if you use low-pass filters). This is somewhat like the Constant Q Transform thing I mentioned originally.
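
To make the resolution point a bit more concrete, here is a small sketch of how the band layout could be worked out if the bands are spaced like a Constant Q Transform (same number of bands in every octave). The function name, the 55 Hz to 7040 Hz range and the 12 bands per octave are just assumptions for illustration.

```python
import numpy as np

def constant_q_bands(f_min=55.0, f_max=7040.0, bands_per_octave=12):
    """Geometrically spaced band centres with a constant Q.
    (Hypothetical helper; the defaults are assumptions.)"""
    octaves = np.log2(f_max / f_min)
    # Small epsilon guards against floating-point rounding just below an integer
    n_bands = int(np.floor(bands_per_octave * octaves + 1e-9)) + 1
    k = np.arange(n_bands)
    centres = f_min * 2.0 ** (k / bands_per_octave)
    # Q = centre / bandwidth, where the bandwidth is the gap to the next centre
    q = 1.0 / (2.0 ** (1.0 / bands_per_octave) - 1.0)
    bandwidths = centres / q
    return centres, bandwidths, q

centres, bandwidths, q = constant_q_bands()
```

With one band per semitone, Q comes out at roughly 17, and the bandwidth grows with frequency (about 26 Hz around 440 Hz). Going to 24 or 36 bands per octave gives finer pitch resolution, at the cost of more filters and longer filter ringing.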

I’m not sure how practical implementing that many filters would be in Max, but it sounds like a much more approachable way to do things since it would have much less latency (no FFT). You could just pitch shift the filtered signals using whatever method you want, and it might work pretty well…
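
Here is a minimal sketch of the "decide how far to shift each band" step, assuming each band is snapped to the nearest in-scale note and that ties go to the lower note. The function name, the tie-breaking rule and the A4 = 440 Hz tuning are my own assumptions; the actual filtering and pitch shifting are left out.

```python
import numpy as np

C_MAJOR = (0, 2, 4, 5, 7, 9, 11)  # pitch classes, C = 0

def snap_band_to_scale(band_centre_hz, scale_pitch_classes=C_MAJOR, a4_hz=440.0):
    """Return (target_midi_note, pitch_ratio) for one band.
    (Hypothetical helper; names and defaults are assumptions.)"""
    midi = int(round(69 + 12 * np.log2(band_centre_hz / a4_hz)))
    # Search outwards from the band's own note, lower neighbour first
    for offset in (0, -1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6, 6):
        target = midi + offset
        if target % 12 in scale_pitch_classes:
            target_hz = a4_hz * 2.0 ** ((target - 69) / 12.0)
            # Multiplicative ratio to feed into a pitch shifter
            return target, target_hz / band_centre_hz
    raise ValueError("scale_pitch_classes contains no valid pitch class")

c_sharp_4 = 440.0 * 2.0 ** ((61 - 69) / 12.0)   # about 277.18 Hz
note, ratio = snap_band_to_scale(c_sharp_4)      # snaps down to C4 (MIDI 60)
```

Bands whose ratio is 1.0 are already in key and can pass through untouched; the others go through the pitch shifter with their own ratio before everything is summed back together.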

If you manage to make anything cool, make sure to share it!