Sign in to follow this  
SirGorthon

Polyphonic pitch recognition algorithm??

Recommended Posts

I've been dying to find an algorithm to convert a PCM input to MIDI. MIDI format isn't as important as simply knowing what notes are played. Can anyone help me?? Google gave me a GNU project ( https://gna.org/projects/fanr ), and I might try extracting the algorithm itself from that. If there's already a mathy approach available, though, I'm all ears. Thanks in advance!

Share this post


Link to post
Share on other sites
EDIT - revised the code

Alright, I think I'm getting closer.

I'm pulling heavily from: http://student.kuleuven.be/~m0216922/CG/fourier.html so forgive me if I'm making rookie mistakes here.

data[] is an int array, num is the length of data, hi and lofrequencies[] are double arrays containing acceptable frequency ranges for the 88 keys on a piano. The imaginary part of data[] is 0, so I simplified the code from the site a bit.

I only calculated the FTs for the frequencies I'm looking for (or at least, that's what I hope the code does), so that's what the for(w = ...) is all about.


public double[] getAmps(int[] data, int num)
{
double[] FRe = new double[num];
double[] FIm = new double[num];
double[] FAmp = new double[num];
double[] ampReal = new double[88];
for(int j = 0; j < 88; j++)
{
ampReal[j] = 0;
for(int w = (int)(sampleRate/hifrequencies[j] - 0.5); w < (int)(sampleRate/lofrequencies[j] + 0.5); w++)
{
FRe[w] = 0;
FIm[w] = 0;
for(int x = 0; x < num; x++)
{
double a = -2 * Math.PI * w * x / (double)(num);
double ca = Math.cos(a);
double sa = Math.sin(a);
FRe[w] += data[x] * ca;
FIm[w] += data[x] * sa;
}
FRe[w] /= num;
FIm[w] /= num;
FAmp[w] = Math.sqrt(FRe[w] * FRe[w] + FIm[w] * FIm[w]);
ampReal[j] += FAmp[w];
}
ampReal[j] /= ((int)(sampleRate/lofrequencies[j] + 0.5) - (int)(sampleRate/hifrequencies[j] - 0.5));
}

return ampReal;
}


[Edited by - SirGorthon on August 15, 2008 3:08:20 PM]

Share this post


Link to post
Share on other sites
I'm not sure that the Fourier Transform is the right tool for this job.

I think the Goertzel algorithm is probably the better choice:
http://en.wikipedia.org/wiki/Goertzel_algorithm

The real benefit to using the Fourier transform, when it is appropriate, is that you can do an O(nlogn) FFT instead of the naive O(n^2) algorithm. But if you only need a few frequencies -- say, 'm' of them -- then the naive method can again be faster because it's O(m n). So, clearly, there's a tradeoff. When m is small, the O(m n) algorithm is faster, but when it's large, you might as well increase m to n (i.e., compute all the frequencies), and then use the O(nlogn) algorithm.

The Goertzel algorithm improves slightly on the naive algorithm. It has the same asymptotic complexity, but has better constants.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this