Polyphonic pitch recognition algorithm??

Started by
4 comments, last by Emergent 15 years, 8 months ago
I've been dying to find an algorithm to convert a PCM input to MIDI. MIDI format isn't as important as simply knowing what notes are played. Can anyone help me?? Google gave me a GNU project ( https://gna.org/projects/fanr ), and I might try extracting the algorithm itself from that. If there's already a mathy approach available, though, I'm all ears. Thanks in advance!
Advertisement
fourier transform
Pitch detection algorithm has a decent survey of techniques.
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Wowee! Fourier transforms seem to hit the spot perfectly! Thanks bunches! I'll also look into the PDA. I really appreciate ya'lls' help!
EDIT - revised the code

Alright, I think I'm getting closer.

I'm pulling heavily from: http://student.kuleuven.be/~m0216922/CG/fourier.html so forgive me if I'm making rookie mistakes here.

data[] is an int array, num is the length of data, hi and lofrequencies[] are double arrays containing acceptable frequency ranges for the 88 keys on a piano. The imaginary part of data[] is 0, so I simplified the code from the site a bit.

I only calculated the FTs for the frequencies I'm looking for (or at least, that's what I hope the code does), so that's what the for(w = ...) is all about.

	public double[] getAmps(int[] data, int num)	{		double[] FRe = new double[num];		double[] FIm = new double[num];		double[] FAmp = new double[num];		double[] ampReal = new double[88];		for(int j = 0; j < 88; j++)		{			ampReal[j] = 0;			for(int w = (int)(sampleRate/hifrequencies[j] - 0.5); w < (int)(sampleRate/lofrequencies[j] + 0.5); w++)			{				FRe[w] = 0;				FIm[w] = 0;				for(int x = 0; x < num; x++)				{					double a = -2 * Math.PI * w * x / (double)(num);					double ca = Math.cos(a);					double sa = Math.sin(a);					FRe[w] += data[x] * ca;					FIm[w] += data[x] * sa;				}				FRe[w] /= num;				FIm[w] /= num;				FAmp[w] = Math.sqrt(FRe[w] * FRe[w] + FIm[w] * FIm[w]);				ampReal[j] += FAmp[w];			}			ampReal[j] /= ((int)(sampleRate/lofrequencies[j] + 0.5) - (int)(sampleRate/hifrequencies[j] - 0.5));		}				return ampReal;	}


[Edited by - SirGorthon on August 15, 2008 3:08:20 PM]
I'm not sure that the Fourier Transform is the right tool for this job.

I think the Goertzel algorithm is probably the better choice:
http://en.wikipedia.org/wiki/Goertzel_algorithm

The real benefit to using the Fourier transform, when it is appropriate, is that you can do an O(nlogn) FFT instead of the naive O(n^2) algorithm. But if you only need a few frequencies -- say, 'm' of them -- then the naive method can again be faster because it's O(m n). So, clearly, there's a tradeoff. When m is small, the O(m n) algorithm is faster, but when it's large, you might as well increase m to n (i.e., compute all the frequencies), and then use the O(nlogn) algorithm.

The Goertzel algorithm improves slightly on the naive algorithm. It has the same asymptotic complexity, but has better constants.

This topic is closed to new replies.

Advertisement