Working with FFT Results.

Started by
17 comments, last by Zahlman 17 years, 4 months ago
Hey all, I'm currently doing a little bit of DSP work (very new to it!) and am working on a small project that colorizes a waveform by the frequency content of discrete intervals. To do this I am running an FFT on my sample sets to create a frequency spectrum. The problem I've now come to is analysing this spectrum. I want a way to pick out either a specific peak average frequency, or the peak frequency band. Any ideas?
Ollie"It is better to ask some of the questions than to know all the answers." ~ James Thurber[ mdxinfo | An iridescent tentacle | Game design patterns ]
It's been a while since I took my signal processing course. I do remember that I found that the best way to get a handle on Fourier transforms was to prototype my analyses in Matlab. The ability to quickly generate a visualization of the Fourier result really helped me get my brain around the meaning of the frequency domain, and how to get useful information out of the data.
Well, I don't know Matlab (should learn that one day), but I am outputting my results into a CSV file, and using Excel to draw graphs. Here is my input:


And here is the FFT output:


Notice all the bunching in the bass frequencies... I want some type of numerical way to represent this bunching, if possible.
Ollie"It is better to ask some of the questions than to know all the answers." ~ James Thurber[ mdxinfo | An iridescent tentacle | Game design patterns ]
I'm not sure I understand what you mean by 'colourise the waveform'. Do you know how you'd like to characterise the spectrum? As I see it, there are a few salient values, but they don't go hand-in-hand.

If you'd like an idea of the spectral range of the FT, you'd want to create an envelope around the log-spectrum and represent it as a distribution, probably as a generalised normal bell curve. You can then easily determine the mean and variance of the spectrum.
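As a rough illustration of "mean and variance of the spectrum", here is a minimal Python sketch. It is not the envelope fit described above; it simply treats the amplitudes as weights and takes the weighted mean and variance of the bin frequencies. The function name and the assumption that `magnitudes` holds bins 0..N/2 of an even-length real FFT are mine, not from the thread.

```python
def spectral_mean_and_variance(magnitudes, sample_rate):
    """Amplitude-weighted mean frequency (Hz) and variance (Hz^2).

    magnitudes:  amplitudes for bins 0..N/2 of a real-input FFT (assumed layout).
    sample_rate: samples per second of the original signal.
    """
    fft_size = 2 * (len(magnitudes) - 1)   # assumes an even FFT length N
    bin_width = sample_rate / fft_size     # Hz between neighbouring bins
    total = sum(magnitudes)
    if total == 0:
        return 0.0, 0.0                    # silent block: nothing to weight
    mean = sum(k * bin_width * m for k, m in enumerate(magnitudes)) / total
    variance = sum((k * bin_width - mean) ** 2 * m
                   for k, m in enumerate(magnitudes)) / total
    return mean, variance
```

A low mean with a small variance would correspond to the kind of bass "bunching" seen in the plot; working on log-frequency instead of linear frequency is a straightforward variation.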

The results of this will give you no indication as to the key/tone of the sample; note detection requires an entirely separate approach. So you'd need to use logarithms and moduli to determine chords or formants.

Analysis of the dynamic range is very different again, but it still has an impact on one's intuitive idea of 'peak frequency'.

Which of these and what else, if anything, do you consider relevant? Perhaps you could say how you'd like to summarise the example image, so we can be more helpful.

Regards
Admiral
Ring3 Circus - Diary of a programmer, journal of a hacker.
when you do a real to complex FFT, you end up with the positive frequencies ( which is what you want ). the 0 frequency ( DC bias ) and the nyquist frequency do not have imaginary components so they are often packed into the first 2 values in your array. check the documentation of your fft to be sure. in the unpacked layout the nyquist frequency is placed at the end of the array, past the last frequency, and given a 0 imaginary component, and the imaginary component for the 0 frequency is zeroed.

in the frequency domain the 2 values for each frequency are the real and imaginary fourier coefficients. the amplitude for each frequency is given by:

amp = sqrt( real * real + imag * imag )

and the phase ( which is probably not important to you ) is :

atan2( imag, real );
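
the two formulas above can be sketched in Python as follows. this is only an illustration: the function name is made up here, and `real` / `imag` stand for one pair of coefficients taken from whatever fft library is in use.

```python
import math

def amplitude_and_phase(real, imag):
    """Convert one pair of Fourier coefficients to (amplitude, phase in radians)."""
    amplitude = math.hypot(real, imag)   # same as sqrt(real*real + imag*imag)
    phase = math.atan2(imag, real)       # angle in the range -pi..pi
    return amplitude, phase

amp, phase = amplitude_and_phase(3.0, 4.0)
print(amp)   # 5.0
```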

i am assuming you are running a sliding window fft: using a certain number of samples, getting the spectrum, and then moving up some samples and getting another spectrum. one thing you have to be careful of is to get enough samples in your fft to get an accurate measure of your frequency content.
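
a minimal sketch of that sliding window idea, in Python. to keep it self-contained it uses a naive O(N^2) DFT written out by hand; a real program would call its fft library instead, and would usually apply a window function to each block first. the names `dft_magnitudes`, `sliding_spectra`, `window_size` and `hop` are illustrative assumptions.

```python
import cmath
import math

def dft_magnitudes(block):
    """Naive O(N^2) DFT of a real block; returns amplitudes for bins 0..N/2."""
    n = len(block)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(block))
        mags.append(abs(acc))
    return mags

def sliding_spectra(samples, window_size, hop):
    """Yield (start_index, magnitudes) for every full window of the signal."""
    for start in range(0, len(samples) - window_size + 1, hop):
        yield start, dft_magnitudes(samples[start:start + window_size])

# Usage: a sine with 4 cycles per 32 samples, analysed in 32-sample windows.
signal = [math.sin(2 * math.pi * 4 * i / 32) for i in range(64)]
for start, mags in sliding_spectra(signal, window_size=32, hop=16):
    peak_bin = max(range(len(mags)), key=mags.__getitem__)
    print(start, peak_bin)   # the peak lands in bin 4 for each window
```

a smaller `hop` gives more spectra along the time axis, while a larger `window_size` gives finer frequency resolution per spectrum.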

the number of samples you use in your fft determines what the frequency is for each pair of fourier coefficients.

this is probably too much info, but it might give you some clues to what is involved in working in the frequency domain.

Thanks for the input spurious_interupt, but I know most of what you've told me (amplitude is defined by the modulus of the complex number). I think the Nyquist frequency is packed into the 0th element of the array.


TheAdmiral, sorry to be unclear. What I'm trying to achieve is the following. The user can load a waveform in, and as I render it I take chunks of this waveform (say, 10000 samples) and analyze the frequency information, colouring the wave accordingly (like a thermometer: high is hot, mid is warm, bass is cold). As far as I know, FFT is the best way of doing this. I understand it's probably not going to be as simple as that, because in real music you've got a LOT going on at once, but I'm curious to see how this works out.
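
To make the thermometer idea concrete, here is one possible sketch in Python. It sums the energy in three bands and blends a cold, a warm and a hot colour by each band's share. Everything specific here is an assumption for illustration: the band edges (250 Hz and 4000 Hz), the RGB palette, the function names, and the input being amplitudes for bins 0..N/2 of an even-length FFT.

```python
def band_energies(magnitudes, sample_rate, low_edge=250.0, high_edge=4000.0):
    """Sum squared amplitude in (bass, mid, high) bands; band edges are assumed."""
    fft_size = 2 * (len(magnitudes) - 1)    # assumes an even FFT length
    bin_width = sample_rate / fft_size
    bass = mid = high = 0.0
    for k, m in enumerate(magnitudes):
        if k == 0:
            continue                        # skip the DC bin
        freq = k * bin_width
        energy = m * m
        if freq < low_edge:
            bass += energy
        elif freq < high_edge:
            mid += energy
        else:
            high += energy
    return bass, mid, high

def thermometer_colour(bass, mid, high):
    """Blend blue (cold), orange (warm) and red (hot) by band energy share."""
    total = bass + mid + high
    if total == 0:
        return (0, 0, 0)                    # silent chunk: draw it black
    palette = ((0, 0, 255), (255, 165, 0), (255, 0, 0))
    weights = (bass / total, mid / total, high / total)
    return tuple(round(sum(w * c[i] for w, c in zip(weights, palette)))
                 for i in range(3))
```

Because bass usually carries far more energy than treble in music, some per-band scaling would probably be needed before the colours look balanced.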

I think I want what you talk about with log-spectrum and such, but that was all too quick for me... Mind expanding on that?


Thanks for the input so far all!
Ollie"It is better to ask some of the questions than to know all the answers." ~ James Thurber[ mdxinfo | An iridescent tentacle | Game design patterns ]
for music, certain frequencies can be played for a very short time. if you do an fft of 10000 samples, there will be very little localization in time. you can use the wavelet transform if you want to localize frequency better.
Spurious_interupt, yea I figured that, but my fft library (exocortex for .NET) seems to create the frequency scale from the number of samples you feed it. I.e., give it 500 samples and the maximum frequency is 500 Hz. And those short notes are quite often the high ones... I'll have a look at that wavelet transform you speak of.
Ollie"It is better to ask some of the questions than to know all the answers." ~ James Thurber[ mdxinfo | An iridescent tentacle | Game design patterns ]
the wavelet transform is not really simple, it's not really one transform, and it can be done many ways.

if you have 1 second of data and have 500 samples ( 2 ms sample interval, i.e. a 500 hz sample rate ), nyquist is 250 hz and your delta frequency is 1 hz. if you keep the same sample rate and feed it 2 seconds of data you still have 0 - 250 hz but now your delta frequency is 0.5 hz. so more time gives you finer frequency resolution. for 500 real data samples input the fft will give you 251 complex samples.

so the frequency range you can see is determined by your sample rate, not the number of samples.
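
the arithmetic above can be written out as a small Python sketch. the function names are made up for illustration; the formulas are the standard ones for a real-input fft of `num_samples` points.

```python
def fft_bin_layout(num_samples, sample_rate):
    """Return (bin_spacing_hz, nyquist_hz, num_bins) for a real-input FFT."""
    bin_spacing = sample_rate / num_samples   # delta frequency between bins
    nyquist = sample_rate / 2                 # highest representable frequency
    num_bins = num_samples // 2 + 1           # bins 0..N/2 inclusive
    return bin_spacing, nyquist, num_bins

def bin_frequency(k, num_samples, sample_rate):
    """Centre frequency in Hz of bin k."""
    return k * sample_rate / num_samples

print(fft_bin_layout(500, 500))    # (1.0, 250.0, 251)  1 second of data
print(fft_bin_layout(1000, 500))   # (0.5, 250.0, 501)  2 seconds of data
```

doubling the number of samples halves the bin spacing but leaves the nyquist frequency where it was.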
Right,

I did some reading into wavelet transforms, and yea - they are complex. One site summarised it nicely, that FTs give frequency information only, with no time information. I don't really need time information, so -should- be cool with just fourier transformations.

In regards to the frequency content, here's my assumption. I store my data in an array, let's say of length 512. Then I feed this array into my FFT function and it transforms each value. I just assumed that, because it's a transformation like this, it simply went up to 512 Hz. However, seeing as the data is sampled at 44 kHz, does that mean each value in my array is a step of 44000/512?

Really appreciate your help so far spurious_interrupt, so I gave you a lovely R++ :)
Ollie"It is better to ask some of the questions than to know all the answers." ~ James Thurber[ mdxinfo | An iridescent tentacle | Game design patterns ]

This topic is closed to new replies.
