Time based input to a NN

Started by
7 comments, last by orryx 16 years ago
Hi all, just for fun I wanted to try to train my NN (back propagation) to recognize musical chords based on a wave file or such. The problem is how do I input the signal to the NN? A sound wave is time based, and on top of that the signal could start at any amplitude. I'm really new to NNs: does the ordering of the input matter, or is it just the relationship between the inputs that matters? I.e., is ABC considered the same input as BCA? If it is, this would solve the initial amplitude problem. And I suppose I could buffer up a small sample of the sound and input it all at once to solve the time issue. Is this accurate? Thanks.
Quote:Just for fun I wanted to try to train my NN (back propagation) to recognize musical chords based on a wave file or such. The problem is how do I input the signal to the NN? A sound wave is time based, and on top of that the signal could start at any amplitude.
Pass it the signal in the frequency domain, with the amplitude normalized.

Google for Fourier transforms.
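A minimal sketch of that suggestion, assuming a short buffer of samples and a naive O(N^2) DFT written out by hand so it needs nothing beyond the standard library. The names `dft` and `normalized_magnitudes` and the 64-sample test tone are made up for illustration; they are not from any particular library.

```python
import cmath
import math

def dft(samples):
    """Naive O(N^2) discrete Fourier transform; returns one complex value per bin."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]

def normalized_magnitudes(samples):
    """Magnitudes of the non-negative-frequency bins, scaled so the largest is 1."""
    spectrum = dft(samples)
    mags = [abs(c) for c in spectrum[: len(samples) // 2 + 1]]
    peak = max(mags)
    return [m / peak for m in mags] if peak > 0 else mags

# Hypothetical test tone: 64 samples holding exactly 4 cycles of a quiet sine.
N = 64
tone = [0.3 * math.sin(2 * math.pi * 4 * t / N) for t in range(N)]

features = normalized_magnitudes(tone)      # 33 values that could feed the NN
loudest_bin = features.index(max(features))  # bin 4 for this tone
```

Because the features are divided by the largest magnitude, the same tone played louder or quieter gives the same input vector.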

Quote:I'm really new to NNs: does the ordering of the input matter, or is it just the relationship between the inputs that matters? I.e., is ABC considered the same input as BCA? If it is, this would solve the initial amplitude problem. And I suppose I could buffer up a small sample of the sound and input it all at once to solve the time issue. Is this accurate?
Can't say I understand what you're talking about...
I was just wondering if the wave was shifted and input, would the network still recognize it?

I looked at Fourier transforms. Confusing is one word I would use to describe them. I understand it translates the function from the time domain to the frequency domain, but I'm not sure how I can integrate from -infinity to +infinity on the computer. Is there a standard algorithm for Fourier transforms anywhere? Also would the fact that the signal is the sum of multiple sine waves mess up the transform?
The Fourier transform pretty much works because you can write any reasonably well-behaved periodic function as a sum of sines and cosines of various frequencies (in many cases an infinite sum).

Look for source code for a Discrete Fourier Transform (DFT) or Fast Fourier Transform (FFT).
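To give an idea of what such source code looks like, here is a small sketch of the textbook recursive radix-2 (Cooley-Tukey) FFT next to a naive DFT. It assumes the input length is a power of two; the function names and the 16-sample test signal are made up for illustration.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def dft(x):
    """Naive O(N^2) DFT, used here only to cross-check the FFT."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

signal = [
    math.sin(2 * math.pi * 3 * t / 16) + 0.5 * math.cos(2 * math.pi * 5 * t / 16)
    for t in range(16)
]
fast = fft(signal)
slow = dft(signal)
max_diff = max(abs(a - b) for a, b in zip(fast, slow))  # rounding error only
```

Both functions compute the same thing; the FFT just gets there in O(N log N) steps instead of O(N^2).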

Also, if you intend to use a neural net on waveforms to do frequency identification, you'll have to figure out how to judge whether the output is correct or not. You could train it against the output of an FFT, but if the FFT is already accurate you might as well just use the FFT instead of the NN. It may be an interesting experiment to see if a neural net CAN learn how to perform an FFT, though.
Quote:Original post by orryx
I was just wondering if the wave was shifted and input, would the network still recognize it?


The output of a Fourier transform contains two components (a.k.a. real/imaginary or cos/sin components) for each frequency bin. When the signal is shifted in time, those components rotate accordingly: the phase of each bin changes. I believe that if you treat the two components as a 2D vector, its length (the amplitude of that frequency bin) stays the same.
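A quick way to check that belief is the sketch below. It assumes a circular shift of a signal that fits the block exactly, which is the case where the magnitudes match to rounding error. The two-tone test signal and the shift of 5 samples are arbitrary choices for illustration.

```python
import cmath
import math

def dft(samples):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]

N = 32
original = [
    math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 7 * t / N)
    for t in range(N)
]
shift = 5
shifted = original[shift:] + original[:shift]  # circular shift by 5 samples

spec_a = dft(original)
spec_b = dft(shifted)

# Length of each (real, imaginary) pair treated as a 2D vector.
mags_a = [abs(c) for c in spec_a]
mags_b = [abs(c) for c in spec_b]
max_mag_diff = max(abs(a - b) for a, b in zip(mags_a, mags_b))

# The angle of the vector (the phase) is what moves when the signal shifts.
phase_a = cmath.phase(spec_a[3])
phase_b = cmath.phase(spec_b[3])
```

With real audio, where a buffer slides along a recording instead of wrapping around, the magnitudes are only approximately unchanged, but they are still far more stable than the raw samples.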

I have no idea if an NN could learn this process.
I'd personally avoid a Fourier Transform and use a wavelet decomposition... but that's just me and not necessarily what I recommend for anyone else... it really depends on how comfortable you are with complex mathematics and fast numerical computing techniques.
Quote:I was just wondering if the wave was shifted and input, would the network still recognize it?
Thing is, you don't even need a neural network for this, just classical math.

Say you want to recognize a particular signal that may occur at any time during a five-minute recording (this is the shift) and is mixed in with other signals.

You'd want a finite impulse response (FIR) filter for this, IIRC (I'm not a telco engineer, sorry); specifically, a matched filter. The filter output is the linear convolution of the five-minute recording with a time-reversed copy of the signal you're looking for, which amounts to cross-correlating the two. The reason you take the Fourier transform is that convolution in the time domain is equivalent to multiplication in the frequency domain, which is fast. Then you take the inverse transform of the product and voilà, you have a signal whose amplitude at each offset measures how similar the recording is to your filter signal at that point. You threshold this amplitude to decide whether your signal is strong enough to be recognizable or not.

I've glossed over a few things (linear convolution is not circular convolution, boundary conditions must be handled, the filter signal should be carefully built, etc.).
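A rough sketch of that recipe, under several assumptions: the recording is short, the template is a Barker-13 code (picked only because it correlates sharply with itself), and the FFT is a hand-rolled radix-2 one. Zero-padding handles the linear-versus-circular issue mentioned above. All names here are illustrative.

```python
import cmath
import math

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); inverse=True flips the exponent sign."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def cross_correlate(recording, template):
    """Linear cross-correlation computed in the frequency domain."""
    # Pad to a power of two that is long enough to avoid circular wrap-around.
    size = 1
    while size < len(recording) + len(template) - 1:
        size *= 2
    rec = fft(list(recording) + [0.0] * (size - len(recording)))
    tpl = fft(list(template) + [0.0] * (size - len(template)))
    # Multiplying by the conjugate correlates (convolves with the reversed template).
    product = [r * t.conjugate() for r, t in zip(rec, tpl)]
    result = fft(product, inverse=True)
    return [c.real / size for c in result[: len(recording)]]

# Hypothetical data: a Barker-13 template buried at offset 40 in a weak hum.
template = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
offset = 40
recording = [0.1 * math.sin(2 * math.pi * 0.05 * t) for t in range(100)]
for i, v in enumerate(template):
    recording[offset + i] += v

scores = cross_correlate(recording, template)
threshold = 0.5 * sum(v * v for v in template)  # half the template's energy
detections = [lag for lag, s in enumerate(scores) if s > threshold]
```

The threshold of half the template energy is an arbitrary choice for this toy case; in practice it would be tuned to the noise level.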

Quote:Confusing is one word I would use to describe them. I understand it translates the function from the time domain to the frequency domain, but I'm not sure how I can integrate from -infinity to +infinity on the computer. Is there a standard algorithm for Fourier transforms anywhere? Also would the fact that the signal is the sum of multiple sine waves mess up the transform?
The Fourier transform's domain is the real line, hence the -infinity to +infinity. Obviously you can't do that on a computer, and you're dealing with finite signals anyway. The discrete transform treats your finite block as one period of a periodic signal, so you generally make the block's ends line up by tapering it toward zero at the edges, i.e. multiplying by a window function (a Gaussian, if memory serves). This means that your signal is 'infinite' conceptually, but outside the interval of interest it is (asymptotically) zero, so you don't care.
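A small sketch of what the tapering buys you. Note the swap: the window here is a Hann window, a common alternative to the Gaussian mentioned above, chosen only because its formula is one line. The test tone deliberately has 4.5 cycles in the block so its ends do not line up; `FAR_BIN` is an arbitrary bin well away from the tone.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of the non-negative-frequency bins of a naive DFT."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]

N = 64
# 4.5 cycles per block: the start and end of the block do not match up.
tone = [math.sin(2 * math.pi * 4.5 * t / N) for t in range(N)]

# Hann window: rises smoothly from 0, peaks mid-block, falls back toward 0.
hann = [0.5 - 0.5 * math.cos(2 * math.pi * t / N) for t in range(N)]
windowed = [x * w for x, w in zip(tone, hann)]

raw_mags = dft_magnitudes(tone)
win_mags = dft_magnitudes(windowed)

# Energy that "leaks" into a bin far from the tone, relative to the peak.
FAR_BIN = 20
leak_rect = raw_mags[FAR_BIN] / max(raw_mags)
leak_hann = win_mags[FAR_BIN] / max(win_mags)
```

The price of the window is a slightly wider main peak, which is the usual trade-off between leakage and frequency resolution.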

The sum of multiple sine waves does NOT mess up the transform (ideally); what messes it up is the boundary conditions and their treatment, and doing this correctly is tricky. An ideal, infinite signal of three sine waves added together would yield three spikes in the frequency domain (each mirrored at the corresponding negative frequency).
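The three-sine case is easy to try on a finite block, as long as each sine fits a whole number of cycles into the block so the boundaries cause no trouble. The frequencies, amplitudes and threshold below are arbitrary picks for illustration.

```python
import cmath
import math

def dft(samples):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]

N = 64
# Hypothetical mix: cycles per block -> amplitude.
components = {3: 1.0, 7: 0.5, 12: 0.25}
signal = [
    sum(a * math.sin(2 * math.pi * f * t / N) for f, a in components.items())
    for t in range(N)
]

mags = [abs(c) for c in dft(signal)]

THRESHOLD = 1.0  # anything above rounding noise
peaks = [k for k in range(N // 2 + 1) if mags[k] > THRESHOLD]
mirrors = [k for k in range(N // 2 + 1, N) if mags[k] > THRESHOLD]
```

Here `peaks` holds the three input frequencies and `mirrors` holds their images in the upper half of the output, which is where the negative frequencies end up for a real-valued input.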
I found an FFT algorithm (http://www.yov408.com/html/codespot.php?gg=36).

It requires both a real and imaginary input and produces a complex output. To be clear:

The real input corresponds to an array of amplitudes of the sample.

Imaginary input = ?

The real output corresponds to how well the given frequency matches the sample (i.e., output[5] = how well 5 Hz matches the signal)

Imaginary output = ?
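For what it's worth, here is a sketch of how those two question marks are usually filled in. For an ordinary real-valued recording the imaginary input is all zeros. The real and imaginary outputs together describe each bin: the match strength is the length of the (real, imaginary) pair, and bin k corresponds to k * sample_rate / N Hz, not to k Hz. The function `dft_real_imag` is a stand-in written for this example, not the linked code, and the 800 Hz sample rate is an assumption.

```python
import math

def dft_real_imag(real_in, imag_in):
    """Naive DFT on separate real/imaginary arrays; returns (real_out, imag_out)."""
    n = len(real_in)
    real_out, imag_out = [], []
    for k in range(n):
        re = im = 0.0
        for t in range(n):
            angle = 2 * math.pi * k * t / n
            # (a + bi) * (cos(angle) - i*sin(angle))
            re += real_in[t] * math.cos(angle) + imag_in[t] * math.sin(angle)
            im += imag_in[t] * math.cos(angle) - real_in[t] * math.sin(angle)
        real_out.append(re)
        imag_out.append(im)
    return real_out, imag_out

SAMPLE_RATE = 800  # Hz, assumed for this example
N = 64
real_in = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(N)]  # 100 Hz tone
imag_in = [0.0] * N  # a real-valued recording has no imaginary part

real_out, imag_out = dft_real_imag(real_in, imag_in)

# Strength of each frequency: length of the (real, imaginary) vector.
magnitude = [math.hypot(re, im) for re, im in zip(real_out, imag_out)]

half = magnitude[: N // 2 + 1]
peak_bin = half.index(max(half))
peak_hz = peak_bin * SAMPLE_RATE / N  # convert bin index to Hz
```

For this sine tone the real output at the peak bin is essentially zero and all of the strength sits in the imaginary output, which is why looking at the real output alone would miss it.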

