orryx

Time based input to a NN


Hi all, Just for fun I wanted to try to train my NN (back propagation) to recognize musical chords based on a wave file or such. The problem is how do I input the signal to the NN? Considering a sound wave is time based, and not to mention the signal could start at any amplitude. I'm really new to NNs, does the ordering of the input matter or is it just the relationship between the inputs that matters? IE: Is ABC considered the same input as BCA? If it is, this would solve the initial amplitude problem. And I suppose I could buffer up a small sample of the sound and input it all at once to solve the time issue. Is this accurate? Thanks.

Quote:
Just for fun I wanted to try to train my NN (back propagation) to recognize musical chords based on a wave file or such. The problem is how do I input the signal to the NN? Considering a sound wave is time based, and not to mention the signal could start at any amplitude.
Pass it the signal in the frequency domain, with the amplitude normalized.

Google for Fourier transforms.
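As a rough illustration of "frequency domain, normalized amplitude", here is a minimal sketch in plain Python. The function name `magnitude_spectrum`, the 64-sample buffer, and the test tones are assumptions made up for this example, and the transform is a naive O(N^2) DFT rather than a real FFT. It computes the magnitude of each frequency bin and scales the result so the largest bin is 1, which makes a quiet and a loud version of the same tone look identical to the network.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2, scaled so the largest is 1."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    peak = max(mags)
    # Avoid dividing by zero for an all-silent buffer.
    return [m / peak for m in mags] if peak > 0 else mags

# Hypothetical test tones: 8 cycles across a 64-sample buffer, two loudness levels.
quiet = [0.1 * math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
loud = [5.0 * math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]

features_quiet = magnitude_spectrum(quiet)
features_loud = magnitude_spectrum(loud)
```

The two feature lists should agree to within rounding error, so the network never sees the overall volume, only the relative strength of each frequency.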

Quote:
I'm really new to NNs, does the ordering of the input matter or is it just the relationship between the inputs that matters? IE: Is ABC considered the same input as BCA? If it is, this would solve the initial amplitude problem. And I suppose I could buffer up a small sample of the sound and input it all at once to solve the time issue. Is this accurate?
Can't say I understand what you're talking about...

I was just wondering if the wave was shifted and input, would the network still recognize it?

I looked at Fourier transforms. Confusing, is one word I would use to describe them. I understand it translates the function from a time domain to a frequency domain, but I'm not sure how I can integrate from -infinity to +infinity on the computer. Is there a standard algorithm for fourier transforms anywhere? Also would the fact that the signal is the sum of multiple sine waves mess up the transform?

The Fourier transform pretty much works because you can write practically any well-behaved periodic function as a sum of sines and cosines of various frequencies (in many cases an infinite sum).
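To make the "sum of sines" idea concrete, here is a small sketch using the textbook Fourier series of a square wave with period 1, which needs only odd sine harmonics. The function name `square_wave_partial_sum` is made up for this example. With more terms, the partial sum gets closer to the flat +1 / -1 levels of the square wave.

```python
import math

def square_wave_partial_sum(t, n_terms):
    """Sum the first n_terms odd harmonics of a unit square wave (period 1)."""
    total = 0.0
    for i in range(n_terms):
        k = 2 * i + 1  # only odd harmonics appear: 1, 3, 5, ...
        total += math.sin(2 * math.pi * k * t) / k
    return (4 / math.pi) * total

# Value in the middle of the positive half-cycle, where the square wave is +1.
one_term = square_wave_partial_sum(0.25, 1)
many_terms = square_wave_partial_sum(0.25, 1000)
```

A single term overshoots the target (it is just a scaled sine), while a thousand terms land very close to 1. This is the "infinite sum" case: a square wave is only reproduced exactly in the limit.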

Look for source code for a Discrete Fourier Transform (DFT) or Fast Fourier Transform (FFT).
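For reference, both fit in a few lines. The sketch below is not any particular library's code: `dft` is the direct O(N^2) definition, and `fft` is a plain recursive radix-2 Cooley-Tukey version that assumes the input length is a power of two. Both function names are chosen for this example.

```python
import cmath

def dft(x):
    """Direct definition: X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/N)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # transform of the even-indexed samples
    odd = fft(x[1::2])    # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

signal = [1, 2, 3, 4, 0, -1, -2, -3]
slow = dft(signal)
fast = fft(signal)
```

Both give the same numbers up to rounding; the FFT just gets there in O(N log N) steps, which matters once the buffers hold thousands of samples.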

Also, if you intend to use a neural net on waveforms to do frequency identification, you'll have to figure out how to judge whether the output is correct or not. You could train it against FFT output, but since the FFT is already accurate you might as well just use the FFT instead of the NN. It may be an interesting experiment to see if a neural net CAN learn how to perform an FFT, though.

Quote:
Original post by orryx
I was just wondering if the wave was shifted and input, would the network still recognize it?


The output of a Fourier transform contains two components (AKA real/imaginary or cos/sin components) for each frequency bin. When the signal is shifted in time, the components of each bin will "shift" accordingly. I believe that if you treat those components like a 2D vector, the shift rotates the vector but its "length" (the amplitude of that frequency bin) will stay the same.
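This is easy to check numerically. The sketch below uses a made-up 32-sample test signal and a naive DFT, shifts the signal circularly by 5 samples, and compares the two spectra. One caveat: the magnitudes match exactly only for a circular shift, i.e. when the buffer holds whole periods of the signal. For a window sliding over a real recording, they would only match approximately.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

n = 32
wave = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.cos(2 * math.pi * 7 * t / n)
        for t in range(n)]
shifted = wave[5:] + wave[:5]   # same wave, started 5 samples later (circular)

spec_a = dft(wave)
spec_b = dft(shifted)

# "Length" of each (real, imaginary) pair.
mags_a = [abs(c) for c in spec_a]
mags_b = [abs(c) for c in spec_b]
largest_magnitude_change = max(abs(a - b) for a, b in zip(mags_a, mags_b))
```

The raw real/imaginary pairs differ between the two spectra (the shift rotates them), but the magnitudes agree to within rounding error. So feeding the network magnitudes instead of raw samples sidesteps the "where does the wave start" problem.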

I have no idea if an NN could learn this process.

I'd personally avoid a Fourier Transform and use a wavelet decomposition... but that's just me and not necessarily what I recommend for anyone else... it really depends on how comfortable you are with complex mathematics and fast numerical computing techniques.

Quote:
I was just wondering if the wave was shifted and input, would the network still recognize it?
Thing is, you don't even need a neural network for this, just classical math.

Say you want to recognize a particular signal that may occur at any time during a five-minute recording (this is the shift) and is mixed in with other signals.

You'd want a finite impulse response (FIR) filter for this, IIRC (I'm not a telco engineer, sorry); this particular kind is usually called a matched filter. The filter output is the linear convolution of the five-minute recording with the time-reversed signal you're looking for, which is the same thing as cross-correlating the two. The reason you take the Fourier transform is that convolution in the time domain is equivalent to multiplication in the frequency domain, which is fast with an FFT. Then you take the inverse transform of the product and voilà, you have a signal whose amplitude measures the similarity of the recording to your filter signal at each time offset. You threshold this amplitude to decide whether your signal is strong enough to be recognizable or not.

I've glossed over a few things (linear convolution is not circular convolution, boundary conditions must be handled, the filter signal should be carefully built, etc.).
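A minimal sketch of that recipe, under several assumptions: the function names `dft` and `find_template`, the toy template, and the background signal are all invented for this example. A naive DFT stands in for a real FFT, and both signals are zero-padded so that the circular result equals the linear one. Multiplying by the complex conjugate of the template's spectrum is the frequency-domain way of convolving with the time-reversed template.

```python
import cmath

def dft(x, inverse=False):
    """Naive forward or inverse DFT (the inverse divides by N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def find_template(recording, template):
    """Return (offset, score) of the best match of template inside recording."""
    n = len(recording) + len(template) - 1      # pad so nothing wraps around
    rec = list(recording) + [0.0] * (n - len(recording))
    tmp = list(template) + [0.0] * (n - len(template))
    rec_f, tmp_f = dft(rec), dft(tmp)
    # Multiplication in the frequency domain, then back to the time domain.
    corr = dft([r * t.conjugate() for r, t in zip(rec_f, tmp_f)], inverse=True)
    # Keep only offsets where the template fits entirely inside the recording.
    scores = [c.real for c in corr[:len(recording) - len(template) + 1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

template = [1.0, -1.0, 2.0, -2.0, 1.0]
recording = [0.2 * ((t % 3) - 1) for t in range(24)]   # low-level background
for i, value in enumerate(template):
    recording[7 + i] += value                          # bury the template at offset 7

offset, score = find_template(recording, template)
```

The returned score is the amplitude you would threshold. Here it peaks at the offset where the template was buried, and it stays well below that peak everywhere else.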

Quote:
Confusing, is one word I would use to describe them. I understand it translates the function from a time domain to a frequency domain, but I'm not sure how I can integrate from -infinity to +infinity on the computer. Is there a standard algorithm for fourier transforms anywhere? Also would the fact that the signal is the sum of multiple sine waves mess up the transform?
The Fourier transform's domain is the real line, hence the -infinity to +infinity. Obviously you can't do that on a computer, and you're dealing with finite signals anyway. The discrete transform implicitly treats your finite block as one period of a periodic signal, so you generally smooth over the mismatch at the block edges by multiplying by a window function that tapers to zero (a Gaussian is one option, if memory serves). This means that your signal is 'infinite' conceptually, but outside the interval of interest it is (asymptotically) zero, so you don't care.
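To see what the window buys you, here is a small sketch. It swaps in a Hann window rather than the Gaussian mentioned above, simply because it is one line to write; the idea is the same. The 64-sample buffer, the 5.5-cycle test tone, and the choice of bin 20 as a "far away" bin are assumptions for this example.

```python
import cmath
import math

def dft_magnitudes(x):
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

n = 64
# 5.5 cycles per buffer: the tone does not line up with the buffer edges.
tone = [math.sin(2 * math.pi * 5.5 * t / n) for t in range(n)]
# Hann window: tapers smoothly to zero at the start of each period.
hann = [0.5 - 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)]
windowed = [s * w for s, w in zip(tone, hann)]

raw_mags = dft_magnitudes(tone)
win_mags = dft_magnitudes(windowed)

far_bin = 20   # a bin well away from the tone's true frequency
leak_without_window = raw_mags[far_bin]
leak_with_window = win_mags[far_bin]
```

Without the window, a noticeable amount of energy smears into bins far from the tone ("spectral leakage"). With the window, the far-away bins are much quieter, at the cost of a slightly wider main peak.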

The sum of multiple sine waves does NOT mess up the transform (ideally); what messes it up is the boundary conditions and their treatment, and doing this correctly is tricky. An ideal, infinite signal of three sine waves added together would yield three spikes in the frequency domain (plus their mirror images at negative frequencies).
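A quick sketch of the three-sine case, using a finite buffer in which each sine completes a whole number of cycles so the boundary problem does not arise. The bin numbers 3, 7 and 12, the amplitudes, and the threshold of 1.0 are arbitrary choices for this example.

```python
import cmath
import math

n = 64
# Three "notes" mixed together, like a chord, each at a different loudness.
signal = [math.sin(2 * math.pi * 3 * t / n)
          + 0.5 * math.sin(2 * math.pi * 7 * t / n)
          + 0.25 * math.sin(2 * math.pi * 12 * t / n)
          for t in range(n)]

# Magnitudes of the non-negative frequency bins only.
mags = [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)]

# Bins that stand clearly above the rounding noise.
peaks = [k for k, m in enumerate(mags) if m > 1.0]
```

Each sine lands in its own bin and the rest stay at essentially zero, so the transform separates the mix cleanly. For chord recognition, the set of bins that light up is the kind of input the network would get.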

I found an FFT algorithm (http://www.yov408.com/html/codespot.php?gg=36)

It requires both a real and imaginary input and produces a complex output. To be clear:

The real input corresponds to an array of amplitudes of the sample.

Imaginary input = ?

The real output corresponds to how well the given frequency matches the sample (i.e., output[5] = how well 5 Hz matches the signal)

Imaginary output = ?
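For what it's worth, under the usual FFT conventions (an assumption here, since the linked code isn't shown in the thread), the imaginary input is simply all zeros for a real-valued recording. On the output side, the real part of bin k measures how well a cosine at that frequency matches, the imaginary part measures how well a sine matches (with a minus sign), and the overall "how well does this frequency match" figure is the length of the (real, imaginary) pair. Bin k corresponds to k * sample_rate / N Hz, so output[5] only means 5 Hz when the buffer spans exactly one second. The sketch below uses a hypothetical `dft_real_imag` function with the same two-array style of interface, and a made-up 64 Hz sample rate.

```python
import math

def dft_real_imag(re_in, im_in):
    """Naive DFT taking and returning separate real and imaginary arrays."""
    n = len(re_in)
    re_out, im_out = [], []
    for k in range(n):
        re = im = 0.0
        for t in range(n):
            angle = 2 * math.pi * k * t / n
            c, s = math.cos(angle), math.sin(angle)
            # (a + bi) * (cos(angle) - i*sin(angle))
            re += re_in[t] * c + im_in[t] * s
            im += im_in[t] * c - re_in[t] * s
        re_out.append(re)
        im_out.append(im)
    return re_out, im_out

sample_rate = 64            # hypothetical: samples per second
n = 64                      # buffer length, so each bin is 1 Hz wide
samples = [math.sin(2 * math.pi * 5 * t / sample_rate) for t in range(n)]

# Real input = the samples; imaginary input = zeros.
re_out, im_out = dft_real_imag(samples, [0.0] * n)

magnitudes = [math.hypot(r, i) for r, i in zip(re_out, im_out)]
peak_bin = max(range(n // 2), key=magnitudes.__getitem__)
peak_hz = peak_bin * sample_rate / n
```

Note that for this pure 5 Hz sine the real output at bin 5 is essentially zero and the whole response sits in the imaginary output. Looking at the real part alone would miss the tone, which is why the magnitude is the safer thing to hand to the network.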
