[Neural Networks] Multi-layer perceptron questions

Afternoon all,

I'm having a whole bundle of laughs trying to get my head around some simple multi-layer neural networks and the back-propagation algorithm. I've been reading through all the available information and books for the last few days, yet I've still not quite understood everything. It seems that clean, straightforward facts on the subject are not easy to find [headshake]

1) Threshold function
In a single perceptron NNet you apply a threshold function (such as a sigmoid one) to the raw output of the inputs multiplied by the weights. In multi-layer perceptrons do you have to apply the same/similar threshold function to every intermediary perceptron in the network? My current code outputs a 3-component vector of floats, and I'm applying the threshold function to these. I'm wondering if I also have to apply a threshold to the results of the hidden perceptrons and pass this to the next layer...

2) Comparing the output for learning purposes
I still don't quite get the backpropagation algorithm, but it seems to be indicating that I have to use the computed error as a bias for teaching the previous (hidden) perceptrons in the network. Do I have to use the raw output or the corrected/thresholded value? I'm confused on this one as my thresholded values have very little granularity - which doesn't lend itself to small changes...

3) Anyone know of any good step-by-step examples?
Yeah, gonna be a "n00b" here... my book is 99% theory and describes every single possible variant of a multi-layer network and the B.P. algorithm in a couple of pages that are just full of mathematical notation. This wiki page is closer to what I want, but there's not much detail. A Google search just seems to yield university course notes that are just a distilled copy of the stuff in my book [rolleyes]

A pseudo-code example or "applied" explanation of the algorithm is all I'm after - I just want to translate the theory->practice!

So, anyone got any clues on the above?

Cheers,
Jack

---
Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

Hello - you could take a gander at my journal on Neural Networks. There's example C# code in there somewhere, complete with a command-line demo.



Quote:Original post by jollyjeffers
1) Threshold function
In a single perceptron NNet you apply a threshold function (such as a sigmoid one) to the raw output of the inputs multiplied by the weights.
In multi-layer perceptrons do you have to apply the same/similar threshold function to every intermediary perceptron in the network?

You can use any activation function you wish for any node. However, to use backpropagation you want that function to be differentiable. For instance, I could use a sigmoid for the hidden layer and a linear function for the output layer (as in my example). Or I could use sigmoids for all of them, scale the outputs and inputs I use to train it, and then "unscale" when it's used after training. The nice thing about bounded functions is that they don't give unbounded outputs. Also, by using the same type of node over and over, backpropagation becomes simpler and the neural network is more modular (charge of $5 for this information) lol
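To illustrate (and to directly answer your question 1 - yes, the squashing happens at every hidden node): here's a minimal sketch of a forward pass with a sigmoid hidden layer and a linear output layer. All the names and the weight layout (each node stores its bias as a trailing weight) are made up for the example, not taken from anyone's actual code:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Forward pass: sigmoid at every hidden node, linear at the output.
// Weight layout (illustrative): each node's weight vector has one entry
// per input plus a trailing bias term.
std::vector<double> forward(const std::vector<double>& input,
                            const std::vector<std::vector<double>>& hiddenW,
                            const std::vector<std::vector<double>>& outputW,
                            std::vector<double>& hiddenOut) // saved for backprop later
{
    hiddenOut.resize(hiddenW.size());
    for (std::size_t j = 0; j < hiddenW.size(); ++j) {
        double sum = hiddenW[j].back(); // bias
        for (std::size_t i = 0; i < input.size(); ++i)
            sum += hiddenW[j][i] * input[i];
        hiddenOut[j] = sigmoid(sum);    // the threshold IS applied at every hidden node
    }

    std::vector<double> output(outputW.size());
    for (std::size_t k = 0; k < outputW.size(); ++k) {
        double sum = outputW[k].back(); // bias
        for (std::size_t j = 0; j < hiddenOut.size(); ++j)
            sum += outputW[k][j] * hiddenOut[j];
        output[k] = sum;                // linear output node: no squashing
    }
    return output;
}
```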

Quote:2) Comparing the output for learning purposes
I still don't quite get the backpropagation algorithm, but it seems to be indicating that I have to use the computed error as a bias for teaching the previous (hidden) perceptrons in the network.

Do I have to use the raw output or the corrected/thresholded value? I'm confused on this one as my thresholded values have very little granularity - which doesn't lend itself to small changes...

What backpropagation does is propagate errors back through the neural network. At the output layer you calculate the errors very simply (x - e, the output minus the expected value). Then you calculate the derivatives/errors for the nodes at the previous layer, and so on backwards through the network. To do this, you use the Chain Rule from Calculus.
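In code, carrying on with the illustrative sketch from my earlier post (same made-up names and weight layout - this is just one gradient-descent step, not anyone's definitive implementation):

```cpp
#include <cstddef>
#include <vector>

// One back-propagation step for the sketch above: linear output layer,
// sigmoid hidden layer. 'eta' is the learning rate. Names are illustrative.
void backprop(const std::vector<double>& input,
              const std::vector<double>& hiddenOut, // sigmoid outputs saved by forward()
              const std::vector<double>& output,
              const std::vector<double>& target,
              std::vector<std::vector<double>>& hiddenW,
              std::vector<std::vector<double>>& outputW,
              double eta)
{
    // Output layer: linear activation, so the delta is simply (output - target).
    std::vector<double> outDelta(output.size());
    for (std::size_t k = 0; k < output.size(); ++k)
        outDelta[k] = output[k] - target[k];

    // Hidden layer: the Chain Rule. Each node's delta is the sum of the
    // output deltas weighted by the connecting weight, times the sigmoid
    // derivative s * (1 - s). Note this is computed before any weights change.
    std::vector<double> hidDelta(hiddenOut.size());
    for (std::size_t j = 0; j < hiddenOut.size(); ++j) {
        double sum = 0.0;
        for (std::size_t k = 0; k < output.size(); ++k)
            sum += outputW[k][j] * outDelta[k];
        hidDelta[j] = sum * hiddenOut[j] * (1.0 - hiddenOut[j]);
    }

    // Gradient-descent updates (each node's last weight is its bias).
    for (std::size_t k = 0; k < output.size(); ++k) {
        for (std::size_t j = 0; j < hiddenOut.size(); ++j)
            outputW[k][j] -= eta * outDelta[k] * hiddenOut[j];
        outputW[k].back() -= eta * outDelta[k];
    }
    for (std::size_t j = 0; j < hiddenOut.size(); ++j) {
        for (std::size_t i = 0; i < input.size(); ++i)
            hiddenW[j][i] -= eta * hidDelta[j] * input[i];
        hiddenW[j].back() -= eta * hidDelta[j];
    }
}
```

Note this also touches your question 2: the hidden deltas use the thresholded (sigmoid) outputs, but through the derivative s(1 - s), so you still get smooth, small weight changes rather than coarse jumps.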

Quote:3) Anyone know of any good step-by-step examples?
Yeah, gonna be a "n00b" here... my book is 99% theory and describes every single possible variant of a multi-layer network and the B.P. algorithm in a couple of pages that are just full of mathematical notation. This wiki page is closer to what I want, but there's not much detail. A Google search just seems to yield university course notes that are just a distilled copy of the stuff in my book [rolleyes]
A pseudo-code example or "applied" explanation of the algorithm is all I'm after - I just want to translate the theory->practice!
So, anyone got any clues on the above?
Cheers,
Jack

Hmm, let me see... you can look over my code in my Journal, or try this link:
Click Me
Try a small neural network by hand with all sigmoids.
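For example, here's one training step on a single sigmoid neuron worked by hand. The numbers are purely illustrative: input x = 1, weight w = 0.5, no bias, target t = 1, learning rate eta = 1:

```latex
% One back-propagation step on a single sigmoid neuron, worked by hand.
% Illustrative numbers only: x = 1, w = 0.5, no bias, t = 1, eta = 1.
\begin{align*}
s &= \sigma(wx) = \sigma(0.5) \approx 0.6225\\
E &= \tfrac{1}{2}(s - t)^2\\
\frac{\partial E}{\partial w} &= (s - t)\, s (1 - s)\, x
  \approx (-0.3775)(0.6225)(0.3775)(1) \approx -0.0887\\
w_{\text{new}} &= w - \eta\,\frac{\partial E}{\partial w} \approx 0.5887\\
\sigma(0.5887) &\approx 0.643 \quad \text{(a step closer to the target of 1)}
\end{align*}
```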
Thanks for the information!

Apologies for the slow reply. It's been a bit of a crazy 24 hours (including a complete internet blackout [oh])...

Thanks for the links/references - I had a read through them all [smile]

In the end, I found this page to be useful. It's quite simple and has lots of pictures - which is what I needed to try and make sense of all the theory/equations I've got written down [lol]

Quote:by using the same type of node over and over backpropagation becomes simpler and the neural network is more modular
Yeah, I'm realising that would have been a good idea. I've managed to write some horrific procedural-style C++ for my implementation. Lots of jumping about with offsets and mappings in arrays. Lovely.

Quote:To do this, you use the Chain Rule from Calculus.

The depressing part is I used to know (and understand) this. It's been too many years since I was taught it now [headshake]

Quote:(charge of $5 for this information) lol

I have about $4 in coins left over from my trip to Seattle if you want that... or you can just have some more ratings points [smile]


Thanks,
Jack

---
Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

Quote:Original post by jollyjeffers
Quote:To do this, you use the Chain Rule from Calculus.

The depressing part is I used to know (and understand) this. It's been too many years since I was taught it now [headshake]
I'm sure you can reacquaint yourself with this quite easily - it's a pretty simple bit of maths, although I too cannot remember it off the top of my head!

You NN guys, is it possible to implement and train an NN without all the heavy-duty mathematical stuff? I've read lots of things about the basic idea of NNs but all of them say something like "to understand back-propagation is too mathematically complex for this article". I'm happy enough with maths (compared to uni it can't be too bad) but I'm wondering if I must find a uni/academic-style book on NNs to stand a chance or if my "AI gems for game programming" type book will have enough to get started?

Quote:Original post by d000hg
You NN guys, is it possible to implement and train an NN without all the heavy-duty mathematical stuff?

Yes, it is - it's what I've just done [lol]

I've pretty much gone with the "monkey see, monkey do" approach - I'm pretty sure I know how my implementation translates to the theory in my textbook, but there are a couple of assumptions that I've picked up that don't necessarily make sense.

Quote:Original post by d000hg
I've read lots of things about the basic idea of NNs but all of them say something like "to understand back-propagation is too mathematically complex for this article".

Doesn't surprise me - most of the BP explanations I've come across are pretty hardcore. A simple "X then Y then Z" article is fairly hard to find (hence my original post).

Quote:Original post by d000hg
I'm happy enough with maths (compared to uni it can't be too bad) but I'm wondering if I must find a uni/academic-style book on NNs to stand a chance or if my "AI gems for game programming" type book will have enough to get started?

If it's of any interest, the module I'm sitting has Neural Networks - A Comprehensive Foundation (Second Edition) by Simon Haykin as its only required text. It's not easy reading, but it definitely seems to cover everything in a lot of detail...

hth
Jack

---
Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

One other method would be to use a Genetic Algorithm (which doesn't require derivatives): use the weights as chromosomes and the error at the output as the fitness.
Then go outside and dance... dance... dance... until it rains. Then come inside and voila! Instant weights (though they may only be a local minimum).
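Something like this, roughly. Everything here is made up for illustration: evaluateError() is an assumed helper that runs the network over the training set and returns the total error, and the GA parameters (population split, 5% mutation) are arbitrary:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Assumed helper, defined elsewhere: runs the network with the given weight
// vector over the training set and returns the total error (lower = fitter).
double evaluateError(const std::vector<double>& weights);

double randRange(double lo, double hi) {
    return lo + (hi - lo) * (std::rand() / (double)RAND_MAX);
}

std::vector<double> evolveWeights(std::size_t numWeights,
                                  std::size_t popSize,
                                  int generations)
{
    // Each chromosome is one complete weight vector, randomly initialised.
    std::vector<std::vector<double>> pop(popSize, std::vector<double>(numWeights));
    for (auto& chromo : pop)
        for (auto& w : chromo)
            w = randRange(-1.0, 1.0);

    for (int gen = 0; gen <= generations; ++gen) {
        // Sort fittest-first. (A real implementation would cache each error
        // instead of re-evaluating it inside the comparison.)
        std::sort(pop.begin(), pop.end(),
                  [](const std::vector<double>& a, const std::vector<double>& b) {
                      return evaluateError(a) < evaluateError(b);
                  });
        if (gen == generations)
            break; // final sort done; stop before breeding again

        // Replace the worst half with mutated crossovers of the best half.
        for (std::size_t i = popSize / 2; i < popSize; ++i) {
            const auto& mum = pop[std::rand() % (popSize / 2)];
            const auto& dad = pop[std::rand() % (popSize / 2)];
            for (std::size_t w = 0; w < numWeights; ++w) {
                pop[i][w] = (std::rand() % 2) ? mum[w] : dad[w]; // uniform crossover
                if (std::rand() % 100 < 5)                       // 5% mutation rate
                    pop[i][w] += randRange(-0.1, 0.1);
            }
        }
    }
    return pop.front(); // best weights found - possibly only a local minimum
}
```

The dancing is optional.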
Quote:Original post by NickGeorgia
One other method would be to use a Genetic Algorithm (which doesn't require derivatives).

I've got to use a multilayer neural network for this job - don't get any choice in the matter [smile]

Quote:Original post by NickGeorgia
Then go outside and dance... dance... dance... until it rains. Then come inside and voila! Instant weights (though they may only be a local minimum).

I've not studied GA's in much detail, but I wasn't aware that dancing had much to do with it. I guess you learn something every day.

Cheers,
Jack

---
Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

Quote:Original post by jollyjeffers
I've got to use a multilayer neural network for this job - don't get any choice in the matter [smile]


GAs will work on a multilayer neural network. It's just an optimization technique for finding the best weights.
That was me up there...
So Jack, when you say you're following a "monkey see..." approach, does that mean all the complexity of NNs leads to some key equations which you can use without needing to understand the derivation? So a simple tutorial can explain it without explaining any of the detail - just the final result to plug in and use?
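From the bits I've read, that "final result" seems to be just a couple of delta-rule equations - something like this in standard textbook notation (f is the activation function, eta the learning rate, net the weighted sum into a node; I may be garbling the details):

```latex
% The standard back-propagation "plug in and use" equations.
\begin{align*}
\delta_k &= (o_k - t_k)\, f'(\mathrm{net}_k) && \text{output node } k\\
\delta_j &= f'(\mathrm{net}_j) \sum_k w_{kj}\, \delta_k && \text{hidden node } j\\
\Delta w_{ji} &= -\eta\, \delta_j\, x_i && \text{for every weight}
\end{align*}
```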

I'm looking forward to my book arriving - "AI Game Programming Wisdom" although I'll probably just end up using F(s)SM like every other game!

