Understanding Neural Net Programs

Original post by tobinare:

I'm trying to learn about basic neural networks and programming them in C++. I've downloaded several source codes from tutorials and libraries. They all seem to work the same way: they create the network with layers and neurons and apply backpropagation in some manner. I've compiled several of the programs and run them easily enough.

As a metric to compare their functionality and ease of use, I've tried to apply the typical XOR problem to each of them. About half of the programs handle the problem well, learning to output 1 and 0. The other half converge to 0.5, which is the case for a multilayer perceptron code I have.

What's the difference between the algorithms in these NNs' approaches to this problem? I've read that the XOR problem cannot be solved by an MLP, but then why does another program solve it easily? I've tried reading several tutorials, but I get lost in the nomenclature.

I hope someone can shed some light on backpropagation, feedforward, etc.

It's actually possible for a multilayer perceptron to solve the XOR problem. The XOR problem comes from the fact that a single-layer perceptron (SLP) is only a linear separator, which means it can only divide a space into two half-spaces. With that in mind, it's easy to see that it's impossible to find a line (in the 2D case) that correctly classifies the outputs of the XOR function:

http://neuralpad.org/images/xor.jpg (the 2 axes are the inputs from the XOR function, white and black dots are the possible outputs, respectively 0 and 1)

By adding a second layer to the SLP, it becomes possible to separate the outputs correctly.
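For concreteness, here's a minimal sketch of a 2-2-1 MLP learning XOR with plain sigmoid units and online backpropagation. It's illustrative only: the layer sizes, learning rate, and epoch count are assumptions, not taken from any particular tutorial or library.

```cpp
// Minimal 2-2-1 multilayer perceptron learning XOR via backpropagation.
// A sketch only; real code would factor this into classes.
#include <cmath>
#include <cstdio>
#include <cstdlib>

static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

int main() {
    const double X[4][2] = {{0,0},{0,1},{1,0},{1,1}};  // inputs
    const double T[4]    = {0, 1, 1, 0};               // XOR targets

    // Weights and biases: 2 inputs -> 2 hidden -> 1 output.
    double wh[2][2], bh[2], wo[2], bo;
    std::srand(42);
    auto rnd = [] { return std::rand() / (double)RAND_MAX * 2.0 - 1.0; };
    for (int i = 0; i < 2; ++i) {
        bh[i] = rnd(); wo[i] = rnd();
        for (int j = 0; j < 2; ++j) wh[i][j] = rnd();
    }
    bo = rnd();

    const double eta = 0.5;  // learning rate
    for (int epoch = 0; epoch < 20000; ++epoch) {
        for (int p = 0; p < 4; ++p) {
            // Forward pass.
            double h[2];
            for (int i = 0; i < 2; ++i)
                h[i] = sigmoid(wh[i][0]*X[p][0] + wh[i][1]*X[p][1] + bh[i]);
            double y = sigmoid(wo[0]*h[0] + wo[1]*h[1] + bo);

            // Backward pass (delta rule): gradient of 0.5*(y - t)^2.
            double dy = (y - T[p]) * y * (1.0 - y);
            for (int i = 0; i < 2; ++i) {
                double dh = dy * wo[i] * h[i] * (1.0 - h[i]);
                wo[i]    -= eta * dy * h[i];
                wh[i][0] -= eta * dh * X[p][0];
                wh[i][1] -= eta * dh * X[p][1];
                bh[i]    -= eta * dh;
            }
            bo -= eta * dy;
        }
    }

    // Outputs should end up near 0, 1, 1, 0.
    for (int p = 0; p < 4; ++p) {
        double h[2];
        for (int i = 0; i < 2; ++i)
            h[i] = sigmoid(wh[i][0]*X[p][0] + wh[i][1]*X[p][1] + bh[i]);
        std::printf("%g XOR %g -> %.3f\n", X[p][0], X[p][1],
                    sigmoid(wo[0]*h[0] + wo[1]*h[1] + bo));
    }
}
```

Incidentally, a network like this can get stuck in a local minimum where all four outputs hover around 0.5, which sounds like exactly what the original poster is seeing; a different random seed, learning rate, or a couple more hidden units usually gets it unstuck.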

I'm not familiar with any great resources for practical neural network implementation, but getting your hands on the book Artificial Intelligence: A Modern Approach (a library copy would be preferable, considering you will only be interested in a couple of chapters) might be a good idea if you want to learn the nomenclature, seeing that the book is more theoretical and academic than practical.

You might check out this book:

http://www.amazon.com/Techniques-Programming-Premier-Press-Development/dp/193184108X/ref=pd_sim_b_7

It goes into a lot of practical neural net stuff for video games.

Instead of backpropagation, it relies on genetic algorithms to train networks, which is simpler IMO.

(:

If you are looking more for an academic reference, this book is a few years old but is an AMAZING resource. It talks a lot about various networks, how they work, and the different functions used for adjusting learning weights. It also covers BAM (bidirectional associative memory), which doesn't need to be trained. And I guess as a bonus it has a lot of info on fuzzy logic as well.

http://www.amazon.com/Neural-Networks-Fuzzy-Logic-Book/dp/155828298X/ref=sr_1_1?ie=UTF8&s=books&qid=1265676968&sr=1-1

Nothing, really. "Multilayer Perceptron Network" describes a NN with at least one hidden layer, regardless of training algorithm, though backpropagation is by far the most common training algorithm. "Feedforward Backpropagation" describes a gradient-descent-based NN training algorithm, which is most often used to train multilayer NNs.

Oh, and as long as I'm posting in an NN thread, I should really point out that the primary use of NNs is by people who don't know the range of machine learning techniques available to them, don't understand the situations in which NNs are inferior to other machine learning techniques, or incorrectly assume that because NNs have "neural" in the name, they are intrinsically more capable of emulating or rivaling human decision-making processes than other machine learning techniques.

Quote:
Original post by Sneftel
Oh, and as long as I'm posting in an NN thread, I should really point out that the primary use of NNs is by people who don't know the range of machine learning techniques available to them, don't understand the situations in which NNs are inferior to other machine learning techniques, or incorrectly assume that because NNs have "neural" in the name, they are intrinsically more capable of emulating or rivaling human decision-making processes than other machine learning techniques.

But that's what The Terminator uses, so it's cool.

Quote:
Original post by Sneftel
Oh, and as long as I'm posting in an NN thread, I should really point out that the primary use of NNs is by people who don't know the range of machine learning techniques available to them, don't understand the situations in which NNs are inferior to other machine learning techniques, or incorrectly assume that because NNs have "neural" in the name, they are intrinsically more capable of emulating or rivaling human decision-making processes than other machine learning techniques.


Yeah, that seems to be the standard reply these days.

I'm no huge fan of NNs and I've been told it's but one of many classifiers. For many courses though NNs seem to be the only classifier that is taught, which would be a nicer explanation for their misguided popularity than willful ignorance. I'd like to think I somewhat understand the limitations of NNs and to me they still seem like a useful tool to model an implicit function suspected to govern a set of data. That would be a sane use case, no? [smile]

NNs can work, and can work quite well. You can use them for other things like time series prediction as well (I have). The problem with NNs is that there is no agreed-upon good way to set up a problem. I referenced an excellent paper on this in one of my own papers, but I forget its title. The gist is that many researchers claim to know the perfect number of input/hidden/output nodes, or the right ratio of nodes, but a lot of the findings contradict each other. It is really hard to know whether you are using the right number of nodes, or whether you could do much better with a different combination. Change one number and, because NNs are a data-driven technique, the results can be quite different. I like NNs, but they are far from perfect.

Quote:
Original post by Sneftel
Oh, and as long as I'm posting in an NN thread, I should really point out that the primary use of NNs is by people who don't know the range of machine learning techniques available to them, don't understand the situations in which NNs are inferior to other machine learning techniques, or incorrectly assume that because NNs have "neural" in the name, they are intrinsically more capable of emulating or rivaling human decision-making processes than other machine learning techniques.


Very true, and well said.

Quote:
Original post by remigius
I'd like to think I somewhat understand the limitations of NNs and to me they still seem like a useful tool to model an implicit function suspected to govern a set of data. That would be a sane use case, no? [smile]

It's sane, but not necessarily optimal. There are many other ways of modelling and approximating functions, many of which will be more suitable than neural networks, especially if you know a little about the characteristics of the function.

Sorry about the thread hijack, but I'd really like to hear some alternatives to NNs for my scenario.

Quote:
Original post by Kylotan
It's sane, but not necessarily optimal. There are many other ways of modelling and approximating functions, many of which will be more suitable than neural networks, especially if you know a little about the characteristics of the function.


That's the thing with my particular scenario, I don't know anything about the characteristics of the function or if there even is one. What we have is a set of data where a bunch of inputs are suspected to have some relation to a bunch of outputs. What we've been trying to do is model this relation by training various NNs with our sample data (with different subsets of the inputs/outputs, as requested by the field experts) and doing some sensitivity testing on this 'model' to get a very rough idea of how the inputs might affect the outputs and to gather some evidence of these suspected relations.

The function/relation is not expected to be linear, or even continuous, and we're looking at roughly 10-80 inputs and 1-10 outputs, with some of the inputs being mutually dependent. I'm thoroughly aware of the downsides of this approach and I have often wondered about alternatives, but the uncertainty and variation in our scenario seem to be a perfect match for the vagueness of NNs.

I'm genuinely curious what approach you'd recommend in a scenario like this, other than to axe the project and run! [smile]

An 80-dimensional space! If you suspect the relation is nonlinear, then look into nonlinear techniques; I would try to establish that first. NNs are not the only option for nonlinear problems. How about nearest-neighbor approaches? There is an excellent book and software package for nonlinear dynamical analysis; the software is called TISEAN. The book should be a good guide for how to set up a solution, and it is actually quite readable, unlike most books on complex topics.

When all I have is a bunch of data points and a generalized function to learn which seems strongly non-linear, my first instinct is k-NN, which is simple to implement and produces great results as long as your data covers the domain of the function well and doesn't have a bunch of outliers (and you can wrangle up a decent distance metric). If I know there's a lot of dependencies in the data, and the dimensionality is making the problem difficult to handle, I might PCA it first, though in my opinion PCA is a rather blunt instrument. Finally, if there really are serious discontinuities, I might be inclined to cluster first with k-means or SVM, then learn each cluster separately. A more expert ML guy might jump to more exotic and robust methods than these right off the bat, but these are simple to understand, simple to implement, and simple to tweak without a bunch of information-theoretic wisdom.
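To make that first instinct concrete, here's a minimal sketch of k-NN used as a function approximator with a plain Euclidean distance metric. The names (Sample, knnPredict) and the toy data set are made up for illustration:

```cpp
// Minimal k-nearest-neighbor function approximation (Euclidean metric).
#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

struct Sample { std::vector<double> x; double y; };

// Predict the output at 'query' as the mean of the k nearest samples.
double knnPredict(const std::vector<Sample>& data,
                  const std::vector<double>& query, std::size_t k) {
    std::vector<std::pair<double, double>> distY;  // (distance, output)
    distY.reserve(data.size());
    for (const Sample& s : data) {
        double d2 = 0.0;
        for (std::size_t i = 0; i < query.size(); ++i) {
            const double diff = s.x[i] - query[i];
            d2 += diff * diff;
        }
        distY.push_back({d2, s.y});  // squared distance sorts the same
    }
    std::partial_sort(distY.begin(), distY.begin() + k, distY.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < k; ++i) sum += distY[i].second;
    return sum / k;
}

int main() {
    // Toy data: samples of y = x0 * x1 on a coarse grid.
    std::vector<Sample> data;
    for (double a = 0.0; a <= 2.0; a += 0.5)
        for (double b = 0.0; b <= 2.0; b += 0.5)
            data.push_back({{a, b}, a * b});

    std::printf("predicted %.3f (true %.3f)\n",
                knnPredict(data, {1.2, 1.7}, 3), 1.2 * 1.7);
}
```

Averaging the k nearest outputs is the simplest scheme; weighting each neighbor by inverse distance is a common refinement when the data is unevenly sampled.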

Or you could just toss Weka at it.

Essentially, NNs are a family of algorithms that try to learn patterns, with two main uses: classification and function emulation.

Classification means that you have some data (points, sequences of DNA, whatever) and you want to sort it into a set of "classes". You can "teach" your NN to classify by showing it labeled examples (this item is in this class, that one goes in another), or by showing it a bunch of points and letting it decide the "best" classification on its own.

Note that you have to set the number of classes yourself, which can be a problem in some cases.

The other thing NNs do pretty well is emulate a function. If you have a complicated function whose values you want to "predict" at some points, but you don't have its mathematical description, NNs can help.

Backpropagation, feedforward, SOM, etc. are names that describe either the architecture of the net (a feedforward net has no loops) or the kind of learning it does (backpropagation is supervised learning, while a SOM learns unsupervised).

Strictly speaking, backpropagation is short for "backpropagation of errors", which is a general process of sending errors measured at the neural network's output back through the successive layers, but it is most often used to refer to some variation of the generalized delta rule, a specific way to train the network. Feedforward refers to the way the artificial neurons are connected to one another and how information flows among them. A multilayer perceptron ("MLP") is a particular neural architecture arranged as distinct layers of artificial neurons, which may or may not be trained by backpropagation. Despite referring to particular aspects of a neural network, the terms above are often used interchangeably to mean a multilayer perceptron trained by the generalized delta rule.
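For reference, the generalized delta rule is usually written as follows (standard notation: η is the learning rate, o_i the activation of unit i, t_j the target for output unit j, and f the transfer function):

```latex
\Delta w_{ji} = \eta \, \delta_j \, o_i ,
\qquad
\delta_j =
\begin{cases}
(t_j - o_j) \, f'(\mathrm{net}_j) & \text{if } j \text{ is an output unit,} \\
f'(\mathrm{net}_j) \sum_k \delta_k w_{kj} & \text{if } j \text{ is a hidden unit,}
\end{cases}
```

where the sum over k runs across the units in the layer above j. That recursive dependence of each hidden delta on the deltas of the layer above is exactly the "backpropagation of errors."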

Anyway, MLPs with nonlinear transfer functions are indeed able to learn to map classes (such as XOR) which are not linearly separable.

These are not the only types of artificial neural networks, of course, and there are methods besides neural networks which are used to solve learning problems (tree induction, rule induction, naive Bayes, k-nearest neighbors, regression, discriminant analysis, etc.).

If you are really interested in MLP driven by backpropagation, consider the book, Neural Networks for Statistical Modeling by Murray Smith. I think it is out of print now, but I found the explanation of how all this stuff works quite accessible.



Quote:
Original post by tobinare
I'm trying to learn about basic neural networks and programming them in C++. I've downloaded several source codes from tutorials and libraries. They all seem to work the same way: they create the network with layers and neurons and apply backpropagation in some manner. I've compiled several of the programs and run them easily enough.

As a metric to compare their functionality and ease of use, I've tried to apply the typical XOR problem to each of them. About half of the programs handle the problem well, learning to output 1 and 0. The other half converge to 0.5, which is the case for a multilayer perceptron code I have.

What's the difference between the algorithms in these NNs' approaches to this problem? I've read that the XOR problem cannot be solved by an MLP, but then why does another program solve it easily? I've tried reading several tutorials, but I get lost in the nomenclature.

I hope someone can shed some light on backpropagation, feedforward, etc.

