Understanding Neural Net Programs

Sorry about the thread hijack, but I'd really like to hear some alternatives to NNs for my scenario

Quote:Original post by Kylotan
It's sane, but not necessarily optimal. There are many other ways of modelling and approximating functions, many of which will be more suitable than neural networks, especially if you know a little about the characteristics of the function.


That's the thing with my particular scenario, I don't know anything about the characteristics of the function or if there even is one. What we have is a set of data where a bunch of inputs are suspected to have some relation to a bunch of outputs. What we've been trying to do is model this relation by training various NNs with our sample data (with different subsets of the inputs/outputs, as requested by the field experts) and doing some sensitivity testing on this 'model' to get a very rough idea of how the inputs might affect the outputs and to gather some evidence of these suspected relations.

The function/relation is not expected to be linear nor even continuous and we're looking into about 10-80 inputs and 1-10 outputs, with some of the inputs being mutually dependent. I'm thoroughly aware of the downsides of this approach and I have often wondered about alternatives, but the uncertainty and variations in our scenario seem to be a perfect match for the vagueness of NNs.

I'm genuinely curious what approach you'd recommend in a scenario like this, other than to axe the project and run! [smile]
Rim van Wersch [ MDXInfo ] [ XNAInfo ] [ YouTube ] - Do yourself a favor and bookmark this excellent free online D3D/shader book!
80-dimensional space! If you suspect the function is nonlinear, then look into nonlinear techniques; I would try to identify that first. NNs are not the only option for nonlinear problems. How about nearest-neighbor approaches? There is an excellent book and software package for nonlinear dynamical analysis. The software is called TISEAN, and the book should be a good guide to setting up a solution. It is actually quite readable, unlike most books on complex topics.
When all I have is a bunch of data points and a generalized function to learn that seems strongly nonlinear, my first instinct is k-NN (sketched below), which is simple to implement and produces great results as long as your data covers the domain of the function well and doesn't have a bunch of outliers (and you can wrangle up a decent distance metric). If I know there are a lot of dependencies in the data, and the dimensionality is making the problem difficult to handle, I might PCA it first, though in my opinion PCA is a rather blunt instrument. Finally, if there really are serious discontinuities, I might be inclined to cluster first with something like k-means, then learn each cluster separately. A more expert ML guy might jump to more exotic and robust methods than these right off the bat, but these are simple to understand, simple to implement, and simple to tweak without a bunch of information-theoretic wisdom.

Or you could just toss Weka at it.
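
For concreteness, here's a minimal C++ sketch of the k-NN regression idea, assuming a plain Euclidean distance metric and a single output; all the names here are illustrative, not from any particular library:

// Minimal k-NN regression sketch: predict an output as the average of
// the k nearest training samples under squared Euclidean distance.
#include <vector>
#include <utility>
#include <algorithm>

struct Sample {
    std::vector<double> inputs;   // e.g. the 10-80 input dimensions
    double output;                // one of the outputs being modeled
};

// Squared Euclidean distance; skipping the sqrt keeps the ordering intact.
double squaredDistance(const std::vector<double>& a, const std::vector<double>& b) {
    double d = 0.0;
    for (size_t i = 0; i < a.size(); ++i) {
        double diff = a[i] - b[i];
        d += diff * diff;
    }
    return d;
}

// Average the outputs of the k training samples closest to the query
// (assumes k <= training.size()).
double knnPredict(const std::vector<Sample>& training,
                  const std::vector<double>& query, size_t k) {
    std::vector<std::pair<double, double>> distOut; // (distance, output)
    distOut.reserve(training.size());
    for (const Sample& s : training)
        distOut.push_back({squaredDistance(s.inputs, query), s.output});

    // Partially sort so the k nearest come first.
    std::partial_sort(distOut.begin(), distOut.begin() + k, distOut.end());

    double sum = 0.0;
    for (size_t i = 0; i < k; ++i)
        sum += distOut[i].second;
    return sum / k;
}

The distance metric is where most of the tuning lives; with a few dozen mixed inputs you would probably want to normalize each dimension before trusting Euclidean distance.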
Sorry if it is not, but I was under the assumption it was time series data. That is what the book explains, at least.
Essentially, NNs are a family of algorithms that try to learn patterns, with two main uses: classification and function emulation.

Classification means that you have some data (points, sequences of DNA, whatever) and you want to sort it into some set of "classes". You can then "teach" your NN to classify by showing it examples (this one is in this class, that one goes in another), or by showing it a bunch of points and letting it decide the "best" classification on its own.

Oh, and you have to set the number of classes yourself, which can be a problem in some cases.

The other thing NNs do pretty well is emulate a function. When you have a complicated function whose values you want to "predict" at some points, but you don't have its mathematical description, NNs come to your aid.

Backpropagation, feedforward, SOM, etc. are names that describe either the architecture of the net (a feedforward net has no loops, while a recurrent one does) or the way it learns (backpropagation-trained feedforward nets do supervised learning, while a SOM goes its own, unsupervised way).
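
For what it's worth, the basic unit underneath all of those names is tiny. Here's a minimal C++ sketch of a single sigmoid neuron, the building block both uses share (illustrative only, not tied to any library):

#include <vector>
#include <cmath>

// Sigmoid "squashing" transfer function.
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One neuron: a weighted sum of the inputs plus a bias, passed through
// the transfer function. A network is just layers of these wired together.
double neuron(const std::vector<double>& inputs,
              const std::vector<double>& weights, double bias) {
    double sum = bias;
    for (size_t i = 0; i < inputs.size(); ++i)
        sum += weights[i] * inputs[i];
    return sigmoid(sum);
}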

Thanks for the suggestions, I guess I've got some reading up to do. Especially on PCA [smile]
Rim van Wersch [ MDXInfo ] [ XNAInfo ] [ YouTube ] - Do yourself a favor and bookmark this excellent free online D3D/shader book!
Strictly speaking, backpropagation is short for "backpropagation of errors", which is a general process of sending errors measured at the neural network output back through the successive layers, but it is most often used to refer specifically to some variation of the generalized delta rule, a specific way to train the neural network. Feedforward refers to the way the artificial neurons are connected to one another and how information flows among them. A multilayer perceptron ("MLP") is a particular neural architecture arranged as distinct layers of artificial neurons, which may or may not be trained by backpropagation. Despite referring to particular aspects of a neural network, the above terms are often used interchangeably to mean a multilayer perceptron being trained by the generalized delta rule.
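
As a rough illustration, the generalized delta rule update for a single sigmoid output neuron looks something like this in C++ (standard textbook form; the variable names are mine, and the bias update is omitted for brevity):

#include <vector>

// o      : the neuron's output for this training sample
// target : the desired output
// inputs : the activations feeding this neuron
// weights: the incoming weights, updated in place
// eta    : the learning rate
void deltaRuleUpdate(double o, double target,
                     const std::vector<double>& inputs,
                     std::vector<double>& weights, double eta) {
    // Error term: (target - output) scaled by the sigmoid derivative o(1 - o).
    double delta = (target - o) * o * (1.0 - o);
    for (size_t i = 0; i < weights.size(); ++i)
        weights[i] += eta * delta * inputs[i]; // w_i += eta * delta * x_i
}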

Anyway, MLPs with nonlinear transfer functions are indeed able to learn to map classes (such as XOR) which are not linearly separable.
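
To make that concrete, here is a hand-wired 2-2-1 MLP that computes XOR. The weights are set by hand rather than learned, purely to show that the nonlinear transfer function is what makes the mapping possible:

#include <cmath>
#include <cstdio>

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Two hidden sigmoid units and one output unit, with weights chosen so
// the net computes XOR = (x1 OR x2) AND NOT (x1 AND x2).
double xorNet(double x1, double x2) {
    double h1 = sigmoid(20 * x1 + 20 * x2 - 10); // ~= x1 OR x2
    double h2 = sigmoid(20 * x1 + 20 * x2 - 30); // ~= x1 AND x2
    return sigmoid(20 * h1 - 40 * h2 - 10);      // ~= h1 AND NOT h2
}

int main() {
    for (int a = 0; a <= 1; ++a)
        for (int b = 0; b <= 1; ++b)
            std::printf("%d XOR %d ~= %.3f\n", a, b, xorNet(a, b));
    return 0;
}

A single-layer perceptron cannot represent this mapping, which is the source of the oft-misquoted claim that "XOR cannot be solved" by a perceptron.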

These are not the only types of artificial neural networks, of course, and there are methods besides neural networks which are used to solve learning problems (tree induction, rule induction, naive Bayes, k-nearest neighbors, regression, discriminant analysis, etc.).

If you are really interested in MLPs driven by backpropagation, consider the book Neural Networks for Statistical Modeling by Murray Smith. I think it is out of print now, but I found its explanation of how all this stuff works quite accessible.



Quote:Original post by tobinare
I'm trying to learn about basic neural networks and programming for them (C++). I've downloaded several source codes from tutorials, libraries, etc. They all seem to work the same way, creating the network with layers and neurons and applying backpropagation in some manner. I've compiled several programs and run them easily enough.

As a metric to compare their functionality and ease of use, I've tried to apply the typical XOR problem to each of them. It seems half of the programs handle this problem well by learning and solving for 1 and 0. The other half seem to converge on 0.5, which is the case for a multilayer perceptron code I have.

What's the difference between different algorithms in NNs' approach to this problem? I read that the XOR problem cannot be solved by an MLP, but why does another program solve it easily? I've tried reading several tutorials, but I get lost in the nomenclature.

I hope someone can shed some light on backpropagation, feedforward, etc.
There are some good books with chapters dedicated specifically to ANNs in games listed here:

http://www.aiwisdom.com/ai_neural.html

