Neural Network Trouble
Posted by Side Winder

Hey, I'd like to try my hand at image recognition, but before I can do that I'd like to get a firm grasp on neural networks. Right now I'm having trouble turning the theory I've learned into practice: I understand the main concepts of a neural network, but I can't get those concepts working in a real situation that would actually DO something. I'm aware that there are many different types of ANNs that people have come up with for different tasks; are there any specific ones that are tailored towards pattern (specifically image) recognition? Also, does anyone have any good resources for neural networks? A lot of the information I've seen on the web so far is extremely old and likely outdated for a field like this. Cheers.

As far as programming goes, I've only ever implemented a simple form of backpropagation, so I'm certainly not very learned in the field. However, you might try checking out an existing software package before rolling your own; it may help with finding the best types of networks to use. The only one I know of is emergent, but I'm sure there are others.

Quote:
Original post by Side Winder
Right now I'm having some trouble transferring the theory I've learned into practice. I understand the main concepts of a neural network but it's getting those concepts to work in a real situation that would actually DO something.
Mmm, that's the hard part: feature selection. You want to turn your problem domain into a table of input and output numbers, with enough detail that the ANN can learn an appropriate pattern from them, without giving it so much detail that it ends up over-fitting. You could feed every pixel in; you could split the image up into sections and feed information about the average colour of each section; you've got lots of options.
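To make the "average colour of each section" option concrete, here's a rough Python/NumPy sketch. The function name, grid size, and the assumption that the image arrives as an H x W x 3 array are all my own choices, not anything standard:

```python
import numpy as np

def block_colour_features(image, grid=(4, 4)):
    """Split an H x W x 3 image into a grid of cells and return the
    mean colour of each cell as one flat feature vector."""
    h, w, _ = image.shape
    rows, cols = grid
    feats = []
    for r in range(rows):
        for c in range(cols):
            cell = image[r * h // rows:(r + 1) * h // rows,
                         c * w // cols:(c + 1) * w // cols]
            feats.append(cell.reshape(-1, 3).mean(axis=0))  # mean R, G, B
    return np.concatenate(feats)  # length = rows * cols * 3

# Usage: a random 32x32 RGB "image" becomes a 48-number feature vector,
# which is far less input for the ANN than 32*32*3 = 3072 raw pixels.
img = np.random.rand(32, 32, 3)
features = block_colour_features(img, grid=(4, 4))
print(features.shape)  # (48,)
```

The grid size is exactly the detail/over-fitting dial described above: a coarser grid means fewer inputs and less risk of memorising the training images.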

Quote:
I'm aware that there are many different types of ANNs that people have come up with to succeed in different tasks; are there any specific ones that are tailored towards pattern (nb: image) recognition?
You might want to look into self-organising maps, but an image can be turned into a 1D array of numbers and classified just like any other kind of data. When picking a network type, the important thing is less the type of input data you want to use, and more the learning behaviours and recurrence relations you want to support.
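For a feel of how a self-organising map works on flattened image vectors, here's a minimal Python/NumPy sketch. The function name, decay schedules, and grid size are illustrative choices of mine, not a reference implementation:

```python
import numpy as np

def train_som(data, grid=(8, 8), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a tiny self-organising map. `data` is (n_samples, n_features);
    each grid node holds a weight vector of the same length as a sample."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    weights = rng.random((rows * cols, data.shape[1]))
    # Pre-compute each node's (row, col) position for neighbourhood distances.
    pos = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    for epoch in range(epochs):
        t = epoch / max(1, epochs - 1)
        lr = lr0 * (1 - t) + 0.01 * t          # learning rate decays
        sigma = sigma0 * (1 - t) + 0.5 * t     # neighbourhood shrinks
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best match
            d2 = ((pos - pos[bmu]) ** 2).sum(axis=1)           # grid distance^2
            h = np.exp(-d2 / (2 * sigma ** 2))                 # neighbourhood
            weights += lr * h[:, None] * (x - weights)         # pull towards x
    return weights

# Usage: flatten 8x8 "images" into 64-number vectors, as described above.
imgs = np.random.default_rng(1).random((50, 8, 8))
som = train_som(imgs.reshape(50, -1), grid=(4, 4), epochs=10)
print(som.shape)  # (16, 64)
```

After training, similar inputs end up mapping to nearby grid nodes, which is what makes SOMs useful for visualising and clustering image data.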

I think that, in fact, neural networks are not the technique of choice for image recognition. The main reasons are that NNs are very difficult to train (especially for problems with many variables, as is the case with images) and that they are very prone to overfitting. The only successful recognition algorithm I can recall that uses NNs is Yann LeCun's convolutional networks, but then again, that approach is so specialized that it doesn't have much in common with general NNs.

If you are interested in image recognition, I'd recommend reading about OpenCV. (There are probably many other sources; OpenCV is one that is relatively easy to read.)

Another suggestion is to think of a specific problem you'd like to solve (e.g. face recognition, or organizing photographs, or something else) and then people would be able to recommend more specifically how to approach that problem.

Yeah, facial recognition was what I had in mind.

Gil Grissom, if not ANNs then what?

Thanks superpig, I'm reading about self-organising maps right now.

I'd rather not use any external packages for this; I'd quite like to learn the core of what's going on rather than use something that has already been programmed. I don't know if anyone else is the same, but instead of using the already-finished C++ STL and taking it all for granted, I went through and made my own classes; sure, they're not as efficient or as complete, but I got a good grasp of what's going on in the underlying code.

Although there has been some work on recognizers which operate specifically on image data, I suggest that you are better off developing a good set of features to be extracted from the raw image data and feeding those to a more ordinary learning system. That learner could be a neural network, or it could be a lot of other things: naive Bayes, linear or quadratic discriminants, logistic regression, tree or rule induction, etc.
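As one concrete version of that pipeline, here's a Python/NumPy sketch with plain logistic regression standing in for the "ordinary learning system". The two features and the toy classes are hypothetical stand-ins for whatever you'd actually extract from images:

```python
import numpy as np

def train_logistic(X, y, lr=0.1, steps=2000):
    """Plain batch gradient descent for logistic regression.
    X is (n_samples, n_features); y holds 0/1 labels."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient of the log-loss
        b -= lr * (p - y).mean()
    return w, b

# Usage: each "image" is summarised by just two hand-picked features
# (say, mean brightness and an edge-count stand-in), so the learner
# never sees raw pixels at all.
rng = np.random.default_rng(0)
class0 = rng.normal([0.3, 0.2], 0.05, (50, 2))
class1 = rng.normal([0.7, 0.8], 0.05, (50, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 50 + [1] * 50)
w, b = train_logistic(X, y)
preds = (1 / (1 + np.exp(-(X @ w + b))) > 0.5).astype(int)
print((preds == y).mean())  # training accuracy
```

The point is that once the features are good, the choice of learner on top is almost interchangeable.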


-Will Dwinnell
Data Mining in MATLAB

Quote:
Original post by Gil Grissom...NNs are very difficult to train (especially for problems with many variables, as in the case of images) and that they are very prone to overfitting.


Can you give an example of such behavior? How many variables is "many"? Why would you feed raw pixel data to any machine learning system?


Quote:
Original post by Side Winder
Yeah, facial recognition was what I had in mind.

I recommend looking at eigenfaces (http://en.wikipedia.org/wiki/Eigenfaces) and the Viola-Jones face detector (http://en.wikipedia.org/wiki/Robust_real-time_object_detection) as an introduction.
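Eigenfaces boil down to PCA on flattened face images, so the core fits in a few lines of NumPy. This is only a sketch: the function names are mine, and random arrays stand in for real aligned grayscale face crops:

```python
import numpy as np

def eigenfaces(images, k=5):
    """Compute the top-k eigenfaces of a stack of flattened face images.
    `images` is (n_faces, n_pixels). Returns (mean_face, components)."""
    mean = images.mean(axis=0)
    centred = images - mean
    # SVD of the centred data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return mean, vt[:k]

def project(face, mean, components):
    """Describe a face by its k coordinates in eigenface space."""
    return components @ (face - mean)

# Usage: 20 fake "faces" of 64 pixels each; a real run would use
# aligned grayscale crops flattened the same way.
rng = np.random.default_rng(0)
faces = rng.random((20, 64))
mean, comps = eigenfaces(faces, k=5)
code = project(faces[0], mean, comps)
print(code.shape)  # (5,)
```

Recognition then works by comparing these short codes (e.g. nearest neighbour in eigenface space) instead of comparing raw pixels.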

Quote:
Original post by Side Winder
Gil Grissom, if not ANNs then what?

It depends on the problem, but there are many other learning techniques, e.g. boosting (AdaBoost), or SVM. Graphical models (probabilistic models) became popular recently, you can try looking up Latent Dirichlet Allocation (LDA) for example.

Quote:
Original post by Side Winder
I'd rather not use any external packages for this

I recommended looking at OpenCV mainly because it's aimed at someone who is not necessarily a professional, and also because it has code examples, so that even if you don't fully "get" something, you can look at the code.

Quote:
Original post by Predictor
Can you give an example of such behavior?

Here's one. Suppose your object has 1000 features, and suppose each feature can be one of two "variants" (e.g. a person's eyes may be green or brown, their hair may be long or short, etc.). So the target function you need to learn is (A1 or A2) and (B1 or B2) and ....

In this case, you can train the network in one of two ways. First, you can use the 1000 disjunctions as inputs to the network (e.g. feature 1 is (A1 or A2), feature 2 is (B1 or B2), etc. -- 1000 features total). In that case, the network learns well and its performance is perfect (see http://i27.tinypic.com/219swg9.jpg, higher curves are better performance). Of course, that's not a very realistic training scenario, since you provide part of the answer to the network.

A more realistic training scenario is to provide the 2000 individual features as inputs to the network. E.g. input 1 is A1, input 2 is A2, input 3 is B1, etc. -- 2000 inputs total. But in this case, the NN fails to learn anything useful and the performance is very poor (see the second curve in the link above).

So that's an example of how a network has all the necessary data it needs to solve the problem, and the architecture is suitable for solving it, but standard training methods (I tried back-prop and a few extensions, like Levenberg-Marquardt) fail to learn.
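For readers who want to reproduce something like this, here is my guess at a concrete version of the two encodings (the actual experiment may differ in the details -- the target here is simply the AND over pairs of ORs):

```python
import numpy as np

def make_task(n_samples, n_pairs=10, seed=0):
    """One reading of the setup above: 2*n_pairs binary indicators
    (A1, A2, B1, B2, ...); the target is AND over attribute pairs of
    (bit 2i OR bit 2i+1)."""
    rng = np.random.default_rng(seed)
    raw = rng.integers(0, 2, (n_samples, 2 * n_pairs))      # A1,A2,B1,B2,...
    pairs = raw.reshape(n_samples, n_pairs, 2).max(axis=2)  # (A1 or A2), ...
    target = pairs.min(axis=1)                              # AND of the ORs
    return raw, pairs, target

# "Easy" encoding: feed `pairs`, where the disjunctions are precomputed.
# "Hard" encoding: feed `raw`, so the net must discover the pairing itself.
raw, pairs, y = make_task(1000, n_pairs=10)
print(raw.shape, pairs.shape, y.mean())  # hard inputs, easy inputs, P(target=1)
```

Training the same architecture on `pairs` versus `raw` is then a direct way to compare the two scenarios described above.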

Quote:
Original post by Predictor
How many variables is "many"?

I'd say, as a rule of thumb, a few tens is many -- definitely 100 is many. That, of course, depends on the problem. If the problem is trivial -- e.g. just the AND of all inputs -- then it may work with 1000 features (as in the plot above). If the problem is very difficult, then even a few tens of features doesn't work.

Quote:
Original post by Predictor
Why would you feed raw pixel data to any machine learning system?

Philosophically, because that's the data I have. Neural networks, in theory, are universal learners, so whatever other "features" I can derive from the raw pixel data, they could too. Of course, in practice that almost never works.

As a more practical answer, in some cases (like with the MNIST data or the "80 million tiny images" data), I just don't know what kind of features to use (the features that are used for "normal" images wouldn't work with this data), and the number of pixels is not that huge (< 1000 pixels). So it's tempting to feed the pixels to something and see what comes out. And if you do it with a reasonable method (like LDA), it actually works: for the MNIST data it learns the strokes the characters consist of. But NNs don't work here either.

[Edited by - Gil Grissom on July 26, 2009 2:03:38 AM]

Gil,

The examples you give seem to require ignoring common practices in empirical modeling, such as feature selection. While one might get better results using something other than neural networks when ignoring this step, I'm not clear on why one would do so. Further, I don't see how any of this indicates over-fitting.

Regarding your original claim: "The main reasons are that NNs are very difficult to train (especially for problems with many variables, as in the case of images) and that they are very prone to overfitting.":

Though they have not outperformed alternatives in every situation, I have found that neural networks sometimes excel, haven't been slow to train, and have not exhibited overfitting.


-Will Dwinnell
Data Mining in MATLAB

If you just want to experiment with neural networks and get a feel for them, use MATLAB; it has an excellent neural network toolbox with excellent documentation. Afterwards you can translate your work into your programming language of choice.

Quote:
Original post by Predictor
The examples you give seem to require ignoring common practices in empirical modeling, such as feature selection.

I think they don't -- they just require that there are many features selected, rather than just a few. The plot I linked above -- the one with 1000 features -- that was after selecting the 1000 features from a much larger set of candidates.

Quote:
Original post by Predictor
I'm not clear on why one would do so.

As I said, the main reason would be that in a given situation you may not know how to select a better set of features than the trivial set you already have.

Quote:
Original post by Predictor
Further, I don't see how any of this indicates over-fitting.

Well, if you feed the network individual image pixels, and it works well on the training set but not on the test set, that typically indicates overfitting.

Quote:
Original post by Gil Grissom
Quote:
Original post by Predictor
The examples you give seem to require ignoring common practices in empirical modeling, such as feature selection.

I think they don't -- they just require that there are many features selected, rather than just a few. The plot I linked above -- the one with 1000 features -- that was after selecting the 1000 features from a much larger set of candidates.

Why select so many features when there are perfectly good feature selection procedures available?


Quote:
Original post by Predictor
I'm not clear on why one would do so.

As I said, the main reason would be that in a given situation you may not know how to select a better set of features than the trivial set you already have.

One does not need to know which features are useful ahead of time any more than one needs to know regression coefficients ahead of time: That's what statistical procedures are for. There are any number of mechanical feature selection procedures which will solve this problem automatically.


Quote:
Original post by Predictor
Further, I don't see how any of this indicates over-fitting.

Well, if you feed the network individual image pixels, and it works well on the training set but not on the test set, that typically indicates overfitting.


No, that is, by definition, poor fitting. Whether it is underfitting (most likely in the case you describe), overfitting or something else altogether is another matter.

I still have not heard any evidence to demonstrate that neural networks, in general, are "very difficult to train" or that they are "very prone to overfitting". Even were overfitting a problem, it is easily overcome through some form of regularization, such as early stopping.
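For completeness, early stopping amounts to a small generic loop around any trainer. This sketch is illustrative only -- the function names are mine, and a fake loss curve stands in for a real network's validation loss:

```python
def fit_with_early_stopping(train_step, evaluate, max_epochs=200, patience=10):
    """Generic early-stopping loop: keep training while validation loss
    improves; stop after `patience` epochs without improvement.
    `train_step()` runs one epoch; `evaluate()` returns validation loss."""
    best_loss, best_epoch, wait = float("inf"), 0, 0
    for epoch in range(max_epochs):
        train_step()
        loss = evaluate()
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
            # (a real implementation would snapshot the weights here)
        else:
            wait += 1
            if wait >= patience:
                break
    return best_epoch, best_loss

# Usage with a fake validation-loss curve that bottoms out and then
# rises again -- the classic signature of overfitting.
losses = iter([1.0, 0.6, 0.4, 0.35, 0.36, 0.38, 0.41, 0.45] + [0.5] * 50)
epoch, loss = fit_with_early_stopping(lambda: None, lambda: next(losses),
                                      patience=3)
print(epoch, loss)  # best epoch and its validation loss
```

The network is then rolled back to the weights from the best epoch, which caps how far it can fit noise in the training set.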


-Will Dwinnell
Data Mining in MATLAB

Quote:
Original post by Predictor
Why select so many features

In that particular case, this many features were necessary to achieve adequate performance.

Quote:
Original post by Predictor
there are perfectly good feature selection procedures available?

There are any number of mechanical feature selection procedures which will solve this problem automatically.

I think you are overly optimistic. There are many cases in which all you have is the image (i.e., raw pixels), and there is no existing feature selection procedure which can take these pixels in and produce a sparse but useful set of features as an output.

But if you do know of such a procedure, I'd be thrilled to hear about it.

Quote:
Original post by Predictor
One does not need to know which features are useful ahead of time any more than one needs to know regression coefficients ahead of time: That's what statistical procedures are for.

I think one does. Otherwise, one would end up with a huge number of potentially useful features. In the example above, I started with roughly 10 million candidates. Feeding them all to a neural network would be infeasible (especially given your comment that even 1000 features is "many").

Quote:
Original post by Predictor
No, that is, by definition, poor fitting. Whether it is underfitting (most likely in the case you describe), overfitting or something else altogether is another matter.

Underfitting would imply poor training set performance, which wasn't the case in that example.

Quote:
Original post by Predictor
I still have not heard any evidence to demonstrate that neural networks, in general, are "very difficult to train" or that they are "very prone to overfitting".

This is mostly my experience from dealing with a few large object recognition problems, plus the fact that I know of only one successful application of NNs in that area. But again, if you do have examples to the contrary, I think it would be interesting to discuss them.

