Neural networks fundamentally flawed?

8 comments, last by Waterlimon 9 years, 11 months ago

Hi all,

To preface this discussion: I know NNs aren't used much in games, presumably because they can have unpredictable results and be hard to tune. The article below was too juicy for me not to discuss it somewhere, and this seemed like an okay place to bring it up.

Researchers found some interesting things about the stability and continuity of the mapping between inputs and output with NNs which (for me) cast some pretty big doubts on their overall usefulness for most purposes.

http://www.i-programmer.info/news/105-artificial-intelligence/7352-the-flaw-lurking-in-every-deep-neural-net.html

As far as we can tell, these are issues that don't occur (or occur much less frequently) in organic brains. A few theories on my part about why; I'd be interested to hear other perspectives:

  • Our neural net training algorithms are faulty, e.g. very different from those in nature.
  • The simple layered approach is faulty, e.g. in a real brain signals can bounce around many times before producing an output, rather than hit each neuron exactly once.
  • Models neglect the time factor, e.g. we have a continuous stream of input rather than a single input, we may take time to make a judgement.
  • Our ability to act is a crucial factor for learning, e.g. we can interact with reality to confirm and clarify theories.

I welcome your thoughts.

JT

Edit: The tone of the article may be causing confusion, so I found a link to the actual paper

http://cs.nyu.edu/~zaremba/docs/understanding.pdf


Nope, the authors of the article are totally clueless about how to train NNs; these are textbook examples. But this reveals why people tend to avoid using them: training NNs requires intuition about how they work.

We usually use two hidden layers, the first being the "generalization layer", but we do not attempt to guess what "aspects" or "parameters" the network has encoded in each neuron. During training we add or subtract neurons in both the first and second hidden layers in a genetic evolution of the network. It is not possible to just generate a net and expect it to be trained for any purpose.
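For illustration, a minimal sketch of that add/subtract topology search could look like the following, where train_and_score is a stand-in for whatever routine trains a net with the given hidden-layer sizes and returns a validation error (lower is better):

    import random

    def evolve_topology(train_and_score, generations=20, h1=8, h2=4):
        # start from an initial guess at the two hidden-layer sizes
        best = (h1, h2)
        best_err = train_and_score(*best)
        for _ in range(generations):
            # mutate: add or remove one neuron in either hidden layer
            cand = (max(1, best[0] + random.choice((-1, 1))),
                    max(1, best[1] + random.choice((-1, 1))))
            err = train_and_score(*cand)
            if err < best_err:   # keep the mutation only if validation improves
                best, best_err = cand, err
        return best, best_err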

Another reason for not using NNs is that many bright people want to investigate their own algorithms for solving a given problem (estimating a value function), thus avoiding having to learn the black art of NN training.

Cheers,

/Thomas

www.thp-games.dk

Not to make an appeal to authority, but the paper came from a large group of published researchers; see a few examples below. That doesn't mean they're correct, but I wouldn't assume they all had a mass head accident and overlooked well-known information from the field.

http://scholar.google.co.nz/citations?user=L4bNmsMAAAAJ&hl=en

http://scholar.google.co.nz/citations?user=x04W_mMAAAAJ&hl=en

http://scholar.google.co.nz/citations?user=XCZpOcAAAAAJ&hl=en

I think that the first point about individual neurons was simply research into a popular misconception that stems from the metaphors that people use to talk about NNs. The second one is the big one for me. I find it disturbing that they managed to modify images in a way which was invisible to a human but caused misclassification, and that the modified images also fooled NNs trained on a different training dataset.

I'd be willing to guess that the sensationalistic article is more at fault than the paper here...

Read it here instead of the news: http://cs.nyu.edu/~zaremba/docs/understanding.pdf

Also - these guys are working on "a deep network with billions of parameters using tens of thousands of CPU cores" -- stuff that's relevant to them is not necessarily relevant to your hobby video game project ;)

I find it disturbing that they managed to modify images in a way which was invisible to a human but caused misclassification, and that the modified images also fooled NNs trained on a different training dataset.

They can't; again, the NN and its training don't stand alone. In a classification setup the net outputs a probability for each of the categories it is supposed to match, i.e. one of the outputs is the score for "the image shows a car". So it is never "a car" or "not a car"; it is more like 0.23 (which has to be interpreted along with the other outputs; this could still be the highest score among them).
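To make that concrete, here is a toy illustration (the labels and numbers are made up): the net gives one score per category and you interpret the whole vector, not a single yes/no.

    # hypothetical per-class scores from a classifier
    labels = ["car", "truck", "bicycle", "pedestrian", "background"]
    scores = [0.23, 0.20, 0.20, 0.19, 0.18]   # "car" wins even at only 0.23

    best = max(range(len(scores)), key=lambda i: scores[i])
    print("best guess:", labels[best], "with score", scores[best])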

On the input side you need to encode your information in a way that is suited to NNs. I have not worked much with image inputs (only the textbook alphabet examples and the like), but I know that you would typically do some feature extraction on the image (using ordinary image algorithms) and feed those features to the NN as input rather than simply feeding in the pixels. Again, you need to feed all relevant information to the NN and you need to know how to train it; the NN will by itself filter out which inputs are relevant and which are not.
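As a rough sketch of what I mean by feature extraction (the two features below are just placeholders for real image algorithms such as edge detectors or histograms, and trained_net is a hypothetical classifier interface):

    def extract_features(image):
        # image: a 2D list of grayscale values in [0, 1]
        h, w = len(image), len(image[0])
        pixels = [p for row in image for p in row]
        mean_brightness = sum(pixels) / len(pixels)

        # crude edge measure: average absolute difference between neighbouring columns
        edge_strength = sum(abs(image[y][x + 1] - image[y][x])
                            for y in range(h) for x in range(w - 1)) / (h * (w - 1))

        return [mean_brightness, edge_strength]

    # features = extract_features(some_image)
    # outputs = trained_net.classify(features)   # hypothetical net interface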

The typical training problems are:

1) the topology of the NN is insufficient: too few neurons (to represent the features of the problem) or too many neurons (no generalization)

2) the initial net is in a bad position in weight space (some initial nets learn a training set in 100 iterations, others not in 100k)

3) under- or over-training (a rough guard against the latter is sketched below).
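A minimal sketch of that guard: watch a held-out validation error and stop when it stops improving. Here train_one_epoch and validation_error are assumed to be your own routines, passed in as callables.

    def train_with_early_stopping(net, train_one_epoch, validation_error,
                                  patience=10, max_epochs=1000):
        best_err = float("inf")
        epochs_since_best = 0
        for _ in range(max_epochs):
            train_one_epoch(net)
            err = validation_error(net)
            if err < best_err:
                best_err, epochs_since_best = err, 0
            else:
                epochs_since_best += 1
            if epochs_since_best >= patience:   # validation error stopped improving
                break
        return net, best_err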

But why are you so worried about this?

Cheers,

/Thomas

I find the classification of car vs. not-car a fair enough distinction in this case. I understand that the outputs are floating-point numbers; however, I find it reasonable in many applications to set a threshold on the final output, because you often need to either take an action or not take an action based on the classification result.
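What I have in mind is nothing more elaborate than this (the 0.8 threshold is just an example value you would tune for the application):

    def should_react(car_score, threshold=0.8):
        # collapse the per-class score into a yes/no decision for the application
        return car_score >= threshold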

I take your point that there are many ways to set up a NN; however, it's unclear to me whether the range of architectures chosen in the paper is a bad sample. I can also imagine that some feature extraction algorithms are not easily reversible, and therefore would not make good candidates for their study. This is not to say that adversarial images would not exist (or would be any rarer), but the task of generating them would be significantly more difficult.

I'm not worried in the "keeps me up at night" sense. I find the results interesting. They confirm that people understand NNs far less than they think. My only concern is people relying on them too much, assuming either that their properties are logical and predictable or that they can just "figure things out" like human brains.

Neural nets are flawed, if we define flawed to mean "does not follow the model the brain provides". Dendrites are seen as simple signal absorbers from the neurons they touch, but they actually perform filtering functions. This is probably a very crucial step, and it has an implication about the role and meaning of a neuron. Essentially, instead of weights and thresholds, the actual brain uses conditions and signals.

I find the title of this article very misleading, since I read in a textbook published at least ten years ago (if memory serves it was this one: http://www.amazon.co.uk/Genetics-Molecular-Approach-Terry-Brown/dp/0412447304 from 1992) that dendrites were considered a major processing centre. This was presumably backed up by some fairly considerable research to be taught on a molecular biology degree and included in the textbooks. Sensationalism in science reporting. =/

To get back on topic - I suspect a more pictorial way of viewing AI algorithms (much like what Feynman did with Feynman diagrams in physics) could lead to significant progress. The human mind seems to work almost always by manipulating patterns rather than working with numbers. If we can find a graphical or quasi-graphical representation/approximation of these patterns, we can start turning those graphs into numbers, which the computer can then 'understand'. In other words, the problem is a translation issue.

tl;dr - I agree with this theory from the OP: our neural net training algorithms are faulty, e.g. very different from those in nature.

The article discusses how to fool some popular types of neural network with adversarial examples. Since neural networks are trained to classify sufficiently common inputs correctly, the existence of rare inputs that aren't classified correctly is natural and expected.

Discovering that such bad cases can be fairly easy to find and rather similar to good cases might be worrying, but it's certainly not "fundamental".

Easily computed bad examples might even be useful for supervised learning applications (ask the expert about a very informative synthetic input rather than yet another easy real input) or similar uses.
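As a very rough sketch of what "easily computed bad examples" means: nudge an input in the direction that increases the classifier's loss while keeping each change tiny. loss_gradient_wrt_input below is a stand-in for your own gradient computation; the paper itself uses a box-constrained optimizer rather than this simple sign-following step.

    def perturb(image, true_label, loss_gradient_wrt_input, step=0.005, iterations=10):
        # image: a flat list of pixel values in [0, 1]
        adversarial = list(image)
        for _ in range(iterations):
            grad = loss_gradient_wrt_input(adversarial, true_label)
            # move each pixel slightly in the direction that hurts the classifier,
            # clamping back into the valid pixel range
            adversarial = [min(1.0, max(0.0, p + step * (1 if g > 0 else -1)))
                           for p, g in zip(adversarial, grad)]
        return adversarial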

Omae Wa Mou Shindeiru

You can't just take a random blob of neurons and expect them to become intelligent after you train them enough. That works for some simple cases, but for more complicated cases you need genetic algorithms or manual work to create a large-scale structure, just like in the brain, where the low-level structures might be simple-ish neural networks like that, but at the high level everything is connected in very specific ways and every area is specialized for some specific task.

Pick the right inputs and it will work. A random blob of pixels might not be the best input you can provide. It might work, but it might take a thousand years for a good configuration to be found.

o3o

