NN Hidden Layers and Expressiveness

There have been a couple of threads this week on the number of hidden layers in a neural net, and I coincidentally just read the chapter of Duda on Multilayer NNs. This has brought up an interesting detail - expressiveness. It is my understanding that one hidden layer is capable of approximating any continuous function. This gives us the ability to model an arbitrary polynomial (or non-polynomial, for that matter), and to solve the XOR problem. That all seems straightforward enough.

However, I've seen conflicting claims about problems that fall into the classification domain. Duda has a figure in which a single hidden layer is capable of classifying disjoint regions. Imagine concentric rings where the regions between circles alternate between classes:

XXXXXXXXXXXX
XOOOOOOOOOOX
XOXXXXXXXXOX
XOXOOOOOOXOX
XOXOXXXXOXOX
XOXOOOOOOXOX
XOXXXXXXXXOX
XOOOOOOOOOOX
XXXXXXXXXXXX

Akin to the spiral problem, eh? But a single continuous function will NOT divide these into individual classes, whereas a really nasty polynomial might solve the spiral problem. I've seen in other sources that this one can be solved with two hidden layers, which effectively allow us to combine disjoint regions: the first layer gives us the transform to get the individual regions, and the second layer combines the disjoint regions.

So, is Duda wrong?? (Heresy!)

-Kirk
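For concreteness, here is a minimal sketch of the experiment under debate. It is not from Duda; the dataset construction, the 50 tanh hidden units, and the other training parameters are arbitrary illustrative choices. It builds an alternating-ring dataset like the figure above and fits a single-hidden-layer net to it with scikit-learn.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Sample points in the plane and label them by which "ring" they fall in,
# alternating class with distance from the origin (as in the ASCII figure).
X = rng.uniform(-1.0, 1.0, size=(5000, 2))
radius = np.sqrt((X ** 2).sum(axis=1))
y = np.floor(radius / 0.25).astype(int) % 2   # class alternates every 0.25 units

# One hidden layer of neurons; whether this suffices in practice depends on
# the number of hidden units and on training, which is the point in question.
clf = MLPClassifier(hidden_layer_sizes=(50,), activation="tanh",
                    max_iter=2000, random_state=0)
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))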

Hi,

Please note that "layer" is a slightly ambiguous term, since some people mean layers of neurons, whereas others mean layers of weights.

But assume we have one layer of weights, i.e. two layers of neurons. In this case it is well known that the output neurons can only represent linearly separable functions; i.e., each output neuron ("perceptron") can be seen as a hyperplane cutting through the input space.

This geometric intuition of hyperplanes can also express simple Boolean logic, i.e. all linearly separable Boolean functions, such as AND (but not XOR, which is not linearly separable).
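As a minimal sketch (the weights below are hand-picked purely for illustration), a single threshold neuron with w = (1, 1) and b = -1.5 realizes AND, since only the input (1, 1) falls on the positive side of that hyperplane:

import numpy as np

def perceptron(x, w, b):
    # Fires (outputs 1) on one side of the hyperplane w.x + b = 0, else 0.
    return int(np.dot(w, x) + b > 0)

w, b = np.array([1.0, 1.0]), -1.5      # realizes Boolean AND
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron(np.array(x), w, b))
# No choice of w and b realizes XOR: no single hyperplane separates
# {(0,1), (1,0)} from {(0,0), (1,1)}.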

The next step is to add one more layer of neurons. These new neurons can apply simple Boolean logic to the hyperplanes represented by the middle layer of neurons. This allows us to represent convex regions (intersections of half-spaces).
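For example, here is a sketch with hand-picked weights (illustrative values only): three hyperplane neurons followed by one neuron that ANDs them carve out the convex triangle x > 0, y > 0, x + y < 1.

import numpy as np

def step(z):
    return (z > 0).astype(float)

def convex_region(x):
    # Hidden layer: three hyperplane neurons.
    W = np.array([[ 1.0,  0.0],    # x > 0
                  [ 0.0,  1.0],    # y > 0
                  [-1.0, -1.0]])   # x + y < 1
    b = np.array([0.0, 0.0, 1.0])
    h = step(W @ x + b)
    # Output neuron: AND of the three half-planes (fires only if all three do).
    return step(h.sum() - 2.5)

print(convex_region(np.array([0.2, 0.2])))   # inside the triangle  -> 1.0
print(convex_region(np.array([0.8, 0.8])))   # outside the triangle -> 0.0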

The final step is to add yet another layer of neurons. These apply simple Boolean logic to the convex regions represented by the preceding layer. This allows arbitrarily complex shapes, including disjoint regions, to be represented.
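Continuing the same hand-picked sketch (again, illustrative weights only), a final neuron that ORs two such convex pieces yields a disconnected decision region, which is exactly what the concentric-ring picture needs:

import numpy as np

def step(z):
    return (z > 0).astype(float)

def in_square(x, lo, hi):
    # First hidden layer + AND: four half-planes carving out one convex square.
    W = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    b = np.array([-lo, hi, -lo, hi])
    h = step(W @ x + b)
    return step(h.sum() - 3.5)           # AND of the four constraints

def union_of_squares(x):
    # Output neuron: OR of two convex pieces -> a disconnected region.
    a = in_square(x, 0.0, 1.0)
    b = in_square(x, 2.0, 3.0)
    return step(a + b - 0.5)

print(union_of_squares(np.array([0.5, 0.5])))   # inside the first square  -> 1.0
print(union_of_squares(np.array([2.5, 2.5])))   # inside the second square -> 1.0
print(union_of_squares(np.array([1.5, 1.5])))   # in neither               -> 0.0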

-- Mikko
Yes, I should have clarified a bit. When I use the term "layers" I refer to layers of neurons.

I understand that with no hidden layers you can only identify linearly separable classes. My understanding was that with one hidden layer you can get an arbitrary polygon, and that two hidden layers allow arbitrary groupings of multiple polygons.

However, I've seen it depicted that with one hidden layer you can solve the concentric problem I showed above. I think this is an error.

-Kirk

