Backpropagation XOR issue

Started by
15 comments, last by alvaro 9 years, 4 months ago
I had an edit block in the previous comment, but I kept getting it wrong, so here it is (hopefully correct this time):

Let me add some explanation of why the scale of the initial weights matters. Let's imagine your inputs to a particular layer are uniformly distributed random numbers between 0 and 1. If you multiply them by random weights between -1 and 1 and add N of them together, the size of the result will typically be about sqrt(N)/3 (that's the standard deviation of the resulting distribution). For large N that means your sigmoid function will saturate very close to 0 or 1 with high probability, and it's very hard to learn anything from there, because the sigmoid is so flat in those regions that the gradients are essentially 0. By using weights between -1/sqrt(N) and +1/sqrt(N), the typical linear combination will have zero mean and standard deviation 1/3, so the sigmoid has only a small probability of saturating and you can still learn how to adjust the weights usefully.

The assumption that the inputs into the layer are uniformly distributed is not very important. If instead they are independent binary inputs, each being 1 with probability 1/2 and 0 with probability 1/2, the standard deviation of the random linear combination with weights uniformly distributed between -1/sqrt(N) and +1/sqrt(N) is 1/sqrt(6) ~= 0.40825. So the situation is qualitatively the same.
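To make those numbers concrete, here is a minimal Monte Carlo sketch (the class and variable names are mine, not from any code in this thread) that measures the standard deviation of the linear combination for the three cases above: uniform inputs with weights in [-1, 1], uniform inputs with weights in [-1/sqrt(N), +1/sqrt(N)], and binary inputs with the scaled weights.

```java
import java.util.Random;

// Monte Carlo check of the standard deviations quoted above.
public class InitScaleCheck {
    static final int N = 100;         // number of inputs into the layer
    static final int TRIALS = 100000; // number of random linear combinations to sample
    static final Random rng = new Random();

    // Standard deviation of sum_i w_i * x_i over many random draws.
    static double measuredStdDev(boolean binaryInputs, double weightRange) {
        double sum = 0.0, sumSq = 0.0;
        for (int t = 0; t < TRIALS; t++) {
            double s = 0.0;
            for (int i = 0; i < N; i++) {
                double x = binaryInputs ? (rng.nextBoolean() ? 1.0 : 0.0)
                                        : rng.nextDouble();              // uniform in [0, 1)
                double w = (2.0 * rng.nextDouble() - 1.0) * weightRange; // uniform in [-range, +range]
                s += w * x;
            }
            sum += s;
            sumSq += s * s;
        }
        double mean = sum / TRIALS;
        return Math.sqrt(sumSq / TRIALS - mean * mean);
    }

    public static void main(String[] args) {
        System.out.printf("uniform inputs, weights in [-1, 1]:                  %.3f (sqrt(N)/3 = %.3f)%n",
                measuredStdDev(false, 1.0), Math.sqrt(N) / 3.0);
        System.out.printf("uniform inputs, weights in [-1/sqrt(N), +1/sqrt(N)]: %.3f (1/3 = %.3f)%n",
                measuredStdDev(false, 1.0 / Math.sqrt(N)), 1.0 / 3.0);
        System.out.printf("binary inputs,  weights in [-1/sqrt(N), +1/sqrt(N)]: %.3f (1/sqrt(6) = %.3f)%n",
                measuredStdDev(true, 1.0 / Math.sqrt(N)), 1.0 / Math.sqrt(6.0));
    }
}
```

With N = 100 the three printed values should come out close to 3.33, 0.33 and 0.41, matching sqrt(N)/3, 1/3 and 1/sqrt(6).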

So you believe I should always run the back-propagation, period. OK, I will change the code to do that.

Do you think it would be useful if I shared the weights here after one run, and again after, say, 100 runs?

Thank you!

I think it would be more useful to see the weights before and after one update, together with the computations that led to that update.

Sure, ok, I think I can do that. Gimme a day and I'll post the results... brb....

I think I found something wrong with my code, I'm using JavaNNS to validate it right now, if I still have issues, I'll post them here soon.

For now, thank you for your patience and assistance. :)

Hmm, even though I've validated my BP algorithm against JavaNNS, the neural net doesn't always converge to a solution when training on XOR. Is this a documented symptom?

If I set all the weights to the same value, it doesn't seem to reach a solution at all. I would venture that the net has more success when it starts with random weights that happen to favor a certain result. I'm off for the holidays, but I will post more results here when I have them. Thanks all, and happy holidays to you and your families.

A neural network with equal weights within a layer will still have equal weights within that layer after a step of gradient descent: the hidden units all compute the same output, so they all receive the same gradient. Such a network is therefore only as effective as a network with a single hidden unit per layer. This is well known.
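To illustrate the symmetry argument, here is a minimal 2-2-1 sigmoid network trained on XOR with plain online backprop, starting with every weight set to 0.5. It is only a sketch (the layout, learning rate and epoch count are my own choices, not taken from the code discussed in this thread): because the two hidden units receive identical updates at every step, their weight vectors stay identical forever, so the network behaves like one with a single hidden unit and never fits XOR.

```java
import java.util.Arrays;

// Minimal 2-2-1 sigmoid network trained on XOR with every weight initialized to 0.5.
public class EqualWeightsXor {
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    public static void main(String[] args) {
        double[][] inputs  = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        double[]   targets = {0, 1, 1, 0};

        // w1[j] = {weight from x0, weight from x1, bias} for hidden unit j
        double[][] w1 = {{0.5, 0.5, 0.5}, {0.5, 0.5, 0.5}};
        // w2 = {weight from h0, weight from h1, bias} for the output unit
        double[] w2 = {0.5, 0.5, 0.5};
        double lr = 0.5;

        for (int epoch = 0; epoch < 10000; epoch++) {
            for (int p = 0; p < 4; p++) {
                double x0 = inputs[p][0], x1 = inputs[p][1];

                // Forward pass
                double[] h = new double[2];
                for (int j = 0; j < 2; j++)
                    h[j] = sigmoid(w1[j][0] * x0 + w1[j][1] * x1 + w1[j][2]);
                double out = sigmoid(w2[0] * h[0] + w2[1] * h[1] + w2[2]);

                // Backward pass (squared-error loss, online updates)
                double deltaOut = (targets[p] - out) * out * (1 - out);
                double[] deltaH = new double[2];
                for (int j = 0; j < 2; j++)
                    deltaH[j] = deltaOut * w2[j] * h[j] * (1 - h[j]);

                // Weight updates
                for (int j = 0; j < 2; j++) {
                    w2[j]    += lr * deltaOut * h[j];
                    w1[j][0] += lr * deltaH[j] * x0;
                    w1[j][1] += lr * deltaH[j] * x1;
                    w1[j][2] += lr * deltaH[j];
                }
                w2[2] += lr * deltaOut;
            }
        }

        // The two hidden units end up with identical weights, and the network cannot separate XOR.
        System.out.println("hidden unit 0: " + Arrays.toString(w1[0]));
        System.out.println("hidden unit 1: " + Arrays.toString(w1[1]));
        for (int p = 0; p < 4; p++) {
            double h0  = sigmoid(w1[0][0] * inputs[p][0] + w1[0][1] * inputs[p][1] + w1[0][2]);
            double h1  = sigmoid(w1[1][0] * inputs[p][0] + w1[1][1] * inputs[p][1] + w1[1][2]);
            double out = sigmoid(w2[0] * h0 + w2[1] * h1 + w2[2]);
            System.out.printf("%.0f XOR %.0f -> %.3f (target %.0f)%n",
                    inputs[p][0], inputs[p][1], out, targets[p]);
        }
    }
}
```

Replacing the two 0.5 initializations with small random values (for example uniform in [-1/sqrt(N), +1/sqrt(N)], per the earlier posts) breaks the symmetry and usually lets the same code learn XOR, though, as noted above, it can still occasionally get stuck.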
