Strange back propagation behaviour (picture inside)


Hi!

The picture below shows the output from my program while training a neural network on the XOR function. EDIT: what it shows is the error rate (sqrt((output - expected)^2)) for the 4 different inputs of the XOR function. The strange thing is that it only "learns" one of the outputs... all the others don't adapt. I know this is a shot in the dark, but does anybody have any idea about what might be going on? I have checked and double-checked my code and I don't find anything wrong... Any other suggestions about what I can try to test my ANN on?

Thank you...

--Spencer

"Relax, this dragon is sleeping..."

[edited by - spencer on November 11, 2003 8:47:28 AM]
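
In pseudo-code, the error measure I mean is roughly this (net_output stands in for the actual forward pass, which returns a single value in [0, 1]):

xor_patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def pattern_errors(net_output):
    # net_output(inputs) is a placeholder for the forward pass; the error
    # per pattern is the squared difference from the XOR target.
    return [(net_output(inputs) - target) ** 2
            for inputs, target in xor_patterns]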

Guest Anonymous Poster
You should try to find a Java applet that performs BP and compare your weights against the applet's. I remember that on Generation5 there was a worked example of one step of the BP algorithm, with the weights and outputs shown - could be worth a look.

quote:
Original post by Spencer
I am using one hidden layer with 2 neurons and of course one output neuron and 2 inputs...



Are you sure that the XOR problem can be solved with this neural network configuration?

-Predictor

Your network topology is correct for this problem... at least, that is to say, the XOR problem CAN be solved with 2 input, 2 hidden and 1 output node. I know this is a silly question (but you'd be surprised how often people mess this up), but have you made sure that each node is connected to each other node in the layer above it?
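
To make that concrete, a fully connected 2-2-1 forward pass looks roughly like this (logistic activation and explicit bias terms are my assumptions; adapt the names to your own code):

import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    # Fully connected 2-2-1: each input feeds BOTH hidden nodes,
    # and both hidden nodes feed the single output node.
    h = [logistic(w_hidden[j][0] * x[0] + w_hidden[j][1] * x[1] + b_hidden[j])
         for j in range(2)]
    return logistic(w_out[0] * h[0] + w_out[1] * h[1] + b_out)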

Backpropagation is essentially a gradient ascent/descent method (hill climbing) in the parameter space. Your results indicate that in only one of your cases is your implementation able to perform this optimisation. Can you plot the parameter values vs iteration please, so we can see what is happening there? My first thought is that your parameter values are unstable and not converging. Could you post the error formula you used to compute the weight updates?
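
For comparison, the textbook update for a sigmoid output unit trained on squared error is the standard delta rule, roughly this (h is the vector of hidden activations feeding the output unit, eta the learning rate; your own formula may of course differ):

def output_weight_deltas(out, target, h, eta):
    # Delta rule for a sigmoid output unit with squared error:
    # delta = (target - out) * out * (1 - out) is the error signal,
    # and each weight moves by eta * delta * (its input).
    delta = (target - out) * out * (1.0 - out)
    return [eta * delta * h_j for h_j in h], eta * delta  # weight deltas, bias delta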

My second thought is that you might simply be stuck in a region of the parameter space that has a very flat gradient, making it hard to get out of that region using back-prop. How did you choose your initial estimates for the weights? Randomly? Assigned? Try different initial conditions and see what happens.

Other than that, try using the sigmoid function as your activation and see what happens.

Good luck,

Timkin

Thank you all so much for your replies...
It seems like I found the answer....
I think the reason was that I initialized the weights to values that were too small. I used values between -0.1 and 0.1. When I increased the range to -1.0 to 1.0, the result was much better. It still gets stuck in suboptimal minima sometimes, though. Maybe a momentum term can help here?
EDIT: Oh, and isn't 100,000 iterations a little much for such a simple problem as XOR?
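
As far as I understand it, a momentum term would be applied roughly like this (alpha = 0.9 is just a commonly used value, not something from my code):

def update_with_momentum(w, step, prev_delta, alpha=0.9):
    # 'step' is the plain gradient-descent delta for this weight
    # (e.g. eta * delta * input); the momentum term re-adds a fraction
    # of the previous update, which helps roll through flat regions
    # and shallow local minima.
    new_delta = step + alpha * prev_delta
    return w + new_delta, new_delta  # updated weight, delta to remember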


So, what is the best range to initialize the weights to?

and again thank you all!


--Spencer

"Relax, this dragon is sleeping..."

[edited by - spencer on November 12, 2003 3:13:59 AM]

tanh is often even better than the logistic activation ...
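
If you switch, note that tanh outputs lie in (-1, 1) rather than (0, 1), so the 0/1 XOR targets usually get remapped accordingly; the function and the derivative you need for backprop are roughly:

import math

def tanh_act(x):
    return math.tanh(x)            # output in (-1, 1) instead of (0, 1)

def tanh_deriv(tanh_x):
    return 1.0 - tanh_x * tanh_x   # derivative, given the already-computed activation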

I'm often initializing the values from -0.3 to 0.3, but that shouldn't be the reason. Be careful, though, not to have the same weight value twice in the net - that can cause symmetries which prevent the net from training correctly, although with any reasonable random function that shouldn't happen in an example this small.
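
Something like this is what I mean (the range is just what I tend to use, not a magic number):

import random

def init_weights(n, lo=-0.3, hi=0.3):
    # Small random values in a symmetric range; a continuous random
    # draw makes identical weights (and the resulting symmetry between
    # hidden nodes) practically impossible.
    return [random.uniform(lo, hi) for _ in range(n)]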

Have you already tried adjusting the learning rate, maybe even while training the net?
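
One simple way to do that, for example, is to decay the rate over epochs (the constants here are only examples):

def learning_rate(epoch, eta0=0.5, decay=0.001):
    # Start relatively large and shrink the learning rate gradually,
    # so early epochs take big steps and later epochs settle down.
    return eta0 / (1.0 + decay * epoch)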

quote:
Original post by Timkin
Your network topology is correct for this problem... at least, that is to say, the XOR problem CAN be solved with 2 input, 2 hidden and 1 output node.



You are, of course, correct. I was thinking of the checkerboard problem.

thanks,
Predictor


