Back propagation equation problem

2 comments, last by cypherx 18 years, 3 months ago
My first neural networks came along just fine, but when I tried to modify James Matthews' source to handle multilayer networks, I also had to modify the training routines to make use of the back propagation system. One layer with one neuron was no problem. I used the system described on this page. But further down on that page they state that if you are to use that system on a multilayer network you have to consider the effects of your changes, since changing a weight somewhere changes the inputs to the next layers. The last equation on that page is quite a big one, and that's the one I need to understand and implement. :P I don't know enough math to crack it; the first (simple) one went fine after some reading up on the maths. Would anyone care to explain the last one for me? As the author says, "Again, for sake of simplicity I am skipping over the mathematical derivation of the delta rule. It has been proven that for neuron q in hidden layer p, delta is:" and then writes the equation, skipping the explanation. :P Since I don't know enough math, those letters and brackets are hard to understand and I could do with some extra explanation. Anyone else in the same situation? Or who has been and would like to share some of the experience?
If you are struggling with the sigma (the E-shaped capital letter, Σ): it denotes a sum. So Σ x_i means summing all the x_i's. To be mathematically correct, it should also list the start (usually i = 1) and the end (usually i = n) of the sum.

An example:
x = { 1, 2, 3, 4, 5 }
Σ x_i = 1 + 2 + 3 + 4 + 5 = 15
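In code terms (just a throwaway Python illustration of mine, not something from the article), that sigma is nothing more than a loop, or a call to sum():

x = [1, 2, 3, 4, 5]
total = sum(x)   # same as "sum over i of x_i" = 1 + 2 + 3 + 4 + 5
print(total)     # 15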


Hope that helps a little. I don't know exactly about the other parameters; x_p seems to be an input neuron that corresponds to the hidden layer and w might be a weight (or bias).

Illco
Verbally, the formula reads:

(lowercase) delta (sub P) of (Q) is equal to X (sub P) of (Q), times the quantity (one minus X (sub P) of (Q)), times the sum over i of the elements W (sub P + 1) of (Q and i) times (lowercase) delta (sub P + 1) of (i).

If it's still not clear:

The function delta-p(Q) has an output value of the sum of W-(p+1) (which is a function of Q and i) times delta-(p+1)(i), times X-p, times (1 - X-p). If I'm reading it correctly (and obviously I hope I am), it means that somewhere in space you have a two dimensional array called "W" (the weights of the neural net, I'm guessing), with a field for the neuron on this layer (Q) and the neuron it connects to in the next layer (i). For any given value of Q, the function delta-p(Q) should give you the sum of all of the weights leading out of Q, each multiplied by the delta of the neuron it leads to, with the whole thing multiplied by (X-p * (1 - X-p)).

But then of course, I'm no artificial intelligence specialist, so someone might want to correct the above. Mathematically, however, I think I'm reading it right, so I hope that helps you out.

edit: heh, LOWERCASE delta, not capital. Geez, college was worth nothing.
I'll give explaining it a crack.

First, what do we need the error delta for?

We're trying to tweak the weights of the network to minimize the output error.
So if the output is too high, then a neuron with high output should have its weight decreased. Conversely, if the output is too low, a neuron with high output should have its weight boosted.

change in weight for neuron n = n's previous output * n's contribution to the error

That last term, n's contribution to the error, is its error delta (or, mathematically stated, the partial derivative of the error with respect to the neuron's output).
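As a rough sketch of that rule (the variable names here are mine, and the learning-rate factor is my addition, not something stated above), the change for one weight feeding into neuron n would look something like:

def weight_change(learning_rate, input_to_weight, delta_n):
    # (output of the neuron feeding this weight) * (n's contribution to the error),
    # scaled by a learning rate so each step stays small
    return learning_rate * input_to_weight * delta_n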

Ok, so we get to the formula for the error delta:

delta(p, q) = x(p, q) * (1 - x(p, q)) * sum{for each neuron i in layer p+1} of w(p+1, q, i) * delta(p+1, i)

where:
p is the layer
q is the neuron within that layer
x(p, q) is the output of neuron q in layer p
w(p+1, q, i) is the weight between neuron q in layer p and neuron i in layer p+1

Each neuron's contribution to the error is how much it affects the neurons in the next layer (the weights) times how much those next-layer neurons contribute to the error (the error deltas for neurons in layer p+1).
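If code reads easier than the formula, here is a minimal Python sketch of that delta computation. The names and the list-of-lists layout are mine (not from James Matthews' source); it assumes the outputs, weights, and next-layer deltas are already stored:

def hidden_delta(x, w, delta, p, q):
    # x[p][q]       -- output of neuron q in layer p (after the sigmoid)
    # w[p+1][q][i]  -- weight from neuron q in layer p to neuron i in layer p+1
    # delta[p+1][i] -- error delta already computed for neuron i in layer p+1
    downstream = sum(w[p + 1][q][i] * delta[p + 1][i]
                     for i in range(len(delta[p + 1])))
    # x * (1 - x) is the derivative of the sigmoid at this neuron's output
    return x[p][q] * (1 - x[p][q]) * downstream

You compute the output-layer deltas first, then work backwards one layer at a time with this, which is where the "back" in back propagation comes from.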

Does that make sense?

-Alex




