BackPropagation Help
ThousandLights

hi guys, my first post here, excited :)
I've been trying to learn the backpropagation (BP) algorithm and wrote the simplest code I could, just to be sure I understand the basics, but somehow the net output is always about the same, and I can't figure out why.
If anyone can take a look I would be very grateful (should be easy for someone who's familiar with BP).

As input I entered two numbers between 0 and 1, a and b.
The output should be the difference between them, output = (a - b), which might be negative.
I used a simple sigmoid function.
It was written in VB; I pasted just the heart of the code, procedural style, just for understanding.
One technical point: I scaled the output to the range -1 to 1. Should the output neuron receive an unscaled value of the error, like I did here,
or how should it be?

So, can someone explain what is wrong?

Thank you!




e = 2.718281828
alpha = 0.1

'input = two numbers between 0 and 1
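' network topology (as implied by the code below): units 1,2 = input layer,
' units 3,4 = hidden layer, unit 5 = output; Wxy is the weight from unit x to unit y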


a = Cells(2, 1)
b = Cells(2, 2)



truth = a - b 'correct answer


Sum1 = a + b
Sum2 = a + b



f1 = 1 / (1 + e ^ (-Sum1))
f2 = 1 / (1 + e ^ (-Sum2))

Sum3 = f1 + W13 + f2 * W23
Sum4 = f1 * W14 + f2 * W24

f3 = 1 / (1 + e ^ (-Sum3))
f4 = 1 / (1 + e ^ (-Sum4))

Sum5 = f3 * W35 + f4 * W45
f5 = 1 / (1 + e ^ (-Sum5))



answer = -1 + f5 * 2 ' need to spread over the range -1 to 1



'backPropagate
err5 = (truth - answer + 1) / 2
err3 = err5 * W35
err4 = err5 * W45
err1 = err3 * W13 + err4 * W14
err2 = err3 * W23 + err4 * W24

'update weights


W13 = W13 + alpha * (f3 * (1 - f3)) * (f1 * W13) * err3
W23 = W23 + alpha * (f3 * (1 - f3)) * (f2 * W23) * err3

W14 = W14 + alpha * (f4 * (1 - f4)) * (f1 * W14) * err4
W24 = W24 + alpha * (f4 * (1 - f4)) * (f2 * W24) * err4


W35 = W35 + alpha * (f5 * (1 - f5)) * (f3 * W35) * err5
W45 = W45 + alpha * (f5 * (1 - f5)) * (f4 * W45) * err5

Hello

I haven't looked at your code in detail, but you must keep the output in the range (0, 1), since you use the sigmoid function.
The delta rule is then (with the sigmoid function): delta = alpha * output * (1 - output) * error * input

If you're in a hidden layer, then 'error' is the weighted sum of the errors of the next layer (otherwise it's the difference between the wanted output and the output you actually get).
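For instance, a minimal VBA sketch of this rule for one output-layer weight, reusing the names from the post above (W35, f3, f5, alpha) and taking 'error' as the plain difference truth - answer:

' delta rule for W35, the weight from hidden unit 3 to output unit 5
err5 = truth - answer                          ' output-layer error
W35 = W35 + alpha * f5 * (1 - f5) * err5 * f3  ' the input to this weight is f3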

Hope that helps :rolleyes:

EDIT: I corrected the delta rule.

Hi adaline,
Thanks, but I believe that's not it. First: how do I create an output of boolean values? Wide-range values, etc.? My output neuron has a 0-1 sigmoid function, but it's translated into -1 to 1 terms.
And even if it must be just 0-1 without scaling, I tried it in different ways and it didn't improve.
About the delta, I believe it's alpha * (1 - output) * output * (error) * input_i, because you mentioned the formula without the input, and I saw this formula ( http://www-speech.sri.com/people/anand/771/html/img344.gif ), where x_ij is a certain input.
Thus I still don't know what the problem is, and I don't really know if this net structure is supposed to be enough for this task, or maybe it's impossible to solve this way?
I would be glad if someone has simple written BP code which I can learn from as an example.

Excuse me, you're absolutely right: delta = output * (1 - output) * error * input (sorry, I made a mistake: I forgot the input) :blink:

But you should keep the output in the range (0, 1) so that you can apply the delta rule.
Only afterwards can you transform the output however you want.

First of all, compute the errors of the output layer.
Then backpropagate them (the hidden layers 'receive' the weighted sum of the errors committed by the next layer).

You want your net to learn to compute the difference between a and b; a sketch of one training run follows the steps below:

1) Compute a and b
2) Present a and b to the input layer
3) Compute the output
4) The error committed by the net is: (a - b) - output
5) Compute the output-layer error
6) Propagate it to the previous layers
7) Adapt the weights for all layers

You use the sigmoid function, so you have to encode your values accordingly.
For example, output < 0.5 means negative; positive otherwise.
The sigmoid function will not give you the difference between a and b directly, but you can train your net to detect whether the result is positive or negative.
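As a minimal sketch of that training run in VBA: the steps above have to be repeated many times with fresh examples, not applied once. TrainStep is a hypothetical routine holding one forward and backward pass (steps 2 to 7), like the body of the macro posted earlier:

Sub TrainLoop()
    Dim i As Long, a As Double, b As Double
    Randomize                    ' seed VBA's random number generator
    For i = 1 To 10000           ' many iterations, not just one
        a = Rnd: b = Rnd         ' step 1: a random example in [0, 1)
        TrainStep a, b           ' steps 2-7: forward pass, error, weight update
    Next i
End Sub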

Is this helping?

EDIT: I see that alpha = 0.1; try alpha = 0.9 instead (in this case the minimum is global, but a small alpha value increases the risk of getting stuck in a local minimum).

I transformed the output into those terms: values closest to 1 mean a big positive difference, values closest to 0 mean a negative gap, and 0.5 means close to equal. The thing is that the net gives about the same answer for all cases, so I guess it's because of the expected values of a random distribution... just a guess.
The fact is that it doesn't learn.
Can it be that this kind of problem is unsolvable with an ANN, or that the net structure doesn't fit the problem?
How can I know whether it's solvable or not, and what structure to choose?
Do you know a problem which is solvable for sure, and the net structure that I should apply?
Do you know any other problem which is easier to solve?

ThousandLights wrote:
> Can it be that this kind of problem is unsolvable with an ANN, or that the net structure doesn't fit the problem?

Your net can actually learn to detect whether the difference is negative or positive.
Maybe you can try step by step: try first with a single neuron with 2 inputs; it must work even with a single unit.
Could you post your complete code? I'll have a look at the details.
:)

If the output values are always the same, it's probably being run instead of trained.
Verify that the initial weights are properly random, and that the output of the first iteration in training mode reflects that randomness (see the sketch below).
If your network weights converge to a consistent result in a single iteration, then something is wrong with your transfer function (nothing to do with the sigmoid).
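A minimal VBA sketch of random initialization, reusing the weight names from the posted code (Rnd returns a value in [0, 1), so 2 * Rnd - 1 lands in [-1, 1)):

Randomize            ' seed the generator so each run differs
W13 = 2 * Rnd - 1    ' every weight starts at a random value in [-1, 1)
W14 = 2 * Rnd - 1
W23 = 2 * Rnd - 1
W24 = 2 * Rnd - 1
W35 = 2 * Rnd - 1
W45 = 2 * Rnd - 1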

Hmm... it's combined with the Excel datasheet, nothing special to add except referencing Excel cells, but I'll post it anyway when I'm at my computer. I believe the ANN should answer more than just positive/negative, for that is only a binary question.

How are your initial weights computed? (A random value between -1 and 1 can be a good choice.)

How do you compute the error?

If (a - b) < 0, then the wanted output is just 0.
If (a - b) > 0, then the wanted output is 1.

Did you add a bias to your units?
You have to add another weight to each unit that represents the bias; do you do that? (A sketch follows.)

potential = weight1 * a + weight2 * b + weight3 * 1 (or -1, it doesn't matter)
output = sigmoid(potential)

Is that what you do?
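A minimal VBA sketch of such a unit with a bias weight; the function name SigmoidUnit is illustrative, and VBA's built-in Exp replaces the e ^ (-x) constant from the posted code:

Function SigmoidUnit(a As Double, b As Double, _
                     w1 As Double, w2 As Double, wBias As Double) As Double
    Dim potential As Double
    potential = w1 * a + w2 * b + wBias * 1   ' the bias acts as a constant input of 1
    SigmoidUnit = 1 / (1 + Exp(-potential))   ' logistic activation, output in (0, 1)
End Function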

That's the code:

Sub NN()

'const

e = 2.718281828
alpha = 0.25

'get Data


a = Cells(2, 1)
b = Cells(2, 2)


truth = a - b


W13 = Cells(4, 1)
W14 = Cells(4, 2)
W23 = Cells(4, 3)
W24 = Cells(4, 4)
W35 = Cells(6, 2)
W45 = Cells(6, 3)


'answer
Sum1 = a + b
Sum2 = a + b



f1 = 1 / (1 + e ^ (-Sum1))
f2 = 1 / (1 + e ^ (-Sum2))

Sum3 = f1 + W13 + f2 * W23
Sum4 = f1 * W14 + f2 * W24

f3 = 1 / (1 + e ^ (-Sum3))
f4 = 1 / (1 + e ^ (-Sum4))

Sum5 = f3 * W35 + f4 * W45
f5 = 1 / (1 + e ^ (-Sum5))
answer = -1 + f5 * 2

Cells(2, 4) = answer


'backPropagate
err5 = (truth - answer + 1) / 2
err3 = err5 * W35
err4 = err5 * W45
err1 = err3 * W13 + err4 * W14
err2 = err3 * W23 + err4 * W24

Cells(2, 5) = err5


'update


W13 = W13 + alpha * (f3 * (1 - f3)) * (f1 * W13) * err3
W23 = W23 + alpha * (f3 * (1 - f3)) * (f2 * W23) * err3

W14 = W14 + alpha * (f4 * (1 - f4)) * (f1 * W14) * err4
W24 = W24 + alpha * (f4 * (1 - f4)) * (f2 * W24) * err4


W35 = W35 + alpha * (f5 * (1 - f5)) * (f3 * W35) * err5
W45 = W45 + alpha * (f5 * (1 - f5)) * (f4 * W45) * err5


'show weights

Cells(4, 1) = W13
Cells(4, 2) = W14
Cells(4, 3) = W23
Cells(4, 4) = W24
Cells(6, 2) = W35
Cells(6, 3) = W45
End Sub



Actually, I didn't add any bias, since I didn't see it in some of the written algorithms. So how is it supposed to be?
Isn't the bias an error that we don't know, which is expected to be zero? What should I do with it?

Hello :)

I quoted your code; my changes are marked with a ' CHANGED comment:

Sub NN()

'const

e = 2.718281828
alpha = 0.25

'get Data


a = Cells(2, 1)
b = Cells(2, 2)

If (a - b) < 0 Then truth = 0 Else truth = 1   ' CHANGED: binary target instead of a - b

W13 = Cells(4, 1)
W14 = Cells(4, 2)
W23 = Cells(4, 3)
W24 = Cells(4, 4)
W35 = Cells(6, 2)
W45 = Cells(6, 3)


f1 = 1 / (1 + e ^ (-a))    ' CHANGED: unit 1 receives a directly
f2 = 1 / (1 + e ^ (-b))    ' CHANGED: unit 2 receives b directly

Sum3 = f1 * W13 + f2 * W23 ' CHANGED: * instead of +
Sum4 = f1 * W14 + f2 * W24

f3 = 1 / (1 + e ^ (-Sum3))
f4 = 1 / (1 + e ^ (-Sum4))

Sum5 = f3 * W35 + f4 * W45
f5 = 1 / (1 + e ^ (-Sum5))
answer = f5                ' CHANGED: keep the output in (0, 1), no rescaling

Cells(2, 4) = answer


'backPropagate
err5 = truth - answer      ' CHANGED: plain difference, no rescaling
err3 = err5 * W35
err4 = err5 * W45
err1 = err3 * W13 + err4 * W14
err2 = err3 * W23 + err4 * W24

Cells(2, 5) = err5


'update


W13 = W13 + alpha * (f3 * (1 - f3)) * f1 * err3   ' CHANGED: input f1, not f1 * W13
W23 = W23 + alpha * (f3 * (1 - f3)) * f2 * err3   ' CHANGED: input f2

W14 = W14 + alpha * (f4 * (1 - f4)) * f1 * err4   ' CHANGED: input f1
W24 = W24 + alpha * (f4 * (1 - f4)) * f2 * err4   ' CHANGED: input f2


W35 = W35 + alpha * (f5 * (1 - f5)) * f3 * err5   ' CHANGED: input f3
W45 = W45 + alpha * (f5 * (1 - f5)) * f4 * err5   ' CHANGED: input f4


'show weights

Cells(4, 1) = W13
Cells(4, 2) = W14
Cells(4, 3) = W23
Cells(4, 4) = W24
Cells(6, 2) = W35
Cells(6, 3) = W45
End Sub


In this case you don't need a bias, but in the general case you must add one. It can be seen as an added constant input (it translates the activation function along the x axis).
To keep things simpler, maybe you can use input-layer values that don't have to be passed through the sigmoid (in the range -1 to 1, for example).

Let me know if it works better now :rolleyes:

EDIT:
If you doubt the capability of this net to learn this, even a single unit with 2 inputs (and no bias...) can do it: say the weight that carries a is W (> 0); then the other weight is -W (whatever abs(W) is).
When you're able to make this work, you can use bigger nets that approximate (a - b) with whatever accuracy you want (another option is to use a linear activation function in the output layer; it would fit this problem better and could effectively compute a - b). A sketch of such a unit follows.
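A minimal VBA sketch of that idea, assuming a single linear output unit trained with the delta rule (all names here are illustrative). Because the activation is linear, its derivative is 1, and the weights can converge towards 1 and -1, so the unit computes a - b exactly:

Sub TrainLinearUnit()
    Dim wa As Double, wb As Double, a As Double, b As Double
    Dim output As Double, errVal As Double, i As Long
    Const alpha As Double = 0.1
    Randomize
    wa = 2 * Rnd - 1: wb = 2 * Rnd - 1   ' random start in [-1, 1)
    For i = 1 To 10000
        a = Rnd: b = Rnd                 ' training pair in [0, 1)
        output = wa * a + wb * b         ' linear activation, no sigmoid
        errVal = (a - b) - output        ' wanted minus actual
        wa = wa + alpha * errVal * a     ' delta rule with linear derivative 1
        wb = wb + alpha * errVal * b
    Next i
    Debug.Print "wa ="; wa; " wb ="; wb  ' should approach 1 and -1
End Sub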

Maybe you could study the Adaline network and the single-layer perceptron. Before putting units into complex networks, experiment with those units as far as you can: try changing the activation function, solve different problems with your net, and try to figure out why multi-layer networks exist (what are the limits of a single-layer net?).

A few years ago, I made a system that can learn to play the snake game (with limitations). It's in VB.NET; let me know if you're interested and I'll send it to you.

EDIT: when you come back from vacation, I'll post pseudocode that teaches a single unit to converge towards (a - b)...
BTW, enjoy your vacation! :cool:

