
#### Archived

This topic is now archived and is closed to further replies.

# neural network input/outputs

## Recommended Posts

Hi all, I've implemented a standard back-propagation neural network, and it works fine when the inputs and desired outputs are in the range 0.0 to 1.0. My activation function is the standard sigmoid, 1/(1+e^-x). However, if I change the activation function to 2/(1+e^-x) and put the inputs and desired patterns in the range 0.0 to 2.0, the network doesn't work at all. I've changed the derivative of the activation function (used for training) from AF*(1-AF) to 2*AF*(1-AF) to account for the changed activation function. Any ideas why this isn't working? My eventual goal is to have the network accept inputs and desired outputs in the range -20.0 to 20.0, using the activation function (2/(1+e^-x))-1.0. Is there anything preventing a network from using both negative and positive numbers? Any help or ideas would be greatly appreciated. hdaly.

##### Share on other sites
I know nothing practical about neural networks, but why not rescale your inputs before feeding them into the network? Then you can supply them in any scale you wish.

##### Share on other sites
Cheers for that.
If in the end I have to rescale I will, but I just want to see if it will work without rescaling first. I have a feeling (not based on experience or anything!) that by rescaling from -20.0/20.0 to 0.0/2.0, some significance of the difference between negative and positive numbers might be lost.

hdaly.

##### Share on other sites
quote:
Original post by hdaly
Cheers for that.
If in the end I have to rescale I will, but I just want to see if it will work without rescaling first. I have a feeling (not based on experience or anything!) that by rescaling from -20.0/20.0 to 0.0/2.0, some significance of the difference between negative and positive numbers might be lost.

In most cases, inputs don't "need" to be scaled, although it is often advantageous in practice.

Scaling the output variable is the usual solution to this problem. Can you think of an example where information would be lost and how?

-Predictor
http://will.dwinnell.com

[edited by - Predictor on March 22, 2004 10:11:42 AM]

##### Share on other sites
To be honest, no, I can't give an example of information being lost; it's just a gut feeling really.

##### Share on other sites
Using input that includes negatives and positives has been shown to decrease the training time involved (Haykin - Neural Networks). However, I don't think that going all the way from -20.0 to 20.0 is really beneficial (this is just my opinion, though). If I were you, I would scale my inputs between -1.0 and 1.0 like...
scaledvector[i] = 2*(inputvector[i]-min)/(max-min)-1;
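That rescaling step can be sketched as follows (a minimal illustration; the function and variable names here are mine, not from the thread):

```python
def scale_to_unit_range(raw, lo, hi):
    """Map each value from [lo, hi] linearly into [-1.0, 1.0]."""
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in raw]

# Inputs in the original poster's range of -20.0 to 20.0:
print(scale_to_unit_range([-20.0, 0.0, 20.0], -20.0, 20.0))
# -> [-1.0, 0.0, 1.0]
```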

Then change your activation function to a hyperbolic tangent (also shown to work well with negative/positive input). The general form for the activation function goes something like...

a*tanh(b*neuron_sum)

Suggested values for a,b are 1.7159 and 2.0/3 respectively.

But, as you noticed, you will have to change the gradient function to accommodate the new derivative of this activation function.

I'll save you some time... I think it should be:

(b/a)*(a-neuron_output)*(a+neuron_output)

instead of neuron_output*(1.0-neuron_output), as in a normal sigmoid activation function network.

Give that a go and see if you have any more success. I used it on a digital Fourier transform whose values were scaled between -1.0 and 1.0, and I had a good deal of success with it. (Although, sadly, results from one person's experiment are not always transferable to another person's, since the input data always differ in how well they can be separated.)
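The derivative formula above can be checked numerically against a finite difference; a quick sketch using the suggested constants (the function names are mine):

```python
import math

A, B = 1.7159, 2.0 / 3.0  # suggested values for a and b

def activation(x):
    # a * tanh(b * x)
    return A * math.tanh(B * x)

def derivative_from_output(y):
    # (b/a) * (a - neuron_output) * (a + neuron_output), as given above
    return (B / A) * (A - y) * (A + y)

# Compare against a central-difference estimate at a sample point
x, h = 0.7, 1e-6
numeric = (activation(x + h) - activation(x - h)) / (2 * h)
analytic = derivative_from_output(activation(x))
print(abs(numeric - analytic) < 1e-8)  # -> True
```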

Good luck!

Ryan

##### Share on other sites
Well, if you want the outputs of your sigmoid f(x) between 0 and 2, you should have

f(x) = 2 / (1 + exp(-x))

and

f'(x) = (1/2) * f(x) * (2 - f(x))

That should work.

Let's make a general rule here:

To have the sigmoid function f(x) between min and max, let:

a = max - min
b = -min

f(x) = ( a / (1 + exp(-x)) ) - b
f'(x) = (1/a) * (b + f(x)) * (a - b - f(x))

Try this out and tell me how it goes!

##### Share on other sites
quote:
Original post by trub
Using input that includes negatives and positives has been shown to decrease the training time involved (Haykin - Neural Networks).

It's probably worth qualifying the above: while true for the most common MLP implementations, it will not be true for all. MLPs trained by global (or hybrid) optimization, for instance, will not be affected by such scaling one way or the other.

-Predictor
http://will.dwinnell.com