
Actual methods to train a backprop network?

Recommended Posts

ScopeDynamo    158
Hi, I've got my network code running and it works. I can specify targets for my output neurons and it will always train so that the input produces the desired output (within a very low level of error, usually; we're talking a 0.0xxx difference here). But now I'm at a loss, because I'm not sure what the practical methods of training a network for useful real-world applications actually are. Specifically, I'm interested in game-related usage, as I wrote it for a 3D game I'm doing. Are there any sites or tutorials that cover this more design-centric part of ANNs? Most tutorials I see just cover the ANN implementation, or offer a simple OCR-type training example at best.

For example, I tried to create an ANN that took two input values and produced a one-dimensional vector between the two. I.e., input 0 and input 5 would equal 1; input 0 and input -5 would equal -1; input 0 and input 0 would equal 0. But no method of training the network I can devise works. I tried simply training it to understand 0,5 = 1, in the hope it would somehow invert any data input. But this doesn't work. I tried training on hundreds of random vectors, with the target automatically set by the training process. Didn't work either.

So just how exactly do you train a net? Is there a series of styles you can use depending on the situation? Is backprop a good enough method for a broad range of generic uses? HELP. :)

ScopeDynamo    158
Here's the sample code I used to try to make the vector 'finder'.

How would you approach it using the functions you see here? It would really help me to understand this.


#include <iostream>
#include <tchar.h>

// NeuralNet, NInput and NOutput are from my own network code.

using namespace std;

// Maps a value in [0, 1] to [-1, 1].
double toang(double val) {
    return -1.0 + val * 2.0;
}

const double PI = 3.1415926;

int _tmain(int argc, _TCHAR* argv[])
{
    cout << "A test of all tests" << endl;

    NeuralNet *vpu = new NeuralNet;
    NInput *ax = vpu->addInput();
    NInput *ay = vpu->addInput();
    NOutput *xvec = vpu->addOutput();

    vpu->addHiddenLayer(8);
    vpu->connectNetwork();

    int ind = 0;
    int loc = 2200;        // print counter (starts high so the first cycle reports)

    ax->value(0);          // input 1
    ay->value(-5);         // input 2
    xvec->target(-1);      // output 1's desired value; backprop error-based learning

    while (ind < 950) {
        double err = vpu->cycle();    // err = average error of the last cycle
        if (err < 0.05) break;
        ind++;
        loc++;
        if (loc > 50) {               // report progress every 50 cycles
            cout << (float)xvec->value() << endl;
            cout << "Average Error:" << err << endl;
            loc = 0;
        }
    }

    cout << "Done!" << endl;
    //vpu->learn(false);

    ax->value(0);
    ay->value(-5);
    xvec->target(0);
    double err = vpu->cycle();
    cout << "XOut:" << xvec->value() << endl;   // does not work!
    cout << "New Error:" << err << endl;

    while (true) {
        // keep the console window open
    }

    return 0;
}


Share this post


Link to post
Share on other sites
pinacolada    834
Ah I see what you're doing wrong. When you are training the network on several cases at once, you need to train each case -simultaneously-. As in, you train it on (0,5)->1 for one cycle. Then you train it on (0,-5)->-1 for one cycle. Then (0,0)->0 for one cycle. Then you go back to the first case, and repeat hundreds of times.

If you train it on (0,5)->1 a hundred times, and then train it on (0,-5)->-1 a hundred times, the training you do on the second case will ruin the training you did on the first case. I think the technical term for this is "catastrophic interference". It's a fundamental weakness of basic neural networks: any training you do on a network will almost always destroy the training that was already there.
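A minimal sketch of that interleaving, reusing the NeuralNet / NInput / NOutput interface from your snippet above (assuming cycle() runs one forward/backward pass against whatever targets are currently set):

// The three cases are each presented once per pass, then the whole pass repeats.
struct Sample { double in1, in2, out; };

Sample cases[] = {
    { 0.0,  5.0,  1.0 },
    { 0.0, -5.0, -1.0 },
    { 0.0,  0.0,  0.0 }
};

for (int pass = 0; pass < 5000; ++pass) {
    double passErr = 0.0;
    for (int c = 0; c < 3; ++c) {        // one cycle per case, every pass
        ax->value(cases[c].in1);
        ay->value(cases[c].in2);
        xvec->target(cases[c].out);
        passErr += vpu->cycle();          // one backprop step for this case
    }
    if (passErr / 3.0 < 0.05) break;      // stop once the average error is low
}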

ScopeDynamo    158
Hi,

Thanks for the information.
Based on what you said, I set up a supervisor class that allows me to specify samples to cycle through.

Using this, the vector finder worked. I used several hardcoded samples and generated about 50 random ones. It pretty much works with 95% accuracy.

But using the same technique, I tried to code one which, given a 2D vector, returns the angle of the vector.
I tried both random and sequential sample sets of varying lengths, different hidden-layer setups, and even dynamic learning-rate adjustment based on average error, but it never produces a network that is even 80% accurate, or even stable across the whole angle range (the angle being a single output multiplied by 360).

Here's the training code:


Supervisor *train = new Supervisor;
DataSet *xdat;
DataSet *ydat;
DataSet *odat;

xdat = train->addInput(ax);
ydat = train->addInput(ay);
odat = train->addOutput(xvec);
train->setNetwork(vpu);

// Build the training set: one sample every 2 degrees around the circle.
int si = 0;
double xa, ya, aang;
aang = 0;
while (si < 180) {
    aang = aang + 2;
    if (aang < 0)   aang = aang + 360;
    if (aang > 360) aang -= 360;
    xa = cos((aang - 180) * PI / 180);
    ya = sin((aang - 180) * PI / 180);
    xdat->addInputRecord(xa);
    ydat->addInputRecord(ya);
    odat->addTargetRecord(aang / 360.0);   // target is the angle scaled to [0, 1]
    // cout << "Training Angle:" << aang;
    // cout << " Target:" << aang / 360.0 << endl;
    // cout << "X:" << xa << " Y:" << ya << endl;
    si++;
}

// Train for 5000 passes over the sample set, reporting the error periodically.
double lerr = 0;
int nop = 0;
int oind = 0;
while (oind < 5000) {
    lerr = train->train(true);
    oind++;
    if (nop > 45) {
        nop = 0;
        cout << "Error>" << lerr << endl;
        // cout << "Angle is:" << xvec->value() * 360 << endl;
    }
    nop++;
}

// exit(1);   // debugging leftover; commented out so the evaluation loop below runs

// Evaluate the trained network over the whole angle range.
int adex = 0;
vpu->learn(false);
while (adex < 360) {
    aang = adex;
    aang -= 180;
    xa = cos(aang * PI / 180.0);
    ya = sin(aang * PI / 180.0);
    ax->value(xa);
    ay->value(ya);
    vpu->cycle();
    cout << "Angle is:" << xvec->value() * 360 << endl;
    adex++;
}


I'm wondering if you notice any flaws in my technique this time around, or whether the problem simply can't be solved using this kind of network. Any suggestions for where to go next? (Too many questions? :) )

Marmakoide    132
A very simple way to train a neural network, whatever the kind of network, is... genetic algorithms.
1/ You fix the structure of your neural network by hand.
2/ You encode each weight as, say, a 10-bit number, so the real weight = (encoded weight + 1) / 1024.
3/ You encode the parameters of each neuron the same way.
4/ You evolve this bunch of bits with a genetic algorithm. Good, easy frameworks already exist for this, so don't program your own package. My favourite is Open Beagle in C++; ECJ in Java.

It can be slow (each set of weights must be tested on each example, or on some randomly selected examples), but it is very robust, unlike backprop.
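If you want to see the mechanics without a framework, here is a bare-bones sketch of the idea (not Open Beagle, just a toy generational GA): 10-bit encoded weights (decoded here into [-1, 1] purely for illustration), and a placeholder fitness function that you would replace by loading the decoded weights into your network and returning its error on the training samples.

#include <vector>
#include <utility>
#include <algorithm>
#include <cstdlib>

const int NUM_WEIGHTS = 40;      // number of weights in the fixed network structure
const int POP_SIZE    = 50;
const int GENERATIONS = 200;

typedef std::vector<int> Genome; // one 10-bit value (0..1023) per weight

double decode(int gene) { return -1.0 + 2.0 * gene / 1023.0; }  // 10 bits -> [-1, 1]

// Placeholder fitness: swap in "load the decoded weights into the network,
// run the training samples, return the summed squared error (lower is better)".
double fitness(const Genome& g) {
    double err = 0.0;
    for (size_t j = 0; j < g.size(); ++j) { double w = decode(g[j]); err += w * w; }
    return err;
}

int main() {
    std::vector<Genome> pop(POP_SIZE, Genome(NUM_WEIGHTS));
    for (int i = 0; i < POP_SIZE; ++i)
        for (int j = 0; j < NUM_WEIGHTS; ++j)
            pop[i][j] = rand() % 1024;                          // random initial weights

    for (int gen = 0; gen < GENERATIONS; ++gen) {
        // Rank the population by fitness, best (lowest error) first.
        std::vector<std::pair<double, int> > ranked(POP_SIZE);
        for (int i = 0; i < POP_SIZE; ++i)
            ranked[i] = std::make_pair(fitness(pop[i]), i);
        std::sort(ranked.begin(), ranked.end());

        // Keep the best half, rebuild the rest by uniform crossover + mutation.
        std::vector<Genome> next(POP_SIZE, Genome(NUM_WEIGHTS));
        for (int i = 0; i < POP_SIZE / 2; ++i)
            next[i] = pop[ranked[i].second];
        for (int i = POP_SIZE / 2; i < POP_SIZE; ++i) {
            const Genome& a = next[rand() % (POP_SIZE / 2)];
            const Genome& b = next[rand() % (POP_SIZE / 2)];
            for (int j = 0; j < NUM_WEIGHTS; ++j) {
                next[i][j] = (rand() % 2) ? a[j] : b[j];
                if (rand() % 100 < 5) next[i][j] = rand() % 1024; // 5% mutation rate
            }
        }
        pop.swap(next);
    }
    return 0;
}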

Timkin    864
Quote:
Original post by pinacolada
I think the technical term for this is "catastrophic interference".


It's also known as the 'moving target' problem.

intrest86    742
What sort of function are you using on your nodes? Step, linear, sigmoid, or even a mix of several functions? The choice of activation function can greatly vary a network's ability to map certain kinds of functions.
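For reference, the usual candidates look something like this; a sigmoid output is bounded to (0, 1) and tanh to (-1, 1), which matters when the target values lie outside that range.

#include <cmath>

double stepFn (double x) { return x >= 0.0 ? 1.0 : 0.0; }        // hard threshold
double linear (double x) { return x; }                            // identity, unbounded
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }   // squashes to (0, 1)
double tanhFn (double x) { return std::tanh(x); }                 // squashes to (-1, 1)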

Marmakoide    132
* A 3-layer perceptron with sigmoid or radial basis functions can approximate any function, provided you have the right number of neurons in the hidden layer
* The choice of those functions can greatly improve your training time and quality
* A GA can select this for you :), even if GAs are slow

