Sign in to follow this  

The maths behind back propagation

This topic is 4343 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Ive been debugging my neural net class for a week now, I cant figure out whats wrong. I thought it could be that the back propagation equation isnt implented as intended. Not being that good at reading long mathemetical equations with a bunch of greek letters, I looked up this rather simplified version at generetion5: d2(1) = x2(1)(1 - x2(1))w3(1,1)d3(1) Which in english mean (at least how I percieve it): "Delta for neuron n in layer l is = (output of neuron n in layer l) * (1 - output of neuron n in layer l) * (weight for input n in neuron n in layer l+1) * (delta for neuron n in layer l+1)" First things first, have I got that right? It sounds reasonable, the delta should be high if this neuron corresponded much to the error that the next neuron in line produced. However, there is one thing here that Im uncertain of: "weight for input n in neuron n in layer l+1" This is where we look on how much the next neuron in line consider our input, if it doesnt care much about out output, we should care to much to adjust ourselves to it. The thing that is bothering me is that its both neuron n and input n. What if we have two neurons in the next layer? Then we only look at one, the one straight ahead. For example:
x2(1) ---- w3(1,1) > x3(1)
      \  / w3(2,1) 
       ><
      /  \ w3(1,2) 
x2(2) ---- w3(2,2) > x3(2)

Here, x2(1) serve as an input to both x3(1) and x3(2) but according to the formula it will only look at x3(1), the one straight ahead. "weight for input n in neuron n in layer l+1" -> "weight for input 1 in neuron 1 in layer 3" x2(2) on the other hand will only look at the other neuron, also straight ahead: "weight for input 2 in neuron 2 in layer 3" Shouldnt back propagation consider all the other neurons of which this current neuron serve as an input to? Or is this the way it should be done? Please correct me where Im wrong.

Share this post


Link to post
Share on other sites
I think its a perceptron node net. The article which I built my net from is found here. Ive read more but that was the one I copied most of the structure from.

And what I want to do with it? Well... learn neural nets. :P The idea is to create a very dynamic class with a configurable number of inputs, hidden layers and neurons per hidden layer which I can use anywhere where I want a neural net.

Share this post


Link to post
Share on other sites
Ok Miz,

From the article, they aren't really using a hardlimiter, they are using a sigmoidal function. This is used because sigmoidal functions are differentiable.

So, for instance, let us consider a simple neural network with an input, an output, and a hidden layer. The input layer will just consist of nodes that are the input values.

Let's go through the feedforward operation (when we have the inputs and we want to see what the output produces). For simplicity, lets say we have 3 inputs (one of them bias = 1), 3 hidden nodes, and 2 outputs.

To calculate the feedforward of the hidden layer we have:

h1 = v(w11*x1+w21*x2+w31*x3)
h2 = v(w12*x1+w22*x2+w32*x3)
h3 = v(w13*x1+w23*x2+w33*x3)
where v is sigmodal: v(x) = 1/(1+exp(-x))

This is the output of the hidden layer.

Now lets go calculate the output layer (we shall call the weights here o instead of w).
y1 = v(o11*h1+o21*h2+o31*h3)
y2 = v(o12*h1+o22*h2+o32*h3)

Now that we can do feedforward of a neural network let's do training... next message. Let me know if you don't understand so far.


Share this post


Link to post
Share on other sites
To determine the backpropagation equations, we shall use gradient based optimization. In other words, we want to minimize this equation:

Error = 0.5*(t1-y1)^2 + 0.5*(t2-y2)^2

where tk is the target output for yk.

To do this, we must perform derivatives on the Error with respect to each weight. I won't go into this because you may not like derivatives. You can see the idea in my journal later if you want to take a look. I will just give you the weight update equations right now. They are:

First let
v'(x) = (1-v(x))*v(x) -- this is the derivative of v(x)

Then the weight update equations are (if I didn't make a mistake)
new_o11 = o11 + v'(o11*h1+o21*h2+o31*h3)*h1*(t1-y1)*learning_rate
new_o21 = o21 + v'(o11*h1+o21*h2+o31*h3)*h2*(t1-y1)*learning_rate
new_o31 = o31 + v'(o11*h1+o21*h2+o31*h3)*h3*(t1-y1)*learning_rate
new_o12 = o12 + v'(o12*h1+o22*h2+o32*h3)*h1*(t2-y2)*learning_rate
new_o22 = o22 + v'(o12*h1+o22*h2+o32*h3)*h2*(t2-y2)*learning_rate
new_o32 = o32 + v'(o12*h1+o22*h2+o32*h3)*h3*(t2-y2)*learning_rate

new_w11 = w11 + {[v'(o11*h1+o21*h2+o31*h3)*o11*(t1-y1)+
v'(o12*h1+o22*h2+o32*h3)*o12*(t2-y2)]}*x1*learning_rate
new_w21 = w21 + {[v'(o11*h1+o21*h2+o31*h3)*o11*(t1-y1)+
v'(o12*h1+o22*h2+o32*h3)*o12*(t2-y2)]}*x2*learning_rate
new_w31 = w31 + {[v'(o11*h1+o21*h2+o31*h3)*o11*(t1-y1)+
v'(o12*h1+o22*h2+o32*h3)*o12*(t2-y2)]}*x3*learning_rate

new_w12 = w12 + {[v'(o11*h1+o21*h2+o31*h3)*o21*(t1-y1)+
v'(o12*h1+o22*h2+o32*h3)*o22*(t2-y2)]}*x1*learning_rate
new_w22 = w22 + {[v'(o11*h1+o21*h2+o31*h3)*o21*(t1-y1)+
v'(o12*h1+o22*h2+o32*h3)*o22*(t2-y2)]}*x2*learning_rate
new_w32 = w32 + {[v'(o11*h1+o21*h2+o31*h3)*o21*(t1-y1)+
v'(o12*h1+o22*h2+o32*h3)*o22*(t2-y2)]}*x3*learning_rate

new_w13 = w13 + {[v'(o11*h1+o21*h2+o31*h3)*o31*(t1-y1)+
v'(o12*h1+o22*h2+o32*h3)*o32*(t2-y2)]}*x1*learning_rate
new_w23 = w23 + {[v'(o11*h1+o21*h2+o31*h3)*o31*(t1-y1)+
v'(o12*h1+o22*h2+o32*h3)*o32*(t2-y2)]}*x2*learning_rate
new_w33 = w33 + {[v'(o11*h1+o21*h2+o31*h3)*o31*(t1-y1)+
v'(o12*h1+o22*h2+o32*h3)*o32*(t2-y2)]}*x3*learning_rate

Pray I didn't make a mistake LOL.

[Edited by - NickGeorgia on January 26, 2006 4:48:35 PM]

Share this post


Link to post
Share on other sites
I understand the first post very well, and derivatives is no problem... if I remember it correctly, they worked like this:

y(x) = x^2
y'(x) = 2x

Right?

And I already use the sigmoid function, I thought that might work better than a hardlimiter.

That error equation of yours works slightly different from the one I found on generation5, on the site it was like this:

Error = y * (1 - y) * (t - y)

Or did I missunderstand that last part?

[edit] Getting late... Ill return to this thread tomorrow for a reread and, hopefully, Ill get some kind of huge eureka-insight. :P [smile]

Share this post


Link to post
Share on other sites
Hmm, that looks quite different from what I learned on the site. But what I got now doesnt work :P Or I cant make it work I mean. Is this correct?:

w is weight for hidden neurons
o is weight for output neurons
h is what a neuron in the hidden layer fired/outputed
x is initial input to net
t is target output of net
y is actual output of net

In my net I only use one output, made a thread about it here and I made the conclussion that nets with a single output neuron performed better. A little slower but better results.

I make a backup tomorrow and try to switch my code into using that system instead. See if I get any more luck. Ill post results here either way.

Thanks, Nick. [smile] Rating++.

[edit] Oh, and the error equation I mentioned that you wasnt sure about. I found it on the site I linked to in my second post. Go check it out if you want.

Share this post


Link to post
Share on other sites
Yep, I think you got my notation down. Since they were using a different error equation, I would imagine the equations would be different. OK, let me know how it turns out. (Hope I didn't make a mistake, but I'll check it over later)

Also here is a link on how you might do it if you wanted to use hard limiters.

Share this post


Link to post
Share on other sites
Ive read your post over and over Nick and heres what Ive come up with.

First, look at the design of my net:

Input Hidden_1 Hidden_2 Output_Neuron

x(0,0) x(1,0) x(2,0) <- Biases

x(0,1) w(0,0,0) x(1,1) w(1,0,0) x(2,1) w(2,0,0)
x(0,2) w(0,1,0) w(1,1,0) w(2,1,0) x(3,0) // Total net output
w(0,2,0) w(1,2,0) w(2,2,0)

w(0,0,1) x(1,2) w(1,0,1) x(2,2)
w(0,1,1) w(1,1,1)
w(0,2,1) w(1,2,1)



Here, x(l,n) is the ouput fron neuron n in layer l. n = 0 is bias, 1, in every layer. w(l,f,n) is the weight for input f to neuron n in layer l.

For example, x(1,1) is Sigmoid( x(0,0)*w(0,0,0)+x(0,1)*w(0,1,0)+x(0,2)*w(0,2,0) ).

x(3,0) is the total net output.

Trying to convert your formulas into one I could use, I came up with this for the output neuron:

w(2,f,0) = W(2,f,0) + v'(Input) * x(2,f) * (d-Input) * learn_rate

Where d is the desired output of the net. And Input is defied as the sum of all inputs times thier respective weight; x(2,0)*w(2,0,0)+x(2,1)*w(2,1,0)+x(2,2)*w(2,2,0). v'(x) function is the derivative function which I wrote like this:


float Deriv(float num) {return (1-num)*num;};


Is that right?

When I got to the hidden layer new weight calculations, I got completely stuck, cant come up with a good way to write a formula for it. This is the best I could do:

w(l,f,n) = w(l,f,n) + v'(Input) * w(l+1,n, ? ) * (d-NetOut) * x(l,f) * learn_rate

Here, v'(Input) is the same as above, i.e. the total input to the output neuron. But this is the same problem that made me start this thread. v'(Input) maybe instead is the total input to the neuron in the next layer.

I got so many thoughts but they are shattered so I got problems describing them. It makes sense multiplying total input in the nextlayer neuron with a weight, but fow which neuron? Neuron n here serve as the input to the next neuron so it makes sense making f=n. But for which neuron in is this? Neuron 1 in layer 1 server as input 1 for every neuron in layer 2, so which neuron should I calculate with?

Ive probably made a bunch of errors here so I stop and wait for your reply. Probably could use a break to, you know how blind you get when you look at the same problem for to long.

Thanks again for all your help.

Share this post


Link to post
Share on other sites
"Ive been debugging my neural net class for a week now, I cant figure out whats wrong. I thought it could be that the back propagation equation isnt implented as intended."

I too had similar problems a while ago. It turned out that the problem was not in the back propagation. There are many other things to which a neural network is sensitive to:

* Initial weight values. I found it a good idea to use random numbers from some configurable bounds, e.g [-0.1,+0.1]. My network actually refused to learn XOR with zero initial weights!

* Order of teaching. I found that using stochastic teaching (i.e choosing teaching samples at random) produces better results than orderly teaching (which can lead to bias). Again my network often refused to learn XOR if teaching was done in a predictive fashion (00,01,10,11).

* Learning constant. OK, it's clear that the learning constant affects the learning by a great deal - try lowering your value at first.

As with all debugging, you should first start with very simple test data, e.g learning AND or OR binary gates (a perceptron is enough for this). Then you can proceed to a little bit more complicated examples, e.g the XOR gate (this requires hidden layers), and off you go.

Good luck,
-- Mikko

Share this post


Link to post
Share on other sites
Thanks for your input uutee. [smile]

Ive added stochastic training to my network. Didnt make it better though.

I randomize the weights in the constructor so they are within -1 and 1.

I just lowered the learning constant to 0.1 and ran a whooping 500000 times. The net just gave 0.03 on every input... couldnt be more far of. :P

Ive already made a net with a single perceptron which managed to learn AND. Its now when Ive expanded the class to be of a variable size (choose how many hidden layers and how many neurons in each layer) that the problems have started to really hard to solve.

Heres the source to my net, if you (for any reason) want to look. The training algorithm was taken from a back propagation article on generation5.org (see earlier post for link) and I tried to show how the architecture (that is what id a certain weight/output had) of my net a couple of posts up. Look at that if you dont understand how its laid up from the source.

Here it is, anyway:

NeuralNet.h

#ifndef _NEURAL_NET
#define _NEURAL_NET

#include <vector>
#include <iostream>
#include <time.h>
#include <math.h>
#include <fstream>
#include <sstream>
#include <conio.h>

using namespace std;

#define LEARN_RATE 0.1
#define OUTPUTS 1

class NeuralNet {
private:
float Sigmoid(float num) {return (float)(1/(1+exp(-num)));};

vector<vector<float> > W; // weights
vector<vector<float> > X; // the neurons outputs

int Inputs, HiddenLayers, NeuronsInHidden;
float NetOut; // the total output of the net

float x(int l, int n); // output x in layer l from neuron n (l = 0 for the inputs to the net)
float w(int l, int f, int n); // weight of input f (bias = 0) for neuron n in layer l, l = 0 is hidden, l = 1 is output layer
void SetX(int l, int n, float NewX); // same as above but set the weight instead of read it
void SetW(int l, int f, int n, float NewWeight); // same as above but set the weight instead of read it

bool Debug; // debug output flag
//bool StepThrough;

public:
NeuralNet(int _Inputs, int _HiddenLayers, int _NeuronsInHidden, bool _Debug);
~NeuralNet();

void Train(vector<float> NetIn, float CorrectOutput); // same as below but trains it to
float Process(vector<float> NetIn); // takes the inputs, returns the outputs of the net

void Print();
};

#endif




NeuralNet.cpp

#include "NeuralNet.h"


NeuralNet::NeuralNet(int _Inputs, int _HiddenLayers, int _NeuronsInHidden, bool _Debug) {
Inputs = _Inputs;
HiddenLayers = _HiddenLayers;
NeuronsInHidden = _NeuronsInHidden;
NetOut = 0; // the total output of the net

Debug = _Debug;
//StepThrough = _StepThrough;

srand( (unsigned)time( NULL ) ); // seed the randomizer

W.resize(HiddenLayers+1); // setup weights

for(int l = 0; l <= HiddenLayers; ++l) {
if(l == 0) { // first layer, these inputs are the net inputs
W[l].resize((Inputs+1) * NeuronsInHidden);

for(int n = 0; n < NeuronsInHidden; ++n) {
for(int f = 0; f < (Inputs+1); ++f)
SetW(l,f,n, 2.0f*((float)rand()/(float)RAND_MAX)-0.5f);
}
}
else {
W[l].resize((NeuronsInHidden+1)*NeuronsInHidden); // hidden layers, inputs to these are the outputs from the former layer

for(int n = 0; n < NeuronsInHidden; ++n) {
for(int f = 0; f < (NeuronsInHidden+1); ++f)
SetW(l,f,n, 2.0f*((float)rand()/(float)RAND_MAX)-0.5f);
}
}
}

W[HiddenLayers].resize((NeuronsInHidden+1) * OUTPUTS); // output layer
for(int f = 0; f < ((NeuronsInHidden+1) * OUTPUTS); ++f)
SetW(HiddenLayers, f, 0, 2.0f*((float)rand()/(float)RAND_MAX)-0.5f);

// initialize the input/output holders for perceptrons
X.resize(HiddenLayers+1); // +1 for the input layer
X[0].resize(Inputs+1, 0); // the input layer

for(int l = 1; l <= HiddenLayers; ++l)
X[l].resize(NeuronsInHidden+1, 0);

// store biases
for(int l = 0; l <= HiddenLayers; ++l)
SetX(l,0,1);

if(Debug) {
cout << "--SETUP-------------\n";
for(int l = 0; l <= HiddenLayers; ++l)
cout << "W Layer " << l << " have size " << W[l].size() << endl;
cout << endl;

for(int l = 0; l <= HiddenLayers; ++l)
cout << "X Layer " << l << " have size " << X[l].size() << endl;
cout << endl;

for(int l = 0; l <= HiddenLayers; ++l) {
for(int n = 0; n < W[l].size(); ++n)
cout << "w(" << l << "," << n << ") is " << W[l][n] << endl;
cout << endl;
}

getch();
}
}

NeuralNet::~NeuralNet() {
}

float NeuralNet::w(int l, int f, int n) { // f = 0 is bias weight
if(l > W.size()) {
cout << "SetW error: Bad layer number: " << l << endl;
return 0;
}
else if(l == 0 ) { // input layer
if(((Inputs+1) * n) + f > W[l].size()) {
cout << "SetW error: Bad weight id number: " << ((Inputs+1) * n) + f << " on layer " << l << "\n\n";
return 0;
}
return W[l][((Inputs+1) * n) + f];
}
else if(l == 0 || l <= HiddenLayers) {
if(((NeuronsInHidden+1) * n) + f > W[l].size()) {
cout << "SetW error: Bad weight id number: " << ((NeuronsInHidden+1) * n) + f << " on layer " << l << "\n\n";
return 0;
}
return W[l][((NeuronsInHidden+1) * n) + f];
}

return 0; // just in case
}

void NeuralNet::SetW(int l, int f, int n, float NewWeight) {
if(l == HiddenLayers) // output layer
W[HiddenLayers][((NeuronsInHidden+1) * n) + f] = NewWeight;
else if(l == 0 ) // input layer
W[l][((Inputs+1) * n) + f] = NewWeight;
else if(l == 0 || l < HiddenLayers) // hidden layers
W[l][((NeuronsInHidden+1) * n) + f] = NewWeight;

else if(l < 0 && l > HiddenLayers) {
cout << "W Error: Bad layer number: " << l << endl;
return;
}
}

void NeuralNet::SetX(int l, int n, float NewX) {
if(l < 0 && l > HiddenLayers)
cout << "SetX error: Bad layer number: " << l << endl;
if(n >= X[l].size()) {
cout << "SetX error: Bad layer number: " << l << endl;
return;
}

// we are inside boundries
X[l][n] = NewX;
}

float NeuralNet::x(int l, int n) { // n = 0 is bias (1)
if(l < 0 && l > HiddenLayers)
cout << "X Error: Bad layer number: " << l << endl;
if(n >= X[l].size()) {
cout << "X Error: Bad neuron number: " << n << endl;
return 0;
}

// we are inside boundries
return X[l][n];
}

void NeuralNet::Train(vector<float> NetIn, float d) {
// first, process so we have the correct values stored inside the neural net
Process(NetIn);

vector<vector<float> > Delta;
Delta.resize(HiddenLayers+1); // one for the output layer to
for(int l = 0; l <= HiddenLayers; ++l) {
if(l == HiddenLayers) // output layer
Delta[l].resize(OUTPUTS, 0);
else
Delta[l].resize(NeuronsInHidden, 0);
}

// output layer delta (we only have one output now so the loop will only run once)
// d(2,0) = x(3,0)(1 - x(3,0))(d - x(3,0))
//Delta[HiddenLayers][n] = x(HiddenLayers+1,n) * (1 - x(HiddenLayers+1,n)) * (d - x(HiddenLayers+1,n));
Delta[HiddenLayers][0] = NetOut * (1 - NetOut) * (d - NetOut);

// hidden layer delta, first one before output
// d2(1) = x2(1)(1 - x2(1))w2(1,1)d3(1)
// formula: d(l,n) = x(l,n) (1 - x(l,n)) w(l+1,n,n-1) d(l+1,n)
// loop through the net backwards
for(int l = HiddenLayers-1; l >= 0; --l) {
if(l == HiddenLayers-1) { // layer directly before output layer
for(int n = 0; n < NeuronsInHidden; ++n)
Delta[l][n] = x(l+1,n+1) * (1 - x(l+1,n+1)) * w(l+1,n+1,0) * Delta[HiddenLayers][0];
}
else {
for(int n = 0; n < NeuronsInHidden; ++n)
Delta[l][n] = x(l+1,n+1) * (1 - x(l+1,n+1)) * w(l+1,n+1,n) * Delta[l+1][n];
}
}

// Delta calculated, now alter the weights (we only have one output now so the loop will only run once)
// formula: w2(0,1) = h*x1(0)*d2(1)
// formula: w(l,f,n) = h * x(l,f) * d(l,n)
for(int f = 0; f < NeuronsInHidden+1; f++)
SetW(HiddenLayers,f,0, w(HiddenLayers,f,0)+(LEARN_RATE * x(HiddenLayers,f) * Delta[HiddenLayers][0]));

// alter the weights for the hidden layers to
for(int l = 0; l < HiddenLayers; l++) {
if(l == 0) { // first layer
for(int n = 0; n < NeuronsInHidden; n++) {
for(int f = 0; f < Inputs+1; f++)
SetW(0,f,n, w(0,f,n)+(LEARN_RATE * x(0,f) * Delta[0][n]));
}
}
else {
for(int n = 0; n < NeuronsInHidden; n++) {
for(int f = 0; f < NeuronsInHidden+1; f++)
SetW(l,f,n, w(l,f,n)+(LEARN_RATE * x(l,f) * Delta[l][n]));
}
}
}

if(Debug) {
cout << "--TRAIN-------------\n";

for(int l = HiddenLayers; l >= 0; --l) {
if(l == HiddenLayers) { // output layer
for(int n = 0; n < OUTPUTS; ++n)
cout << "Delta(" << l << "," << n << ") " << Delta[l][n] << " ";
cout << endl;
}
else if(l == 0) { // input layer
for(int n = 0; n < Inputs; ++n)
cout << "Delta(" << l << "," << n << ") " << Delta[l][n] << " ";
cout << endl;
}
else {
for(int n = 0; n < NeuronsInHidden; ++n)
cout << "Delta(" << l << "," << n << ") " << Delta[l][n] << " ";
cout << endl;
}
}

cout << endl;

for(int l = 0; l <= HiddenLayers; ++l) {
for(int n = 0; n < W[l].size(); ++n)
cout << "New weight (" << l << "," << n << ") is " << W[l][n] << endl;
cout << endl;
}

getch();
}
}

float NeuralNet::Process(vector<float> NetIn) {
// reset values in net
for(int l = 0; l <= HiddenLayers; ++l) {
if(l == 0) { // input layer
for(int n = 1; n < Inputs+1; ++n)
SetX(l,n,0);
}
else {
for(int n = 1; n < NeuronsInHidden+1; ++n)
SetX(l,n,0);
}
}
NetOut = 0; // reset output neuron

// initial net inputs
for(int n = 1; n <= Inputs; ++n)
SetX(0,n,NetIn[n-1]);

// first layer
float Fire = 0; // what the neuron fires
for(int n = 1; n <= Inputs; ++n) {
for(int i = 0; i <= Inputs; ++i)
Fire += x(0, i) * w(0, i, n-1);

SetX(1,n, Sigmoid(Fire)); // store it as output
Fire = 0; // reset fire
}

// sort out the hidden layers outputs
for(int l = 0; l < HiddenLayers; l++) { // loop through layers
for(int n = 1; n <= NeuronsInHidden; n++) { // loop through hiddens, start at one so we dont overwrite the bias
for(int i = 0; i < Inputs+1; i++) // loop through inputs
Fire += x(l, i) * w(l, i, n-1); // store outputs as inputs in the next layer

SetX(l+1,n, Sigmoid(Fire)); // store it as output
Fire = 0; // reset fire
}
}

// output neuron
for(int i = 0; i <= NeuronsInHidden; i++)
NetOut = x(HiddenLayers, i) * w(HiddenLayers, i, 0);

NetOut = Sigmoid(NetOut);

// --- Calculation done ---

if(Debug) {
cout << "--PROCESS-----------\n";

for(int l = 0; l <= HiddenLayers; ++l) {
for(int n = 1; n < X[l].size(); ++n)
cout << "x(" << l << "," << n << ") = " << x(l,n) << " ";
cout << endl;
}
cout << "Netout: " << NetOut << "\n\n";
getch();
}

return NetOut;
}

void NeuralNet::Print() {
// print output
stringstream str;

// hidden layer weights
str << "Hidden W: --- ";
for(int l = 0; l < HiddenLayers; ++l) {
if(l == 0) { // first layer
str << "\n\nLayer " << l << "\n";
for(int n = 0; n < NeuronsInHidden; ++n) {
for(int i = 0; i < Inputs+1; ++i)
str << "w(" << l << "," << i << "," << n << "): " << w(l,i,n) << "\t";
str << endl;
}
}
else { // every other hidden layer
str << "\n\nLayer " << l << "\n";
for(int n = 0; n < NeuronsInHidden; ++n) {
for(int i = 0; i < NeuronsInHidden+1; ++i)
str << "w(" << l << "," << i << "," << n << "): " << w(l,i,n) << "\t";
str << endl;
}
}
str << endl;
}
str << "\n\n";

// output layer weights
str << "Output W: --- \n";
for(int n = 0; n < OUTPUTS; n++) {
for(int i = 0; i <= NeuronsInHidden; i++)
str << "w(" << HiddenLayers << "," << i << "," << n << "): " << w(HiddenLayers,i,n) << "\t";
}
str << "\n\n";

// open file
ofstream file("Net.txt");
if(!file.is_open()) {
cout << "Print failed, unable to create file: Net.txt\n";
return;
}

// print it
file << str.str();
cout << "Net data printed to file\n";
}




Main.cpp

#include <iostream>
#include <conio.h>

#include "NeuralNet.h"

int main() {
NeuralNet X(2, 2, 2, 1);
vector<float> NetIn;
NetIn.resize(2);

// Train
int s = 0; // for stochastic teaching
for(int a = 0; a < 500000; ++a) {
s = rand()%3;
cout << s << endl;

if(s == 0) {
NetIn[0] = 0; NetIn[1] = 0;
X.Train(NetIn, 0);
}
else if(s == 1) {
NetIn[0] = 1; NetIn[1] = 0;
X.Train(NetIn, 0);
}
else if(s == 2) {
NetIn[0] = 0; NetIn[1] = 1;
X.Train(NetIn, 0);
}
else if(s == 3) {
NetIn[0] = 1; NetIn[1] = 1;
X.Train(NetIn, 1);
}
}

// Output what weve learned
NetIn[0] = 0; NetIn[1] = 0;
cout << "0,0 = " << X.Process(NetIn);

NetIn[0] = 1; NetIn[1] = 0;
cout << endl << "1,0 = " << X.Process(NetIn);

NetIn[0] = 0; NetIn[1] = 1;
cout << endl << "0,1 = " << X.Process(NetIn);

NetIn[0] = 1; NetIn[1] = 1;
cout << endl << "1,1 = " << X.Process(NetIn) << "\n\n";

X.Print();

getch();
return 1;
}


Share this post


Link to post
Share on other sites

This topic is 4343 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this