Neural network, image recognition

Posted by artificial stupidity

Hi! I'm building a neural network which is supposed to recognize some smilies, and something seems to be wrong that I can't find. First I train the network 1000 times by calling recognize() and then train() inside a loop. After that I run the same loop, but this time only calling recognize(), to see how many images the NN recognizes. The percentage is always the same: 25%, which is exactly the probability without any training. The initial values for the link weights are set to random real numbers between -1 and 1. The pixels in the picture are also normalized to between -1 and 1. Anyway, I really need some help with this. I have searched my code for the error for a long time and I believe I need some help :( Feel free to ask questions about my implementation. This is my first NN.
[SOURCE]

#include <iostream>
#include <fstream>
#include <time.h>
#include <cmath>
#include <cstring>
#include <vector>
#include "Image.h"
#include "Node.h"
#include "Parser.h"
#include "Happy.h"

using namespace std;

void Happy::buildNetwork(vector<Image*> *images) {

	int w = (images->at(0))->w;
	int h = (images->at(0))->h;

	nrOfInput = w * h;
	nrOfOutput = 4;
	nrOfHidden =  (nrOfInput + nrOfOutput) / 2; 

	Input = new Node[nrOfInput];
	
	for(int i=0; i<nrOfInput; i++) {
		Node *n = new Node();
		n->initWeights(nrOfHidden);
		Input[i] = *n;				
	}

	Hidden = new Node[nrOfHidden];
	for(int i=0; i<nrOfHidden; i++) {
		Node *n = new Node();
		n->initWeights(nrOfOutput);
		Hidden[i] = *n;
	}

	Output = new Node[nrOfOutput];
	for(int i=0; i<nrOfOutput; i++) {
		Output[i] = *(new Node());
	}
}


float Happy::sigmoid(float x) {
	return 1.0 / (1.0 + exp(-x));
}


int Happy::recognize(Image *image) {

	/* Assign values to the input nodes from the image */
	for(int i=0; i<nrOfInput; i++) {
		Input[i].activation = image->pixels[i];
	}

	/* Set hidden nodes activation/output */ 
	for(int i=0; i<nrOfHidden; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfInput; j++) {
			total += (Input[j].activation * Input[j].weights[i]);	
		}

		Hidden[i].activation = sigmoid(total);
	}

	/* Set output nodes activation/output */
	for(int i=0; i<nrOfOutput; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfHidden; j++) {
			total += Hidden[j].activation * Hidden[j].weights[i];	
		}
		
		Output[i].activation = sigmoid(total);
	}

	/* which output has the largest value? */
	int maxIndex = 0;
	for(int i = 1; i < nrOfOutput; i++) {
		if(Output[i].activation > Output[maxIndex].activation) {
			maxIndex = i;
		}
	}

	/* the answers are given as 1-4; that explains the maxIndex+1 */
	if(maxIndex+1 == image->answer) {
		correct++;
	}
	
	return maxIndex;
}


void Happy::train(Image* image) {


	/* Output error */
	for(int i=0; i<nrOfOutput; i++) {
		double activation = Output[i].activation;
		if(i + 1 == image->answer) {
			Output[i].error = activation * (1 - activation) * (1 - activation); 
		} else {
			Output[i].error = activation * (1 - activation) * (0 - activation); 
		}
	}

	/* Hidden error */
	for(int i = 0; i < nrOfHidden; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfOutput; j++) {
			total += Hidden[i].weights[j] * Output[j].error;	
		}
		Hidden[i].error = total;
	}


	/* Input error */
	for(int i = 0; i < nrOfInput; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfHidden; j++) {
			total += Input[i].weights[j] * Hidden[j].error;	
		}
		Input[i].error = total;
	}

	/* Update weights for hidden links*/
	for(int i = 0; i < nrOfOutput; i++) {
		for(int j=0; j < nrOfHidden; j++) {
			Hidden[j].weights[i] += (learningRate * Output[i].error * Hidden[j].activation);
		}
	}

	/* Update weights for input links*/
	for(int i = 0; i < nrOfHidden; i++) {
		for(int j=0; j < nrOfInput; j++) {
			Input[j].weights[i] += (learningRate * Hidden[i].error * Input[j].activation);
		}
	}
}

Happy::Happy() {

	this->learningRate = 0.2;
}


int main(int argc, char** argv) {


	if(argc != 3) {
		cout << "you fail" << endl;
	} else {
		if(!strcmp(argv[1], "train")) {

			Parser* p = new Parser();
			vector<Image*> *images = p->parseFile(argv[2]);

			Happy *h = new Happy();
			h->buildNetwork(images);
			h->correct = 0;

			/* Train */
	for(int i = 0; i < 1000; i++) {
				h->recognize(images->at(i));
				h->train(images->at(i));
			}
			
			/* reset the counter */
			h->correct = 0;

	for(int i = 0; i < 1000; i++) {
				h->recognize(images->at(i));
			}

			cout << "Correct: " << (h->correct/1000.0) * 100 << "%" <<  endl;
		}
	}

	return 0;
}
[/SOURCE]

Where/how do you initialise the weights and error values?

And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?

How many inputs do you have, out of interest? (To put it another way, how big are the images?)

I would have thought some simple logging would help here. For every attempt at recognition, you can output the results before training and after training and confirm that things change in the way you expect. Merely looking at the end success rate tells you next to nothing.
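For instance (a rough sketch — I'm assuming the output activations are reachable from main; adjust to your actual members):

for(int i = 0; i < 10; i++) {
	Image *img = images->at(i);
	h->recognize(img);
	cout << "before:";
	for(int k = 0; k < 4; k++) cout << " " << h->Output[k].activation;
	h->train(img);
	h->recognize(img);
	cout << "  after:";
	for(int k = 0; k < 4; k++) cout << " " << h->Output[k].activation;
	cout << endl;
}

If the 'after' values never move toward the correct answer, the bug is in train(); if they never change at all, look earlier.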

Thanks for the answers! I'm a little worried about the weird copying.
Please look at my answers below and let me know more about it :)


Quote:

Where/how do you initialise the weights and error values?

In the Node's constructor I set the error and activation to 0.
The weights are initialised by a call to the Node's initWeights() function.

Quote:

And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?


am I? :(
In Happy.h I store the Nodes like

Node* Input;
Node* Output;
Node* Hidden;

and then in Happy.cpp I allocate with:

Input = new Node[nrOfInput];


Quote:

How many inputs do you have, out of interest? (To put it another way, how big are the images?)


20x20



I just noticed one more thing:

When I set the activation value for the hidden nodes, the value is always very high or very low. The result returned from the sigmoid function is always between 0.97 and 1 or between 0 and 0.03. Nothing inside the interval [0.03, 0.97](!!) Is it supposed to be like that?


for(int i=0; i<nrOfHidden; i++) {

	double total = 0.0;

	for(int j=0; j<nrOfInput; j++) {
		total += (Input[j].activation * Input[j].weights[i]);
	}

	Hidden[i].activation = sigmoid(total);
}




The input values are normalized to be between -1 and 1, but more values are negative than positive (many pixels are white).

Quote:
Original post by artificial stupidity
Quote:

And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?


am I? :(

Yes. There's little point calling new to create an object just to copy it straight afterwards, e.g.:

Node *n = new Node();
n->initWeights(nrOfHidden);
Input[i] = *n;

This creates a Node called 'n', sets it up, then you copy those values into Input[i], which I assume you've already set up correctly. This requires that your Nodes have the correct copy semantics, which I can't comment on without seeing the Node class's implementation. Also, you've leaked the memory for 'n', by doing nothing with it after you've copied its values. Do you come from a Java background, by any chance? :) If you want a temporary, just create one on the stack with "Node n;"
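In code, that stack-based alternative is just:

Node n;                   // stack temporary, destroyed at the end of the loop body
n.initWeights(nrOfHidden);
Input[i] = n;             // copies into the array; nothing is leaked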

Quote:
Original post by Kylotan
Quote:
Original post by artificial stupidity
Quote:

And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?


am I? :(

Yes. There's little point calling new to create an object just to copy it straight afterwards, e.g.:

Node *n = new Node();
n->initWeights(nrOfHidden);
Input[i] = *n;

This creates a Node called 'n', sets it up, then you copy those values into Input[i], which I assume you've already set up correctly. This requires that your Nodes have the correct copy semantics, which I can't comment on without seeing the Node class's implementation. Also, you've leaked the memory for 'n', by doing nothing with it after you've copied its values. Do you come from a Java background, by any chance? :) If you want a temporary, just create one on the stack with "Node n;"


OK, thanks! Is this better?
In the .h file I now have:

Node** Input;

[SOURCE]

#include <iostream>
#include <fstream>
#include <time.h>
#include <cmath>
#include <cstring>
#include <vector>
#include "Image.h"
#include "Node.h"
#include "Parser.h"
#include "Happy.h"

using namespace std;

void Happy::buildNetwork(vector<Image*> *images) {

	int w = (images->at(0))->w;
	int h = (images->at(0))->h;

	nrOfInput = w * h;
	nrOfOutput = 4;
	nrOfHidden = (nrOfInput + nrOfOutput) / 2;

	Input = new Node *[nrOfInput];

	for(int i=0; i<nrOfInput; i++) {
		Node *n = new Node();
		n->initWeights(nrOfHidden);
		Input[i] = n;
	}

	Hidden = new Node *[nrOfHidden];
	for(int i=0; i<nrOfHidden; i++) {
		Node *n = new Node();
		n->initWeights(nrOfOutput);
		Hidden[i] = n;
	}

	Output = new Node *[nrOfOutput];
	for(int i=0; i<nrOfOutput; i++) {
		Output[i] = new Node();
	}
}


float Happy::sigmoid(float x) {
	return 1.0 / (1.0 + exp(-x));
}


int Happy::recognize(Image *image) {

	/* Assign values to the input nodes from the image */
	for(int i=0; i<nrOfInput; i++) {
		Input[i]->activation = image->pixels[i];
	}

	/* Set hidden nodes activation/output */
	for(int i=0; i<nrOfHidden; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfInput; j++) {
			total += (Input[j]->activation * Input[j]->weights[i]);
		}

		Hidden[i]->activation = sigmoid(total);
	}

	/* Set output nodes activation/output */
	for(int i=0; i<nrOfOutput; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfHidden; j++) {
			total += Hidden[j]->activation * Hidden[j]->weights[i];
		}

		Output[i]->activation = sigmoid(total);
	}

	/* which output has the largest value? */
	int maxIndex = 0;
	for(int i = 1; i < nrOfOutput; i++) {
		if(Output[i]->activation > Output[maxIndex]->activation) {
			maxIndex = i;
		}
	}

	/* the answers are given as 1-4; that explains the maxIndex+1 */
	if(maxIndex+1 == image->answer) {
		correct++;
	}

	return maxIndex;
}


void Happy::train(int answer) {

	/* Output error */
	for(int i=0; i<nrOfOutput; i++) {
		double activation = Output[i]->activation;

		if(i + 1 == answer) {
			Output[i]->error = activation * (1 - activation) * (1 - activation);
		} else {
			Output[i]->error = activation * (1 - activation) * (0 - activation);
		}
	}

	/* Hidden error */
	for(int i = 0; i < nrOfHidden; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfOutput; j++) {
			total += Hidden[i]->weights[j] * Output[j]->error;
		}
		Hidden[i]->error = total;
	}

	/* Input error */
	for(int i = 0; i < nrOfInput; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfHidden; j++) {
			total += Input[i]->weights[j] * Hidden[j]->error;
		}
		Input[i]->error = total;
	}

	/* Update weights for hidden links */
	for(int i = 0; i < nrOfOutput; i++) {
		for(int j=0; j < nrOfHidden; j++) {
			Hidden[j]->weights[i] += (learningRate * Output[i]->error * Hidden[j]->activation);
		}
	}

	/* Update weights for input links */
	for(int i = 0; i < nrOfHidden; i++) {
		for(int j=0; j < nrOfInput; j++) {
			Input[j]->weights[i] += (learningRate * Hidden[i]->error * Input[j]->activation);
		}
	}
}

Happy::Happy() {

	this->learningRate = 0.2;
}


void print(const Node &n) {

	for(int i=0; i<10; i++) {
		cout << n.weights[i] << " ";
	}
	cout << endl;
}

int main(int argc, char** argv) {

	if(argc != 3) {
		cout << "you fail" << endl;
	} else {
		if(!strcmp(argv[1], "train")) {

			Parser* p = new Parser();
			vector<Image*> *images = p->parseFile(argv[2]);

			Happy *h = new Happy();
			h->buildNetwork(images);
			h->correct = 0;

			/* Train */
			for(int i = 0; i < 200; i++) {
				h->recognize(images->at(i));
				h->train(images->at(i)->answer);
			}

			/* reset the counter */
			h->correct = 0;

			for(int i = 0; i < 200; i++) {
				h->recognize(images->at(i));
			}

			cout << "Correct: " << (h->correct/200.0) * 100 << "%" << endl;
		}
	}

	return 0;
}
[/SOURCE]




And by the way, do you know anything about the strange sigmoid values I wrote about earlier?

The "strange sigmoid values" simply mean that your weights are too large. You still haven't explained how you initialize your weights.

Think of what the situation is for a neuron in the hidden layer. You have 400 inputs between -1 and 1. Let's imagine that they are uniformly distributed. If you use weights that are also uniformly distributed in [-1,+1], the sum of weights * inputs will look very close to a normal distribution with a standard deviation around 6.5. That means that typical values will saturate the sigmoid function and you'll get outputs that are almost always 1 or 0. The situation is even worse if most inputs are either +1 or -1 (the standard deviation jumps to over 11). If you start with weights in [-1/20,+1/20] you'll do much better; 20 is the square root of the number of inputs.
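Checking those numbers: if the inputs and weights are independent with zero mean, the variance of the weighted sum is

\mathrm{Var}\left(\sum_{j=1}^{400} w_j x_j\right) = \sum_{j=1}^{400} E[w_j^2]\,E[x_j^2] = 400 \cdot \tfrac{1}{3} \cdot \tfrac{1}{3} = \tfrac{400}{9}, \qquad \sigma = \tfrac{20}{3} \approx 6.7

With inputs pinned at +1 or -1, E[x_j^2] = 1, so \sigma = \sqrt{400/3} \approx 11.5. Scaling every weight by 1/20 = 1/\sqrt{400} scales \sigma down by the same factor, to about 0.33, comfortably inside the sigmoid's near-linear range.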

On the programming part of it, since you are using C++, you should consider using STL containers instead of managing your own memory, particularly since you are not very familiar with memory management.
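For instance, a Node that keeps its weights in a std::vector needs no new[] or delete[] at all. A minimal sketch (the scale parameter is an addition, following the 1/20 advice above):

#include <cstdlib>
#include <vector>

struct Node {
	double activation;
	double error;
	std::vector<float> weights;   // owns its memory automatically

	Node() : activation(0.0), error(0.0) {}

	void initWeights(int nr, float scale) {
		weights.resize(nr);
		for(int i = 0; i < nr; i++) {
			// uniform draw in [-scale, +scale]
			weights[i] = scale * (2.0f * rand() / RAND_MAX - 1.0f);
		}
	}
};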

Quote:

The "strange sigmoid values" simply mean that your weights are too large. You still haven't explained how you initialize your weights.


Oh sorry, here are the weight function and the normalizing function for the input values. I modified it with a division by 20 at the end. No difference :(


#include <iostream>
#include <cstdlib>
#include <time.h>
#include "Node.h"

Node::Node() {

	activation = 0.0;
	error = 0.0;
}


void Node::initWeights(int nr) {
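	// note: this reseeds rand() on every call, so every node initialised
	// within the same second restarts the same random sequence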

	srand(time(NULL));
	weights = new float[nr];
	for(int i = 0; i < nr; i++) {
		weights[i] = ((rand() % 200) / 100.0) - 1.0;
		weights[i] /= 20.0;
	}
}






#include "Image.h"

Image::Image(string name, float* pixels, int w, int h) {

this->name = name;
this->pixels = pixels;
this->w = w;
this->h = h;

normalize();
}

void Image::normalize () {

float max = 0;

for(int i=0; i<(w*h); i++) {
if(pixels[i] > max) {
max = pixels[i];
}
}


for(int i=0; i<(w*h); i++) {
pixels[i]/= max;
pixels[i]*=2.0;
pixels[i]-=1;
pixels[i] /= 20.0;
}

}





Quote:

Think of what the situation is for a neuron in the hidden layer. You have 400 inputs between -1 and 1. Let's imagine that they are uniformly distributed. If you use weights that are also uniformly distributed in [-1,+1], the sum of weights * inputs will look very close to a normal distribution with a standard deviation around 6.5. That means that typical values will saturate the sigmoid function and you'll get outputs that are almost always 1 or 0. The situation is even worse if most inputs are either +1 or -1 (the standard deviation jumps to over 11). If you start with weights in [-1/20,+1/20] you'll do much better; 20 is the square root of the number of inputs.

On the programming part of it, since you are using C++, you should consider using STL containers instead of managing your own memory, particularly since you are not very familiar with memory management.


Dividing all the weights by 20 did nothing? That's surprising.

Can you dump what all the inputs and weights are for one random node and see what you get? It looks like you are going to have to resort to that type of diagnostic tool. You can probably also benefit from reducing the size of your problem. For instance, start with 2x2, and try to recognize "bottom-heavy" patterns, or patterns with high contrast, or simply bright patterns. Those problems are much easier and you'll be able to easily see what's going on because there are only a few tens of numbers involved.
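For example (a rough sketch — I'm assuming those members are accessible from outside, and dumpHiddenNode is just a made-up helper):

/* print everything feeding hidden node i; call before and after train() */
void dumpHiddenNode(Happy *h, int i) {
	double total = 0.0;
	for(int j = 0; j < h->nrOfInput; j++) {
		cout << h->Input[j]->activation << " * " << h->Input[j]->weights[i] << "\n";
		total += h->Input[j]->activation * h->Input[j]->weights[i];
	}
	cout << "total: " << total << "  activation: " << h->Hidden[i]->activation << endl;
}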

Quote:

Dividing all the weights by 20 did nothing? That's surprising.


Well, it did something. Now the hidden nodes' activation is between 0 and 1, but that didn't change the fact that the result is the same after the training :(
It seems that it does not learn at all.

Quote:

Can you dump what all the inputs and weights are for one random node and see what you get?


I could dump the input and weights, but then I don't know what to look for.
The numbers are small, but I have the weight init bounded to ±1/20.
These are 10 random weights:

-0.0176376 -0.0176381 0.00177376 0.0198532 2.38221e-44 0 0.22407 -3.25956e-05 1.726e-39 0

Quote:

It looks like you are going to have to resort to that type of diagnostic tool. You can probably also benefit from reducing the size of your problem. For instance, start with 2x2, and try to recognize "bottom-heavy" patterns, or patterns with high contrast, or simply bright patterns. Those problems are much easier and you'll be able to easily see what's going on because there are only a few tens of numbers involved.


I don't think reducing the problem size will do any good. If I had some kind of result I could start with 2x2 to see how to get it to learn better and fine-tune the NN. But for now, I just want to see one more percentage point of recognition before I start analysing it.

Quote:
Original post by artificial stupidity
Quote:

Dividing all the weights by 20 did nothing? That's surprising.


Well, it did something. Now the hidden nodes' activation is between 0 and 1, but that didn't change the fact that the result is the same after the training :(
It seems that it does not learn at all.


Well, you said "no difference" before. So the problem of initially saturated values (which was a real problem) is gone now.

How are you updating your weights to attempt to learn? Something like a gradient method?

Quote:
I don't think reducing the problem size will do any good. If I had some kind of result I could start with 2x2 to see how to get it to learn better and fine-tune the NN. But for now, I just want to see one more percentage point of recognition before I start analysing it.


Reducing the size of the problem will let you follow the computations and see why it doesn't learn. Actually, I would start with 2 inputs, a single output neuron connected to them and try to make that neuron learn the function sigmoid(X+Y). In that case, you should be able to follow the algorithm with paper and a hand calculator.
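The whole experiment fits in a few lines; something like this (a rough sketch, constants picked arbitrarily):

#include <cstdlib>
#include <cmath>
#include <iostream>

using namespace std;

double sigmoid(double v) { return 1.0 / (1.0 + exp(-v)); }

int main() {
	double w1 = 0.1, w2 = -0.1, rate = 0.2;
	for(int i = 0; i < 10000; i++) {
		double x = 2.0 * rand() / RAND_MAX - 1.0;   // random in [-1,1]
		double y = 2.0 * rand() / RAND_MAX - 1.0;
		double out    = sigmoid(w1 * x + w2 * y);
		double target = sigmoid(x + y);
		// gradient of (1/2)(target - out)^2 brings in the o*(1-o) factor
		double delta = (target - out) * out * (1.0 - out);
		w1 += rate * delta * x;
		w2 += rate * delta * y;
	}
	cout << "w1 = " << w1 << ", w2 = " << w2 << " (both should head toward 1)" << endl;
}

If the weights don't head toward 1, step through a single update by hand and see which factor disagrees with your paper calculation.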

Quote:

How are you updating your weights to attempt to learn? Something like a gradient method?


I update the weight by adding the learning rate multiplied by the error of the node the link leads to, and also multiplied by the activation value of the node at the other end of the link. I tried different learning rates; at the moment it is set to 0.2.

Like this:


/* Update weights for hidden links */
for(int i = 0; i < nrOfOutput; i++) {
	for(int j = 0; j < nrOfHidden; j++) {
		Hidden[j]->weights[i] += (learningRate * Output[i]->error * Hidden[j]->activation);
	}
}

/* Update weights for input links */
for(int i = 0; i < nrOfHidden; i++) {
	for(int j = 0; j < nrOfInput; j++) {
		Input[j]->weights[i] += (learningRate * Hidden[i]->error * Input[j]->activation);
	}
}




Do you have any justification of why that formula should work? I'll think about it, but in principle you can minimize the error by going a bit against the gradient of the error squared (as a function of the weights). If you need help with the math, I can try to help you there.
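To make "against the gradient" concrete: for one output with squared error E = \tfrac{1}{2}(t - o)^2, where o = \sigma(s), s = \sum_j w_j a_j and \sigma'(s) = o(1 - o), the chain rule gives

\frac{\partial E}{\partial w_j} = -(t - o)\, o (1 - o)\, a_j, \qquad \Delta w_j = \eta\, (t - o)\, o (1 - o)\, a_j

which is exactly the posted output-layer update, with error = o(1-o)(t-o) and t set to 1 or 0. A strict derivation treats the hidden layer the same way, though: each hidden node's delta picks up its own h(1-h) factor on top of the weighted sum of output errors, which the posted hidden-error loop leaves out.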

Quote:
Original post by alvaro
Do you have any justification of why that formula should work? I'll think about it, but in principle you can minimize the error by going a bit against the gradient of the error squared (as a function of the weights). If you need help with the math, I can try to help you there.


I thought my way of doing this was a very common approach.
This is the tutorial I learned from:

http://www.generation5.org/content/2004/aiTabletOcr.asp

Do you know of any other weight-update functions?

Firstly, I've noticed you don't have any bias neurons. They are necessary for accurate operation of the NN; it will sort of work without them, but the accuracy will be greatly reduced. A bias neuron is a constant neuron with a -1 value.
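For example, you can fold one into each hidden node's sum as an extra, constant -1 input with its own trainable weight (biasWeight here is a hypothetical new member):

double total = 0.0;
for(int j = 0; j < nrOfInput; j++) {
	total += Input[j]->activation * Input[j]->weights[i];
}
total += -1.0 * Hidden[i]->biasWeight;   // constant -1 input, weight trained like any other
Hidden[i]->activation = sigmoid(total);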

Secondly, where did you come up with that formula for deciding how many hidden neurons you need? That is a big mistake: the architecture of the network is very problem-specific and also depends on the activation function used, so a random guess isn't going to work. You need to set this value manually and modify it as you see the results for each specific architecture.

Also, a good rule of thumb for weight initialization is the following:

the weights for each layer should be initialized in the range -rH to +rH, where rH is 1/sqrt(number of neurons for that layer). This is not necessary, but it often helps speed up learning; basically your network will learn no matter what the initial weights are, the time taken to learn will just vary.
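Applied to the posted initWeights(), one reading of that rule looks like this (the extra layerSize parameter is an addition):

#include <cmath>
#include <cstdlib>

void Node::initWeights(int nr, int layerSize) {
	float rH = 1.0f / sqrt((float)layerSize);   // rH = 1/sqrt(n) for this layer
	weights = new float[nr];
	for(int i = 0; i < nr; i++) {
		weights[i] = rH * (((rand() % 2000) / 1000.0f) - 1.0f);   // uniform in [-rH, +rH)
	}
}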

The other problem is that you're assuming the network will train in 1000 epochs; this is unrealistic. What you need to do is train the network until the accuracy is sufficient or the mean squared error is small enough. In your case I think 1000 is overly optimistic: NNs can often take thousands or even tens of thousands of epochs to train, especially since you're using a basic backpropagation training method and not using momentum to speed things up.
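Momentum just means remembering each weight's previous change and adding a fraction of it to the next one. For the hidden-to-output update it would look something like this (prevDelta and momentum are hypothetical new members):

for(int i = 0; i < nrOfOutput; i++) {
	for(int j = 0; j < nrOfHidden; j++) {
		float delta = learningRate * Output[i]->error * Hidden[j]->activation
		            + momentum * prevDelta[i][j];   // momentum is typically around 0.8-0.9
		Hidden[j]->weights[i] += delta;
		prevDelta[i][j] = delta;                    // remembered for the next update
	}
}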

Also, a learning rate of 0.2 may prove a little high. You could always set it a bit lower; it will increase the number of learning epochs necessary but may provide better accuracy.

The thing about NNs is that there is no method to determine the best learning rate or architecture for a given problem; it's all trial and error. If you want more info, check out my blog: I have quite in-depth tutorials on NNs and C++ source code. Maybe use my code to work out good architecture and learning-rate values.

I hope you come right!

PS: your bracket style is horrible :P Why don't you line the braces up neatly...
