artificial stupidity

Neural network, image recognition


Hi! I'm building a neural network which is supposed to recognize some smilies. Something seems to be wrong but I can't find it. First I train the network 1000 times by calling recognize() and then train() inside a loop. After that I do the same, but this time only calling recognize(), to see how many the NN could recognize. The percentage is always the same, 25%, which is exactly the probability without any training. The initial values for the link weights are set to some random real number between -1 and 1. The pixels in the picture are also normalized to between -1 and 1. Anyway, I really need some help with this. I have searched my code for the error for a long time and I believe I need some help :( Feel free to ask questions about my implementation. This is my first NN.
[SOURCE]

#include <iostream>
#include <fstream>
#include <cstring>
#include <time.h>
#include <cmath>
#include "Image.h"
#include "Node.h"
#include "Parser.h"
#include "Happy.h"

using namespace std;

void Happy::buildNetwork(vector<Image*> *images) {

	int w = (images->at(0))->w;
	int h = (images->at(0))->h;

	nrOfInput = w * h;
	nrOfOutput = 4;
	nrOfHidden =  (nrOfInput + nrOfOutput) / 2; 

	Input = new Node[nrOfInput];
	
	for(int i=0; i<nrOfInput; i++) {
		Node *n = new Node();
		n->initWeights(nrOfHidden);
		Input[i] = *n;
	}

	Hidden = new Node[nrOfHidden];
	for(int i=0; i<nrOfHidden; i++) {
		Node *n = new Node();
		n->initWeights(nrOfOutput);
		Hidden[i] = *n;
	}

	Output = new Node[nrOfOutput];
	for(int i=0; i<nrOfOutput; i++) {
		Output[i] = *(new Node());
	}
}


float Happy::sigmoid(float x) {
	return 1.0 / (1.0 + exp(-x));
}


int Happy::recognize(Image *image) {

	/* Assign values to the input nodes from the image */
	for(int i=0; i<nrOfInput; i++) {
		Input[i].activation = image->pixels[i];
	}

	/* Set hidden nodes activation/output */ 
	for(int i=0; i<nrOfHidden; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfInput; j++) {
			total += (Input[j].activation * Input[j].weights[i]);
		}

		Hidden[i].activation = sigmoid(total);
	}

	/* Set output nodes activation/output */
	for(int i=0; i<nrOfOutput; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfHidden; j++) {
			total += Hidden[j].activation * Hidden[j].weights[i];
		}
		
		Output[i].activation = sigmoid(total);
	}

	/* which output has biggest value? */
	int maxIndex = 0;
	for(int i =0; i<nrOfOutput; i++) {
		if(Output[i].activation > maxIndex) {
			maxIndex = i;
		}
	}

	/* the answers are given 1-4. That explains the maxIndex+1 */
	if(maxIndex+1 == image->answer) {
		correct++;
	}
	
	return maxIndex;
}


void Happy::train(Image* image) {


	/* Output error */
	for(int i=0; i<nrOfOutput; i++) {
		double activation = Output[i].activation;
		if(i + 1 == image->answer) {
			Output[i].error = activation * (1 - activation) * (1 - activation);
		} else {
			Output[i].error = activation * (1 - activation) * (0 - activation);
		}
	}

	/* Hidden error */
	for(int i = 0; i < nrOfHidden; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfOutput; j++) {
			total += Hidden[i].weights[j] * Output[j].error;
		}
		Hidden[i].error = total;
	}


	/* Input error */
	for(int i = 0; i < nrOfInput; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfHidden; j++) {
			total += Input[i].weights[j] * Hidden[j].error;
		}
		Input[i].error = total;
	}

	/* Update weights for hidden links*/
	for(int i = 0; i < nrOfOutput; i++) {
		for(int j=0; j < nrOfHidden; j++) {
			Hidden[j].weights[i] += (learningRate * Output[i].error * Hidden[j].activation);
		}
	}

	/* Update weights for input links*/
	for(int i = 0; i < nrOfHidden; i++) {
		for(int j=0; j < nrOfInput; j++) {
			Input[j].weights[i] += (learningRate * Hidden[i].error * Input[j].activation);
		}
	}
}

Happy::Happy() {

	this->learningRate = 0.2;
}


int main(int argc, char** argv) {


	if(argc != 3) {
		cout << "you fail" << endl;
	} else {
		if(!strcmp(argv[1], "train")) {

			Parser* p = new Parser();
			vector<Image*> *images = p->parseFile(argv[2]);

			Happy *h = new Happy();
			h->buildNetwork(images);
			h->correct = 0;

			/* Train */
			for(int i = 0;i< 1000; i++) {
				h->recognize(images->at(i));
				h->train(images->at(i));
			}
			
			/* reset the counter */
			h->correct = 0;

			for(int i = 0;i< 1000; i++) {
				h->recognize(images->at(i));
			}

			cout << "Correct: " << (h->correct/1000.0) * 100 << "%" <<  endl;
		}
	}

	return 0;
}
[/SOURCE]

Where/how do you initialise the weights and error values?

And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?

How many inputs do you have, out of interest? (To put it another way, how big are the images?)

I would have thought some simple logging would help here. For every attempt at recognition, you can output the results before training and after training and confirm that things change in the way you expect. Merely looking at the end success rate tells you next to nothing.
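To make the logging idea concrete, here is a minimal sketch (a hypothetical helper, not the poster's code) that compares the output activations before and after one call to train(); if the largest change stays at ~0 across many steps, the weight updates are not reaching the network at all:

```cpp
#include <cmath>

// Return the largest absolute change between two activation snapshots.
// 'before' and 'after' would be copies of the output layer's activations
// taken around a single train() call.
double maxActivationChange(const double* before, const double* after, int n) {
	double maxDelta = 0.0;
	for (int i = 0; i < n; ++i) {
		double d = std::fabs(after[i] - before[i]);
		if (d > maxDelta) maxDelta = d;
	}
	return maxDelta;
}
```

Printing this value once per training iteration is usually enough to see whether learning is happening at all, long before the final success rate is measured.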

Thanks for the answers! I'm a little worried about the weird copying.
Please take a look at my answer and let me know more about it :)


Quote:

Where/how do you initialise the weights and error values?

In the Node's constructor I set the error and activation to 0.
The weights are initialised by a call to the Node's initWeights() function.

Quote:

And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?


am I? :(
In Happy.h I store the Nodes like

Node* input
Node* output
Node* hidden

and then in Happy.cpp I allocate with:

input = new Node[nrOfInput]


Quote:

How many inputs do you have, out of interest? (To put it another way, how big are the images?)


20x20



I just noticed one more thing:

When I set the activation value for the hidden nodes, the value is always very high or very low. The result returned from the sigmoid function is always between 0.97 and 1 or between 0 and 0.03. Nothing inside the interval [0.03, 0.97](!!) Is it supposed to be like that?


for(int i=0; i<nrOfHidden; i++) {

	double total = 0.0;

	for(int j=0; j<nrOfInput; j++) {
		total += (Input[j].activation * Input[j].weights[i]);
	}

	Hidden[i].activation = sigmoid(total);
}




The input values are normalized to between -1 and 1, but more values are negative than positive (many pixels are white).

Quote:
Original post by artificial stupidity
Quote:

And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?


am I? :(

Yes. There's little point calling new to create an object, just to copy it straight afterwards. eg:
Node *n = new Node();
n->initWeights(nrOfHidden);
Input[i] = *n;

This creates a Node called 'n', sets it up, then you copy those values into Input[i], which I assume you've already set up correctly. This requires that your Nodes have the correct copy semantics, which I can't comment on without seeing the Node class's implementation. Also, you've leaked the memory for 'n' by doing nothing with it after you've copied its values. Do you come from a Java background, by any chance? :) If you want a temporary, just create one on the stack with "Node n;"
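To illustrate the point, here is a sketch with a hypothetical minimal Node (not the poster's actual class): the stack temporary is copied into the layer and then destroyed automatically, so nothing leaks:

```cpp
#include <vector>

// Hypothetical minimal Node with just the members the thread mentions.
struct Node {
	float activation;
	float error;
	Node() : activation(0.0f), error(0.0f) {}
};

// Fill a layer using stack temporaries. The leaky pattern from the post
// would be: Node* n = new Node(); layer[i] = *n;  // 'n' is never deleted.
std::vector<Node> buildLayer(int count) {
	std::vector<Node> layer(count);
	for (int i = 0; i < count; ++i) {
		Node n;        // stack temporary, destroyed at end of iteration
		layer[i] = n;  // plain copy; default copy semantics suffice here
	}
	return layer;
}
```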

Quote:
Original post by Kylotan
Quote:
Original post by artificial stupidity
Quote:

And you're doing some odd copying of Nodes - I hope you have the proper copy constructor and assignment operator?


am I? :(

Yes. There's little point calling new to create an object, just to copy it straight afterwards. eg:
Node *n = new Node();
n->initWeights(nrOfHidden);
Input[i] = *n;

This creates a Node called 'n', sets it up, then you copy those values into Input[i], which I assume you've already set up correctly. This requires that your Nodes have the correct copy semantics, which I can't comment on without seeing the Node class's implementation. Also, you've leaked the memory for 'n' by doing nothing with it after you've copied its values. Do you come from a Java background, by any chance? :) If you want a temporary, just create one on the stack with "Node n;"


Ok, thanks!
Is this better?
In the .h file I now have:

Node** input;






#include <iostream>
#include <fstream>
#include <cstring>
#include <time.h>
#include <cmath>
#include "Image.h"
#include "Node.h"
#include "Parser.h"
#include "Happy.h"

using namespace std;

void Happy::buildNetwork(vector<Image*> *images) {

	int w = (images->at(0))->w;
	int h = (images->at(0))->h;

	nrOfInput = w * h;
	nrOfOutput = 4;
	nrOfHidden = (nrOfInput + nrOfOutput) / 2;

	Input = new Node *[nrOfInput];

	for(int i=0; i<nrOfInput; i++) {
		Node *n = new Node();
		n->initWeights(nrOfHidden);
		Input[i] = n;
	}

	Hidden = new Node *[nrOfHidden];
	for(int i=0; i<nrOfHidden; i++) {
		Node *n = new Node();
		n->initWeights(nrOfOutput);
		Hidden[i] = n;
	}

	Output = new Node *[nrOfOutput];
	for(int i=0; i<nrOfOutput; i++) {
		Output[i] = new Node();
	}
}


float Happy::sigmoid(float x) {
	return 1.0 / (1.0 + exp(-x));
}

int Happy::recognize(Image *image) {

	/* Assign values to the input nodes from the image */
	for(int i=0; i<nrOfInput; i++) {
		Input[i]->activation = image->pixels[i];
	}

	// Set hidden nodes activation/output
	for(int i=0; i<nrOfHidden; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfInput; j++) {
			total += (Input[j]->activation * Input[j]->weights[i]);
		}

		Hidden[i]->activation = sigmoid(total);
	}

	// Set output nodes activation/output
	for(int i=0; i<nrOfOutput; i++) {

		double total = 0.0;

		for(int j=0; j<nrOfHidden; j++) {
			total += Hidden[j]->activation * Hidden[j]->weights[i];
		}

		Output[i]->activation = sigmoid(total);
	}

	// which output has biggest value?
	int maxIndex = 0;
	for(int i=0; i<nrOfOutput; i++) {
		if(Output[i]->activation > maxIndex) {
			maxIndex = i;
		}
	}

	// the answers are given 1-4. That explains the maxIndex+1
	if(maxIndex+1 == image->answer) {
		correct++;
	}

	return maxIndex;
}


void Happy::train(int answer) {

	/* Output error */
	for(int i=0; i<nrOfOutput; i++) {
		double activation = Output[i]->activation;

		if(i + 1 == answer) {
			Output[i]->error = activation * (1 - activation) * (1 - activation);
		} else {
			Output[i]->error = activation * (1 - activation) * (0 - activation);
		}
	}

	/* Hidden error */
	for(int i = 0; i < nrOfHidden; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfOutput; j++) {
			total += Hidden[i]->weights[j] * Output[j]->error;
		}
		Hidden[i]->error = total;
	}

	/* Input error */
	for(int i = 0; i < nrOfInput; i++) {
		double total = 0.0;
		for(int j=0; j < nrOfHidden; j++) {
			total += Input[i]->weights[j] * Hidden[j]->error;
		}
		Input[i]->error = total;
	}

	/* Update weights for hidden links */
	for(int i = 0; i < nrOfOutput; i++) {
		for(int j=0; j < nrOfHidden; j++) {
			Hidden[j]->weights[i] += (learningRate * Output[i]->error * Hidden[j]->activation);
		}
	}

	/* Update weights for input links */
	for(int i = 0; i < nrOfHidden; i++) {
		for(int j=0; j < nrOfInput; j++) {
			Input[j]->weights[i] += (learningRate * Hidden[i]->error * Input[j]->activation);
		}
	}
}

Happy::Happy() {

	this->learningRate = 0.2;
}


void print(Node n) {

	for(int i=0; i<10; i++) {
		cout << n.weights[i] << " ";
	}
	cout << endl;
}

int main(int argc, char** argv) {

	if(argc != 3) {
		cout << "you fail" << endl;
	} else {
		if(!strcmp(argv[1], "train")) {

			Parser* p = new Parser();
			vector<Image*> *images = p->parseFile(argv[2]);

			Happy *h = new Happy();
			h->buildNetwork(images);
			h->correct = 0;

			/* Train */
			for(int i = 0; i < 200; i++) {
				h->recognize(images->at(i));
				h->train(images->at(i)->answer);
			}

			/* reset the counter */
			h->correct = 0;

			for(int i = 0; i < 200; i++) {
				h->recognize(images->at(i));
			}

			cout << "Correct: " << (h->correct/200.0) * 100 << "%" << endl;
		}
	}

	return 0;
}




And by the way, do you know anything about the strange sigmoid values I wrote about earlier?

The "strange sigmoid values" simply mean that your weights are too large. You still haven't explained how you initialize your weights.

Think of what the situation is for a neuron in the hidden layer. You have 400 inputs between -1 and 1. Let's imagine that they are uniformly distributed. If you use weights that are also uniformly distributed in [-1,+1], the sum of weights * inputs will look very close to a normal distribution with a standard deviation around 6.5. That means that typical values will saturate the sigmoid function and you'll get outputs that are almost always 1 or 0. The situation is even worse if most inputs are either +1 or -1 (standard deviation jumps to over 11). If you start with weights in [-1/20,+1/20] you'll do much better. 20 is the square root of the number of inputs.
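The standard-deviation argument above can be checked numerically with a small sketch (not the poster's code): for independent uniform inputs and weights, the variance of the pre-activation sum is the product of the per-term variances times the number of inputs.

```cpp
#include <cmath>

// Standard deviation of sum(input[j] * weight[j]) for nInputs terms,
// with inputs uniform on [-1, 1] (variance 1/3) and weights uniform on
// [-weightRange, +weightRange] (variance weightRange^2 / 3).
double preActivationStdDev(int nInputs, double weightRange) {
	double varInput  = 1.0 / 3.0;
	double varWeight = weightRange * weightRange / 3.0;
	return std::sqrt(nInputs * varInput * varWeight);
}
```

For 400 inputs with weights in [-1, +1] this gives 20/3 ≈ 6.7, in line with the estimate above; scaling the weight range down to 1/20 = 1/sqrt(400) brings the standard deviation to 1/3, well inside the sigmoid's linear region.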

On the programming part of it, since you are using C++, you should consider using STL containers instead of managing your own memory, particularly since you are not very familiar with memory management.
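As a sketch of that suggestion (the Node/Network layout here is hypothetical, mirroring the thread's structure): with std::vector there is no new[]/delete[] to get wrong and no leaked temporaries.

```cpp
#include <vector>

// Hypothetical minimal node: one weight per node in the next layer.
struct Node {
	float activation = 0.0f;
	float error = 0.0f;
	std::vector<float> weights;
};

struct Network {
	std::vector<Node> input, hidden, output;
};

// Build the same topology as the thread's buildNetwork(), but with
// containers that free themselves when the Network goes out of scope.
Network makeNetwork(int w, int h) {
	Network net;
	int nIn = w * h;
	int nOut = 4;
	int nHidden = (nIn + nOut) / 2;
	net.input.resize(nIn);
	net.hidden.resize(nHidden);
	net.output.resize(nOut);
	for (Node& n : net.input)  n.weights.assign(nHidden, 0.0f);
	for (Node& n : net.hidden) n.weights.assign(nOut, 0.0f);
	return net;
}
```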

Quote:

The "strange sigmoid values" simply mean that your weights are too large. You still haven't explained how you initialize your weights.


Oh sorry, here are the weight function and the normalizing function for the input values. I modified it with a division by 20 at the end. No difference :(


#include <iostream>
#include <time.h>
#include "Node.h"

Node::Node() {

	activation = 0.0;
	error = 0.0;
}


void Node::initWeights(int nr) {

	srand(time(NULL));
	weights = new float[nr];
	float avg = 0.0;
	for(int i = 0; i<nr; i++) {
		weights[i] = ((rand() % 200) / 100.0) - 1.0;
		weights[i] /= 20.0;
	}
}






#include "Image.h"

Image::Image(string name, float* pixels, int w, int h) {

	this->name = name;
	this->pixels = pixels;
	this->w = w;
	this->h = h;

	normalize();
}

void Image::normalize() {

	float max = 0;

	for(int i=0; i<(w*h); i++) {
		if(pixels[i] > max) {
			max = pixels[i];
		}
	}

	for(int i=0; i<(w*h); i++) {
		pixels[i] /= max;
		pixels[i] *= 2.0;
		pixels[i] -= 1;
		pixels[i] /= 20.0;
	}
}





Quote:

Think of what the situation is for a neuron in the hidden layer. You have 400 inputs between -1 and 1. Let's imagine that they are uniformly distributed. If you use weights that are also uniformly distributed in [-1,+1], the sum of weights * inputs will look very close to a normal distribution with a standard deviation around 6.5. That means that typical values will saturate the sigmoid function and you'll get outputs that are almost always 1 or 0. The situation is even worse if most inputs are either +1 or -1 (standard deviation jumps to over 11). If you start with weights in [-1/20,+1/20] you'll do much better. `20' is the square root of the number of inputs.

On the programming part of it, since you are using C++, you should consider using STL containers instead of managing your own memory, particularly since you are not very familiar with memory management.


Dividing all the weights by 20 did nothing? That's surprising.

Can you dump what all the inputs and weights are for one random node and see what you get? It looks like you are going to have to resort to that type of diagnostic tool. You can probably also benefit from reducing the size of your problem. For instance, start with 2x2, and try to recognize "bottom-heavy" patterns, or patterns with high contrast, or simply bright patterns. Those problems are much easier and you'll be able to easily see what's going on because there are only a few tens of numbers involved.
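One way to do that dump (a hypothetical helper, not code from the thread) is to print each weight and flag anything outside the expected range; out-of-range or denormal values usually point at uninitialized or out-of-bounds memory rather than a learning problem:

```cpp
#include <cmath>
#include <cstdio>

// Print n weights and count how many fall outside [-bound, +bound].
// With the 1/20 scaling discussed above, bound would be 0.05.
int dumpWeights(const float* weights, int n, float bound) {
	int suspicious = 0;
	for (int i = 0; i < n; ++i) {
		bool bad = std::fabs(weights[i]) > bound;
		std::printf("w[%d] = %g%s\n", i, weights[i],
		            bad ? "  <-- out of range" : "");
		if (bad) ++suspicious;
	}
	return suspicious;
}
```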

Quote:

Dividing all the weights by 20 did nothing? That's surprising.


Well, it did something. Now the hidden nodes' activation is between 0 and 1, but that didn't change the fact that the result is the same after the training :(
It seems that it does not learn at all.

Quote:

Can you dump what all the inputs and weights are for one random node and see what you get?


I could dump the input and weights, but then I don't know what to look for.
The numbers are small, but I have the weight init bounded to ±1/20.
These are 10 random weights:

-0.0176376 -0.0176381 0.00177376 0.0198532 2.38221e-44 0 0.22407 -3.25956e-05 1.726e-39 0

Quote:

It looks like you are going to have to resort to that type of diagnostic tool. You can probably also benefit from reducing the size of your problem. For instance, start with 2x2, and try to recognize "bottom-heavy" patterns, or patterns with high contrast, or simply bright patterns. Those problems are much easier and you'll be able to easily see what's going on because there are only a few tens of numbers involved.


I don't think reducing the problem size will do any good. If I had some kind of result, I could start with 2x2 to see how to get it to learn better and fine-tune the NN. But for now, I just wanna see one more percentage of recognition before I start analysing it.
