Neural networks from the beginning

Started by
5 comments, last by h8CplusplusGuru 2 years, 6 months ago

I remember the old days when people started to write NNs. There were plenty of articles describing how to actually write one. Today I can't find any reliable source of information.

Basically, what I want to learn is how to implement one. Let's say a letter-recognition NN.

So, as I remember, in the beginning we need to define the input.

So for training's sake, let it be a 32x32 bool array, defining for each pixel whether there is something there or not.

Now the hard part: what comes next? It seems there has to be some kind of hardcoded function that passes something along. I read one modern tutorial. It checks 4 pixels at a time horizontally and then creates the next layer based on this, with some specified weight, like a hashed value of which pixels are filled (true). Then, having sorted out all these new-layer neurons, you can actually assign them to the desired letter?

Am I missing something?

A similar question: how do they do face recognition? I mean, by finding the eye, nose, and mouth positions?

I would like to write everything from the beginning to get a taste of it.


Here is some code to classify images:

https://github.com/sjhalayka/ann_image_classification

_WeirdCat_ said:
Am I missing something?

If you are asking about tools that you can plug in and get an answer, there are tools for that.

But I think you are asking about programming one yourself.

SHORT VERSION: There are plenty of great books on the topic, including completely free amazing online textbooks like this and this.

LONG VERSION: The question is one of those types where if you don't know what to ask, you probably aren't ready for the answer. Most people are not ready for the math involved, nor will they ever study the math required.

The entire field is based on statistics and probabilistic math. Topics like image processing require knowledge of how pictures are encoded, which usually gets into signals processing and data compression; you can use image processing libraries to simplify some steps, but you'll still need to do math on them. Usually neural networks are an optional topic in the fourth or fifth year of university studies, or covered in graduate-level specialty topics.

_WeirdCat_ said:
So for training's sake, let it be a 32x32 bool array, defining for each pixel whether there is something there or not.

When you want to recognize a pattern in noisy data that has examples, often either a multi-layer perceptron or a radial basis function network is used for memorizing the data.

You have to figure out what input you plan on giving the machine, then provide lots and lots of example data sets to learn from. Usually you need the training set to be at least 10 times the number of features.

If you went with the easy route of feeding a full 32x32 array, that's 1024 features, so you'd need about 10K examples with a good mix showing where something is and where it is not.

You described a process of running a kernel over the array to identify some feature and extract it. A 4x4 kernel like you described would reduce it to 29x29 (32 − 4 + 1 per side), needing roughly 8K examples in your training data.
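The kernel idea can be sketched directly. This is a minimal illustration, not the tutorial's actual method: it slides a 4x4 window over the 32x32 bool array and simply counts filled pixels under the window, producing a 29x29 feature map (a real network would learn the kernel weights instead of counting).

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Slide a KxK kernel over an NxN binary image ("valid" convolution).
// Each output cell counts how many of the K*K pixels under the window are set.
// Output side length is N - K + 1, so a 4x4 kernel over 32x32 gives 29x29.
constexpr std::size_t N = 32;
constexpr std::size_t K = 4;

std::vector<std::vector<int>> convolve(const std::array<std::array<bool, N>, N>& img) {
    const std::size_t out = N - K + 1;
    std::vector<std::vector<int>> result(out, std::vector<int>(out, 0));
    for (std::size_t y = 0; y < out; ++y)
        for (std::size_t x = 0; x < out; ++x)
            for (std::size_t ky = 0; ky < K; ++ky)
                for (std::size_t kx = 0; kx < K; ++kx)
                    result[y][x] += img[y + ky][x + kx] ? 1 : 0;
    return result;
}
```

Each of the 29x29 outputs could then be fed into the next layer as one feature.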

That might be exactly what you are looking for, or you might do better by using different learning algorithms, or by choosing better inputs by processing your data in other ways.

I did a bunch of this in a series of graduate-level courses. It is extremely math-intensive to understand and build, but once you've developed the system, actually using a recognizer is quite easy.

_WeirdCat_ said:
A similar question: how do they do face recognition? I mean, by finding the eye, nose, and mouth positions?

I did a bit of this in one class, detecting a hand that was pointing with a finger on a contrasting background. We talked about faces and facial recognition, but it gets really complex, really quickly.

For faces specifically, the code must do a bunch of image processing to identify and extract useful information. The field is called image registration: you register where different elements are in the images.

On a face, you might start by looking for skin tones at all. Often that is done by converting from RGB to the HSV color model and looking for skin-tone hues. That is much easier in HSV space, since skin tones are similar hues regardless of a person's race, and in both bright and dim lighting. Sometimes a second pass is done in YCbCr space; if you mask with both, they can be tuned for >99% accuracy. (This was the first step in looking for a hand: seeing whether there are any potentially-skin pixels at all, and building a mask to know where they are.)
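A minimal sketch of that first skin-tone pass, with illustrative, untuned hue/saturation/value thresholds (the band limits here are placeholders, not the tuned values described above):

```cpp
#include <algorithm>
#include <cmath>

// Convert an RGB pixel (each channel in 0..1) to HSV, then test whether its
// hue falls in a rough skin-tone band.
struct HSV { double h, s, v; };  // h in degrees [0, 360)

HSV rgbToHsv(double r, double g, double b) {
    double mx = std::max({r, g, b}), mn = std::min({r, g, b});
    double d = mx - mn;
    double h = 0.0;
    if (d > 0.0) {
        if (mx == r)      h = 60.0 * std::fmod((g - b) / d, 6.0);
        else if (mx == g) h = 60.0 * ((b - r) / d + 2.0);
        else              h = 60.0 * ((r - g) / d + 4.0);
        if (h < 0.0) h += 360.0;
    }
    double s = (mx > 0.0) ? d / mx : 0.0;
    return { h, s, mx };
}

bool maybeSkin(double r, double g, double b) {
    HSV p = rgbToHsv(r, g, b);
    // Hypothetical band: skin hues cluster near red/orange, which is why a
    // hue threshold survives changes in lighting better than RGB thresholds.
    return p.h < 40.0 && p.s > 0.15 && p.v > 0.2;
}
```

Running `maybeSkin` over every pixel gives the binary mask mentioned above.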

Then, when you think you're looking at flesh tones, you might run a kernel across the region to search for near-white surrounding near-dark; those might be eyes. Then look for dark spots near each other; those might be nostrils. Look for patterns that indicate a mouth, open or closed. Then feed those in as inputs to figure out whether you might be looking at a face. You'll need to do more processing to make it rotation invariant; an upside-down face is still a face. (With a hand, you can identify a basically-square shape for the fist and an attached rectangle for a finger: look for roughly 4x4 and 4x1 regions, and possibly an arm as a 3-wide rectangle of arbitrary length running out, potentially to the edge of the image. No need to recognize finer details like eyes, noses, or mouths.)

If you want to recognize a specific face among many other faces, you'll need to register a collection of data points on the face beyond just the eye, nose, and mouth points: cheeks, forehead, and ears as well, plus registration data about the orientation of the head.

After you've got all the registration data, you can compare it to a database of known faces. You can then use statistical methods (including neural networks) to identify faces that are “close enough” to the sample. Naturally “close enough” requires tuning, and suffers from the curse of dimensionality; the more features you have, the more examples and processing are required to find matches.
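The "close enough" comparison can be sketched as a simple distance test over the registration features. The feature vector contents and the threshold here are hypothetical; the threshold is exactly the knob that needs tuning, and with more features, distances concentrate and tuning gets harder (the curse of dimensionality mentioned above):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean distance between two feature vectors (e.g. eye spacing,
// nose-to-mouth distance, ... all hypothetical registration measurements).
double distance(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// A sample matches a known face when its features are within a tuned radius.
bool closeEnough(const std::vector<double>& sample,
                 const std::vector<double>& known,
                 double threshold) {
    return distance(sample, known) < threshold;
}
```

In practice you would compare the sample against every database entry and take the best match, or feed the features into a trained classifier instead of a fixed threshold.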

Hi,

Not sure if I can help, but it seems to me that if you want to do things from the ground up, you need to cover the basics first: have a look at the perceptron, how it works, and how it can be trained to recognize rudimentary images.

The basics are really not that hard. You have a perceptron and an activation function, you have weights, you train it, and you can do very simple image detection. Here is something to read on that, if you are interested: https://steemit.com/ai/@sykochica/how-image-recognition-works-part-1-basics-and-perceptrons

But perceptrons are limited.

They are from the early days of AI, and you probably want to end up with something more sophisticated, maybe even deep learning.

And yes there is plenty of math involved.
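The perceptron idea above fits in a few lines of C++. This is a minimal sketch, not production code: a single unit with a step activation, trained with the classic perceptron rule (the learning rate is an arbitrary choice for the demo).

```cpp
// A single perceptron: weighted sum of inputs, step activation, and the
// perceptron learning rule (nudge weights by error * input).
struct Perceptron {
    double w0 = 0.0, w1 = 0.0, bias = 0.0;

    int predict(double x0, double x1) const {
        return (w0 * x0 + w1 * x1 + bias) > 0.0 ? 1 : 0;  // step activation
    }

    void train(double x0, double x1, int target, double rate = 0.1) {
        int error = target - predict(x0, x1);  // -1, 0, or +1
        w0   += rate * error * x0;
        w1   += rate * error * x1;
        bias += rate * error;
    }
};
```

Trained on the four input patterns of AND, it converges in a handful of epochs because AND is linearly separable; XOR is not, which is exactly the limitation of single perceptrons mentioned above.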

The tricky thing is really not the neural networks themselves. They are mostly quite simple beasts.

Inputs, outputs, weights, and simulated neurons. They can be implemented in a midnight coding session, if you feel like it.

The trick is how you train them, and even more, how you train complex neural networks.

Complex ones are those with many "neurons", in many layers.

Training is where all the CPU/GPU time goes. "Executing" or "running" them is peanuts compared to that.

I am light-years from being an expert on the topic, but I once spent a huge amount of time trying to create a framework for training neural networks, all done in Java.

Feel free to browse (and flame) the code if you like (it's my own style of Java coding, so sue me :).

Anyway, it is here: https://github.com/JoystickAndCursorKeys/SimSpot

The actual neural-network part of it is quite small; the training part is huge. The demo projects could be anything, but I only implemented a virtual pet that wanders around in mazes.

Anyway, the code is not exactly what you want, but maybe some part is helpful. And if you want, I am happy to answer questions on it, since, as I said, some years ago I spent way too much time playing with NNs.

Cheers

/C

Chao55 / Retro Games, Programming, AI, Space and Robots

Currently working on HyperPyxel paint program - http://www.hyperpyxel.com and an asteroids clone under my "Game1" JavaScript game "engine".

OK, this is easy to find out how to do: Google "backpropagation cpp".

You need to write a backpropagation algorithm in C++.

I can post my code, though there are bugs I haven't had time to fix. It feeds forward and backpropagates; I will post it when I can.

Hello, I wrote this code as part of a 2-day competition, so it is a first draft and I haven't had time to work on it; there are some issues I will mention. It does work: it feeds forward and backpropagates, but only when all the layers are the same size (input size = hidden size = output size). If you change that, there is a memory error. The issue has to do with the algorithm's matrix math, which needs to be worked on. Besides that, it works well enough to give you a large head start toward writing a neural network in C++. (I think it's kind of a big deal to be posting this code, it is special to me; but here it is anyway.)

Good luck, and I don't know if I'll be able to help you later. Here is the file as a paste:

/*
	Nicholas A Komsa
	4-3-21
	neural network rough draft for ysu hackathon
	

*/
//made referencing this wonderful guide: https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/

#include <array> 
#include <random>
#include <iostream>
#include <armadillo>


class Network {
public:
	struct Layer {

		arma::mat results;
		arma::mat weights;
		arma::mat biases;

		double errorPrime{ 0 }; // accumulated error term (note: never reset between training iterations)
		arma::mat updatedWeights;

		Layer() = default;
		Layer(const arma::mat& results)
			: results(results)
		{}
		Layer(const arma::mat& weights, const arma::mat& biases)
			:weights(weights)
			, biases(biases)
			, updatedWeights(weights)
			, results( biases )
		{}
	};

	std::vector<Layer> mLayers;

	double sigmoid(double z) {
		return 1.0 / (1.0 + std::exp(-z));
	}

	double sigmoidPrime(double out) {
		return out * (1.0 - out);
	}

	//this is the error
	double errorPrime(double out, double target) {
		return out - target;
	}

	double netPrime(double out) { // double, to match the rest of the network
		return out;
	}

public:
	double totalPrimeOutput(double out, double target, double net) {
		return errorPrime(out, target) * sigmoidPrime(out) * netPrime(net);
	}

public:


	void setNetwork(std::vector<Layer >&& layers) {

 		mLayers = std::move(layers);
	}

	arma::vec error(const arma::vec& desired, const arma::vec& current) {
		//squared error function
		arma::vec diff(desired.size());
		for (int i = 0; i < desired.size(); ++i) {
			
			double d = desired[i] - current[i];
			diff[i] = 0.5 * d * d;

		}

		return diff;
	}

	const arma::mat& feedForward() {

		for (int i = 1; i < mLayers.size(); ++i) {

			mLayers[i].results = mLayers[i-1].results * mLayers[i].weights + mLayers[i].biases;

			mLayers[i].results.for_each([&](auto& v) {

				v = sigmoid(v);

				});
		}

		return mLayers.back().results;
	}
	
	void setup(const std::vector<int>& layerSizes) {

		//layer 1 = input;
		//layer last = output

		arma::mat inputLayer(1, layerSizes.front() );
		
		mLayers.push_back({ inputLayer });

		std::uniform_real_distribution<double> reals(0.1,0.9);
		std::random_device rd;
		std::mt19937 random{ rd() };

		for (int l = 1; l < layerSizes.size(); ++l) {

			int rows = layerSizes[l-1], columns = layerSizes[l];

			mLayers.push_back({ arma::mat(rows, columns), arma::mat(1, columns) } );

			mLayers[l].weights.for_each([&](auto& v) {
				v = reals(random);
			});
			mLayers[l].biases.for_each([&](auto& v) {
				v = reals(random);
			});
		}
	}
};

int main() {

	Network network;

	/*
	
	arma::mat l1(2, 2);

	l1 = {  {0.15, 0.25},
			{0.2,  0.3} };

	arma::mat b1(1, 2);

	b1 = { {0.35, 0.35} };

	arma::mat l2(2, 2);

	l2 = { {0.4,	0.5},
		   {0.45,   0.55} };

	arma::mat b2(1, 2);

	b2 = { {0.6, 0.6} };

	arma::mat input(1, 2);

	input = { {0.05, 0.10} };
	network.setNetwork({ input, {l1, b1 }, {l2,b2} });
	*/
	

	network.setup({ 3, 3, 3, 3, 3 });

	std::cout << "\ninput is: ";
	network.mLayers.front().results.for_each([&](auto& v) {
		std::cout << v << ", ";

		});
	std::cout << std::endl;

	const int trialNum = 1;

	for (int trial = 0; trial < trialNum; ++trial) {

		bool printData = false;
		if (trial == 0 || trial == trialNum - 1) printData = true;

		auto& output = network.feedForward();

		if (printData) {
			std::cout << "output: ";
			output.for_each([&](auto& v) {

				std::cout << v << ", ";

				});
			std::cout << std::endl;
		}

		arma::vec desired = { 0.06, 0.1, 0.73 };

		if (printData) {
			//we only have one training data atm : transform input into desired
			std::cout << "Desired output: ";
			desired.for_each([&](auto& v) {
				std::cout << v << ", ";
				});


			arma::vec error = network.error(desired, arma::vectorise(output));

			std::cout << "\nerror: ";
			double totalError = 0;
			error.for_each([&](auto& v) {

				totalError += v;
				std::cout << v << ", ";

				});
			std::cout << "total error: " << totalError << std::endl;
		}


		//feed the output back into the network
		auto backpropOutput = [&]() {

			int layer = network.mLayers.size() - 2;

			for (int node = 0; node < network.mLayers[layer].results.n_cols; ++node) {

				double result = network.totalPrimeOutput(network.mLayers[layer + 1].results[node], desired[node], network.mLayers[layer].results[node]);

				for (int weight = 0; weight < network.mLayers[layer].weights.n_cols; ++weight) {

					network.mLayers[layer + 1].updatedWeights.at(weight, node) = network.mLayers[layer + 1].weights.at(weight, node) - 0.5 * result;
				}
			}

			if (printData) {
				std::cout << "\nlayer: " << layer;
				network.mLayers[layer + 1].updatedWeights.for_each([&](auto& v) {

					std::cout << "\nbackprop: " << v;

					});
				std::cout << std::endl;
			}
		};
		backpropOutput();

		//feed each layer before the output back into the network toward the input
		auto backpropLayer = [&]() {

			for (int layer = network.mLayers.size() - 3; layer >= 0; --layer) {

				auto& layer0 = network.mLayers[layer];
				auto& layer2 = network.mLayers[layer + 2];

				for (int node = 0; node < layer0.results.n_cols; ++node) {

					double e = network.errorPrime(output[node] , desired[node]) * network.sigmoidPrime(layer2.results[node]) * layer2.weights[node];

					layer0.errorPrime += e;
				}

				auto& layer1 = network.mLayers[layer + 1];

				for (int node = 0; node < layer0.results.n_cols; ++node) {

					double eTotalPrime = layer0.errorPrime * network.sigmoidPrime(layer1.results[node]) * layer0.results[node];

					for (int weight = 0; weight < layer1.weights.n_cols; ++weight) {

						layer1.updatedWeights.at(weight, node) = layer1.weights.at(weight, node) - 0.5 * eTotalPrime;
					}
				}

				if (printData) {
					std::cout << "\nlayer: " << layer;
					layer1.updatedWeights.for_each([&](auto& v) {

						std::cout << "\nbackprop: " << v;

						});
					std::cout << std::endl << std::endl;
				}
			}
		};
		backpropLayer();

		auto updateNetwork = [&]() {

			for (int layer = 0; layer < network.mLayers.size(); ++layer) {

				for (int node = 0; node < network.mLayers[layer].updatedWeights.n_elem; ++node) {

					network.mLayers[layer].weights[node] = network.mLayers[layer].updatedWeights[node];

				}
			}
		};
		updateNetwork();
	}

	//input should be unchanged:
	std::cout << "\ninput is: ";
	network.mLayers.front().results.for_each([&](auto& v) {
		std::cout << v << ", ";

		});




	return 0;
}

This topic is closed to new replies.
