My Neural Network Tutorials

Started by Coldon · 21 comments, last by Coldon 15 years, 10 months ago
I've written a pretty large and comprehensive neural network tutorial; I thought I'd post it here in case anyone is interested: Part 1 - Theory, Part 2 - Implementation. Any comments or criticism would be welcome. [Edited by - Coldon on May 7, 2008 4:43:26 PM]

"In theory, theory and practice are the same. In Practice, they never are."
My Technical Blog : http://www.takinginitiative.net/

I still don't see how to use NNs practically in games. Are there any tutorials out there on that?
HAHA!

Yeah, that is a problem. A tutorial on that is coming once I manage to find some time.

"In theory, theory and practice are the same. In Practice, they never are."
My Technical Blog : http://www.takinginitiative.net/

I read the whole thing

1. Overall a great tutorial
2. The graphs where you have Input1 and Input2 on the x and y axes are a little confusing. There should be a little more description of what they represent, and maybe some discussion of what they show. I was also not able to understand the other one with the hidden layer, and what exactly is shown on the graph with the hyperplanes.
3. I was also a little unclear on the purpose of the bias boxes. Why is their weight -1? Why not 1? What is the significance of that -1? Is a network with bias boxes equivalent to one without? I mean... are they there for completeness, or for complexity? (See the note after this list.)
4. I completely agree with you on the Object Orientation. My friends are software engineers and they have been spoon-fed that bullshit for years... we run into heated discussions about that very topic all the time. They always quote extensibility and shit like that, but I will always be done with a task before they can type "public static void xyz extends abc throws def {....". There are just some things where OO is not a good idea, and they REFUSE to understand that. It infuriates me.
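Re: point 3, a quick note (assuming the tutorial's convention of a constant -1 output from the bias node; the sign itself is just a convention):

y = f( \sum_i w_i x_i + w_b \cdot (-1) ) = f( \sum_i w_i x_i - w_b )

So the trainable bias weight w_b acts as a learned threshold that shifts the activation function left or right. With a +1 bias input the weight would simply learn the negated value, so the two conventions are equivalent. A network without bias nodes is strictly weaker, though: every decision hyperplane is then forced to pass through the origin.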
My one remark, and it's on the use of Neural Nets in general, not the tutorial. I've worked through two different Neural Net projects, and have invested a bit of thinking into them. I always come back to the idea that Neural Nets really do nothing by themselves... You need to make these gigantic assumptions, like "Hey, the data I'm trying to classify is separable by the boundary of a certain curve." I took data mining, and it just seems like there is way too much human input for it to be truly interesting.

Another thing is the use of squared error when training a network (in gradient descent). It makes the math easy, but the error is unfairly weighted: data far from the correct value counts far more than data that is close. This always bothers me, even in something as simple as linear regression.
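Concretely (standard definitions, nothing specific to Coldon's code):

E = \frac{1}{2} \sum_k (t_k - o_k)^2, \qquad \frac{\partial E}{\partial o_k} = -(t_k - o_k)

So a point that is 10 units off contributes 100 times the error of a point that is 1 unit off. That quadratic weighting is exactly what makes outliers dominate the fit, in gradient descent and in ordinary least-squares regression alike.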
Quote:Original text from Coldon's tutorial
Now I’ve seen various implementations and wait for it… here comes an OO rant: I don’t understand why people feel the need to encapsulate everything in classes.
...
So below is how I structured my neural network and afaik it’s as efficient as possible. If anyone can further optimize my implementation please do!

That sounds like a challenge ;)
Can we OO-ify your code and retain the speed? I'll get back to you on that...
[edit]
On the topic of improving the speed - I'm sure some use of SIMD and multi-threading would work wonders here ;)
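e.g. the feed-forward inner loop is just a dense dot product, which is ideal SIMD/threading material. A rough sketch with hypothetical names, not Coldon's actual code:

#include <cmath>

// Each output neuron is an independent dot product over the inputs, so the
// outer loop can be split across threads, and the inner loop runs over
// contiguous arrays with no branches - easy meat for auto-vectorization
// or hand-written SIMD intrinsics.
void feedForwardLayer(const float* inputs, int nIn,
                      const float* weights, // nOut x nIn, row-major
                      float* outputs, int nOut)
{
    for (int j = 0; j < nOut; ++j) // candidate for multi-threading
    {
        float sum = 0.0f;
        const float* w = weights + j * nIn;
        for (int i = 0; i < nIn; ++i) // candidate for SIMD
            sum += w[i] * inputs[i];
        outputs[j] = 1.0f / (1.0f + std::exp(-sum)); // sigmoid activation
    }
}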

[Edited by - Hodgman on May 14, 2008 2:13:07 AM]
I took a look at the code and I think that there's a lot of things that could be improved.

1. Use std::vector<> instead of arrays. Should the class neuralNetwork for some reason throw an exception during construction, it will leak memory like a sieve. Using vectors you don't need to write all those deletes in the destructor either, and vectors conveniently know their sizes. (See the sketch after this list.)

2. Separate conceptually different things. There's no reason why the neuralNetwork class should contain all that code for running training. The code would be better organized if the neuralNetwork class would only represent the network and have the propagation methods. Training should be handled by a separate class.

3. Keep headers small. Many of the methods in neuralNetwork class are large enough to warrant placing in a .cpp file. This speeds up compilation when using your library and makes it easier to read the headers for documentation.

4. BPNs don't usually require weights to be initialized to random values. That's just an old 'superstition'. In any case, calling srand() in the bowels of library code is not a good idea. It should be left to the main application.

5. Well commented is not the same thing as having comments all over the place. It's more important to have useful comments. Comments like 'train the network' before a function named trainNetwork, or 'return results' before a return statement, are not really useful.
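For the record, here's a minimal sketch of what points 1 and 2 could look like (the member names are illustrative, not Coldon's actual API; dataEntry is assumed declared elsewhere):

#include <vector>

class dataEntry; // from the tutorial's data-loading code

// Point 1: vector members clean themselves up, so there are no deletes
// in the destructor and nothing leaks if the constructor throws halfway.
class neuralNetwork
{
public:
    neuralNetwork(int nInput, int nHidden, int nOutput);
    std::vector<double> feedForward(const std::vector<double>& inputs);
private:
    std::vector<double> inputNeurons, hiddenNeurons, outputNeurons;
    std::vector<double> wInputHidden, wHiddenOutput; // flattened weight matrices
};

// Point 2: training is a separate class that just holds a reference to
// the network it trains; the network itself only knows how to propagate.
class networkTrainer
{
public:
    explicit networkTrainer(neuralNetwork& nn) : network(nn), learningRate(0.001) {}
    void trainEpoch(const std::vector<dataEntry*>& trainingSet);
private:
    neuralNetwork& network;
    double learningRate;
};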
Quote:Original post by SnotBob
I took a look at the code and I think that there's a lot of things that could be improved.


I agree. To add to the other points,

1) I see you pass vector<dataEntry*> by value many times, which is going to create a copy of the entire vector every time. You should pass it as a "const vector<dataEntry*> &", or non-const only when it will be used to return something. (See the sketch after this list.)

2) If you're going to return the internal variables (like a dataSet *), you should make it const.

3) It's bad form in C++ to place a 'using namespace' in a header file, since anything that includes it can get unexpected name conflicts.

4) There's no point using #define for constants. You could just place "const double LEARNING_RATE = 0.001;" in a header and it will work the same.
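To make points 1, 2 and 4 concrete (a sketch; dataEntry and dataSet are the types from Coldon's code, the rest is illustrative):

#include <vector>

const double LEARNING_RATE = 0.001; // point 4: a typed const, not a #define

class dataEntry; // from Coldon's code
class dataSet;   // likewise

class example
{
public:
    // point 1: pass by const reference - no copy of the whole vector
    void setTrainingData(const std::vector<dataEntry*>& data);

    // point 2: expose internal state through a const pointer
    const dataSet* getDataSet() const { return set; }

private:
    dataSet* set;
};

// point 3: note there is no 'using namespace std;' anywhere in this header.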

Apart from that, I don't know why you had the huge rant about OO being slow and a waste of time. You seem to have encapsulated your code well enough for my liking. However, it is true that most university professors overdo it a bit, trying to teach dim-witted students the merits of good development practices :)
Quote:Original post by SnotBob
I took a look at the code and I think that there's a lot of things that could be improved.

1. Use std::vector<> instead of arrays. Should the class neuralNetwork for some reason throw an exception during construction it will leak memory like a sieve. Using vectors you don't need to write all those deletes in the destructor either and vectors conveniently know their sizes.

2. Separate conceptually different things. There's no reason why the neuralNetwork class should contain all that code for running training. The code would be better organized if the neuralNetwork class would only represent the network and have the propagation methods. Training should be handled by a separate class.

3. Keep headers small. Many of the methods in neuralNetwork class are large enough to warrant placing in a .cpp file. This speeds up compilation when using your library and makes it easier to read the headers for documentation.

4. BPNs don't usually require weights to be initialized to random values. That's just an old 'superstition'. In any case, calling srand() in the bowels of library code is not a good idea. It should be left to the main application.

5. Well commented is not the same thing as having comments all over the place. It's more important to have useful comments. Comments like 'train the network' before a function named trainNetwork, or 'return results' before a return statement, are not really useful.


In my implementation there is honestly no need for vectors; yes, I have extra deletes in the destructor. There is no performance gain in switching to vectors. As for exceptions, I don't really see your point.

Your idea would over-complicate things: why separate it into two classes when they are tied together? Not to mention all the linking you'd have to do between the classes; it just adds complexity where it's not necessary.

I don't like the separation into two files; afaik there is no major performance loss during compilation, and keeping everything in the header makes debugging a lot easier. I wish I had my C++ optimization book here to check that, but it's at the office. I'll take a look tomorrow and report back.

As for initializing the weights, you're completely mistaken. If you don't initialize the weights to random values, then running the same data through will always produce the same result. That's the whole reason the weights are initialized randomly: even if the data is the same, the network can produce different results, and for any dataset there can be multiple completely different weight sets that give the same accuracy.
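i.e. the initialization looks something like this (a sketch, not my exact code; the +/-0.5 range is just a common choice, and the seeding is done once by the application, as suggested above):

#include <cstdlib>
#include <ctime>
#include <vector>

// in the application's startup code, once - not inside the library:
// srand( (unsigned)time(0) );

void initializeWeights(std::vector<double>& weights)
{
    for (size_t i = 0; i < weights.size(); ++i)
    {
        // random value in [-0.5, 0.5] so different runs start from
        // different points and can converge to different weight sets
        weights[i] = (double)rand() / RAND_MAX - 0.5;
    }
}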

For the comments, I'll give you that one; I was extremely rushed when writing the NN as I was busy developing a Dynamic Niche ES at the same time. I'm pretty sure there are some super dumb or redundant comments in there. I comment basically out of habit now, or I'll comment something I was thinking. Another thing I tend to do: before I code, I outline all the sections of the algorithm in comments, i.e.

//load data
//create network
//train
//run validation set
//return results


so I often leave those things in... it's a bad habit, I know...

"In theory, theory and practice are the same. In Practice, they never are."
My Technical Blog : http://www.takinginitiative.net/

Quote:Original post by Hodgman
Quote:Original text from Coldon's tutorial
Now I’ve seen various implementations and wait for it… here comes an OO rant: I don’t understand why people feel the need to encapsulate everything in classes.
...
So below is how I structured my neural network and afaik it’s as efficient as possible. If anyone can further optimize my implementation please do!

That sounds like a challenge ;)
Can we OO-ify your code and retain the speed? I'll get back to you on that...
[edit]
On the topic of improving the speed - I'm sure some use of SIMD and multi-threading would work wonders here ;)


OO can be fast, but as mentioned above, college professors and "software architects" drill in pointless practices so that you end up with code like this:

(BaseFitnessEvaluator)(es.getElements().getFitnessEvaluator()).getFitnessValue().toNumeric();


just to get a fitness value from an element. Sounds ridiculous? Check out CILIB: cilib.sourceforge.net. Most of the professors in my department are shoving OO down people's throats with no thought or reason...

"In theory, theory and practice are the same. In Practice, they never are."
My Technical Blog : http://www.takinginitiative.net/

This topic is closed to new replies.
