Neural Network - Discussion

Started by
102 comments, last by Kylotan 15 years, 8 months ago
Quote:Original post by ID Merlin
I don't have a reference to cite, it may have been an earlier post here, but one of the things that NNs are not good at is dealing with sequences of events.

That is correct. One way around it would be to have some of your inputs map to historical data at discrete time periods (e.g. 1 second ago, 5 seconds ago, 15 seconds ago) or to discrete points on a historical event list (e.g. CurrentEvent - 1, CurrentEvent - 2, etc.)
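A minimal sketch of that lagged-input idea (the feature values and lag windows below are invented for illustration):

```python
import collections

# Hypothetical sketch of feeding a NN lagged snapshots so it can "see"
# history. Feature values and lag windows are invented for illustration.
HISTORY_LAGS = [1, 5, 15]   # seconds ago

class InputBuilder:
    def __init__(self):
        self.history = collections.deque(maxlen=200)  # (timestamp, features)

    def record(self, t, features):
        self.history.append((t, features))

    def build(self, now, current_features):
        # Input vector = current features plus the snapshot nearest each lag.
        inputs = list(current_features)
        for lag in HISTORY_LAGS:
            _, past = min(self.history, key=lambda p: abs(p[0] - (now - lag)))
            inputs.extend(past)
        return inputs

builder = InputBuilder()
for t in range(20):
    builder.record(t, [float(t)])       # one toy feature: just the timestamp
vec = builder.build(19, [19.0])         # current value plus one value per lag
```

The net itself stays a plain fixed-size-input network; all the sequence handling happens in this preprocessing step.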

Quote:I know that NNs are fairly good at learning to discern various distorted characters in a CAPTCHA image, for instance, but that is hardly a good "game", is it?

The reason for this is that NNs are suited to pattern matching. For example, an OCR system will take pixelized data and use each pixel as an input. Depending on which pixels are on and which are off, it can make a reasonable assumption of which letter or number it is. It is saying, in essence, "this kinda looks like this known pattern which I have mapped to output 'A'".
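A toy version of that "kinda looks like a known pattern" matching, using hand-written templates rather than a trained NN (the 3x3 glyph patterns are invented for illustration):

```python
# Compare an input bitmap against stored templates and pick the closest match.
TEMPLATES = {
    "T": (1, 1, 1,
          0, 1, 0,
          0, 1, 0),
    "L": (1, 0, 0,
          1, 0, 0,
          1, 1, 1),
}

def classify(pixels):
    # Score each template by the number of agreeing pixels; highest wins.
    def score(letter):
        return sum(a == b for a, b in zip(pixels, TEMPLATES[letter]))
    return max(TEMPLATES, key=score)

# A distorted "T" with part of its stem missing still matches "T" best.
noisy_t = (1, 1, 1,
           0, 1, 0,
           0, 0, 0)
```

A trained NN does essentially the same job, except the "templates" are distributed across its learned weights and it degrades more gracefully with distortion.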

For game purposes, you are not using pixels but data inputs from the world around you. Health, damage per second, number of allies, number of enemies, proximity of powerups, etc. That's great and all, but I can build the same decision model with weighted sums and/or decision trees... and in a more custom-tailored way.
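That kind of weighted-sum decision model is only a few lines; the feature names and weights below are invented for illustration:

```python
# Hand-tuned weighted sum over normalized world inputs. Unlike NN weights,
# each number here is directly readable and tunable by a designer.
WEIGHTS = {
    "health": 0.3,
    "enemy_count": -0.5,
    "ally_count": 0.2,
    "powerup_proximity": 0.4,
}

def attack_desirability(state):
    # All features assumed pre-normalized to [0, 1].
    return sum(WEIGHTS[k] * state[k] for k in WEIGHTS)

aggressive = {"health": 1.0, "enemy_count": 0.1,
              "ally_count": 0.8, "powerup_proximity": 0.5}
desperate  = {"health": 0.1, "enemy_count": 0.9,
              "ally_count": 0.0, "powerup_proximity": 0.0}
```

The custom-tailoring advantage is exactly that transparency: if the agent attacks when it shouldn't, you know which weight to turn down.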

But the NN can learn on the fly, you say? So can reinforcement learning or dynamic decision trees. And they can do it in a more controlled way.

*shrug*

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

Colin McRae Rally 2 used a neural network to drive the AI cars.

I also believe that Battlecruiser 3000AD used one as well.

And the aforementioned Black & White.

Aside from those three games, I'm not really aware of any others, though that's not to say there aren't any.

CMR2 is, in my mind, the perfect use case for a neural net, though.

OK, some interesting arguments there, which I will be sure to pick up on in my writing.

Forgetting for now that I'm actually using NN's to automate a game agent (which, evidently, is not the best way to solve that particular problem): is it fair to say that NN's are good at what they do, but it's difficult to find an application for them?

André LaMothe of Xtreme Games wrote an article about NN's, and one thing that struck me was this part:

"The key to unlocking any technology is for a person or persons to create a Killer App for it. We all know how DOOM works by now, i.e. by using BSP trees. However, John Carmack didn't invent them, he read about them in a paper written in the 1960's. This paper described BSP technology. John took the next step and realized what BSP trees could be used for, and DOOM was born."

So it took BSP trees over 30 years to go from theory to practice! Don't get me wrong, I'm not expecting to create the next Doom using NN's, but maybe the argument could be that they're not being applied correctly?

Feeling #0000FF
Quote:Original post by sion5
Forgetting now that im actually using NN's to automate a game agent(which evidently is not the best way to solve that particular problem). Is it fair to say that NN's are good at what they do, but its difficult to find an application for them?

No, it's the other way around. It's easy to find applications for NN's - they are, in a sense, the ultimate heuristic - but they are so general that they don't perform specific operations well.

The biggest drawback is one I haven't seen mentioned here, and that is that backprop NN's tend to get stuck in local minima. This is a critical weakness. It can be alleviated with various tricks such as simulated annealing or other hill-climbing algorithms, but the more tricks you pull, the less efficient the learning procedure becomes. The ultimate expression of this is using GA's, which you asked about. Using a GA as the learning algorithm avoids the local minima problem almost completely, but you pay for it by throwing away all structural knowledge you have of the problem, which manifests itself as extremely poor learning rates.
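A toy illustration of the local-minimum problem and the simulated-annealing workaround, on a hand-picked 1D function rather than an actual NN error surface:

```python
import math
import random

def f(x):
    # Toy error surface with two basins: a local minimum near x = +0.96 and a
    # deeper global minimum near x = -1.04 (the 0.3*x term breaks the symmetry).
    return (x * x - 1.0) ** 2 + 0.3 * x

def df(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def gradient_descent(x, lr=0.01, steps=2000):
    # Plain gradient descent: slides into whichever basin it starts in.
    for _ in range(steps):
        x -= lr * df(x)
    return x

def simulated_annealing(x, steps=5000, t0=2.0, seed=0):
    # Random proposals; uphill moves accepted with probability e^(-delta/t),
    # so early on (high temperature) the search can climb out of a basin.
    rng = random.Random(seed)
    best = x
    for i in range(steps):
        t = t0 * (1.0 - i / steps) + 1e-3    # linear cooling schedule
        cand = x + rng.uniform(-0.5, 0.5)
        delta = f(cand) - f(x)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            x = cand
            if f(x) < f(best):
                best = x
    return best

x_gd = gradient_descent(0.9)      # starts in the wrong basin and stays there
x_sa = simulated_annealing(0.9)   # can escape to the deeper basin
```

Backprop behaves like the gradient descent line here, just in a weight space with many more dimensions; the annealing trick buys escape ability at the cost of a lot of wasted evaluations.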

NN's may become useful if we learn how to compensate for their weaknesses without sacrificing their strengths. It has been demonstrated that NN's and other statistical methods benefit greatly from appropriate preprocessing mechanisms, but they still do not approach hand-coded methods for stability, correctness or efficiency. In the end, most statistical methods disregard the actual structure of the problem they are trying to solve, which is why they fail. Classical AI techniques are typically better suited; a combination of the two may be best (but is difficult to achieve; I'm working on that problem myself).

Then there are statistical methods that allow for structured learning as a compromise. I've had some early success (didn't pursue it further) with Bayesian networks implemented with a learning algorithm. The price in runtime performance was heavy (superexponential with respect to the number of nodes, IIRC), but the rate of learning as a function of observed data was relatively impressive.
Quote:So it took BSP 33years to go from theory to practice! Dont get me wrong im not expecting to be creating the next Doom using NN's but maybe the arguement could be that their not being applied correctly?

Could be. It is evident that NN's are very powerful - just look at ourselves. As I mentioned, NN's benefit greatly from proper preprocessing. Preprocessing can be (but usually isn't) done by other NN's. This suggests that a tightly controlled hierarchical structure of NN's can achieve better results than we've seen so far. It makes sense from a theoretical perspective too, though I won't go into that here. But to actually design such a hierarchy properly is not possible at this point in time. Due to the nature of NN's there are no structural design methods to apply, and the solution space is much too large for random exploration to see any real success. I doubt much progress will be made within the next two decades in this particular area, unless a breakthrough of Einsteinian proportions is made.
-------------Please rate this post if it was useful.
Thanks Hnefi, that was a very informative post. I don't understand the latter part of this comment:

Using GA as a learning algorithm avoids the local minima problem almost completely, but you pay for it by throwing away all structural knowledge you have of the problem which manifests itself as extremely poor learning rates.

I understand how it avoids the local minima problem, but what do you mean by saying you do so at the cost of throwing away all the structural knowledge you have of the problem, which manifests itself as extremely poor learning rates?
Feeling #0000FF
Quote:Original post by sion5
Thanks Hnefi, that was a very informative post. I don't understand the latter part of this comment:

Using GA as a learning algorithm avoids the local minima problem almost completely, but you pay for it by throwing away all structural knowledge you have of the problem which manifests itself as extremely poor learning rates.

I understand how it avoids the local minima problem, but what do you mean by saying you do so at the cost of throwing away all the structural knowledge you have of the problem, which manifests itself as extremely poor learning rates?

Structural knowledge is the knowledge you have about how to solve the problem. You can use that to make the search for a solution more efficient. A* in pathfinding is an example of this: you use the knowledge that straight-line distance is an optimistic approximation of the remaining path cost, in combination with the knowledge that an optimistic (admissible) heuristic is permissible for that particular algorithm, to reach a solution efficiently and correctly.
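A compact sketch of that A* example, using Manhattan distance as the optimistic heuristic on a small grid:

```python
import heapq

def astar(start, goal, walls, size=5):
    # Manhattan distance: an optimistic (admissible) estimate of remaining
    # cost - exactly the structural knowledge that makes A* efficient.
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    g = {start: 0}
    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node == goal:
            return cost                     # length of the shortest path
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (node[0] + dx, node[1] + dy)
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size) or nxt in walls:
                continue
            if cost + 1 < g.get(nxt, float("inf")):
                g[nxt] = cost + 1
                heapq.heappush(open_heap, (cost + 1 + h(nxt), cost + 1, nxt))
    return None                             # goal unreachable
```

Replace `h` with a heuristic that can overestimate and the optimality guarantee disappears; that dependence on a problem-specific fact is what a pure GA search gives up.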

Neural nets already throw away a lot of structural information, but not all. In the case of backpropagation, a typical example of this is to assign different learning rates to different parts of the network; often, you let each neuron keep track of a measure of "inertia" that makes the search for the optimum much more efficient. This "inertia" is obtained by measuring how much the neuron needed to adjust its weights in the past and let that affect the learning rate in the present. By doing so, you are using the knowledge that this particular neuron is close to or far from a stable optimum, so it makes sense to keep it more or less stable.
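A sketch of that "inertia" idea as the classic momentum term in the weight update (the learning-rate and momentum values are illustrative):

```python
# Momentum-style backprop weight update: the new step is the plain gradient
# step plus a fraction of the previous step. A weight that has been moving
# steadily in one direction keeps moving; one that has been oscillating
# around an optimum settles down.
def update_weight(w, grad, prev_delta, learning_rate=0.1, momentum=0.9):
    delta = -learning_rate * grad + momentum * prev_delta
    return w + delta, delta

# Demo on a toy error surface f(w) = w^2, whose gradient is 2w.
w, delta = 1.0, 0.0
for _ in range(300):
    w, delta = update_weight(w, 2.0 * w, delta)
# w ends up very close to the optimum at 0
```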

GA cannot do this. You cannot get a meaningful idea of what parts of the genomes should be kept stable, because recombining them may alter this completely, even to the point of inverting the fitness of the entire genome. You can make no guarantees that it is better or worse to swap one part of a genome with another. It's all random.

This is a problem with all statistical methods, but GA is one of the worst offenders. That's why it's so general and also so inefficient.
-------------Please rate this post if it was useful.
Please don't slate me on this, as I have not done enough research into the implementation, but...

For my NN I was going to introduce a GA to establish the synaptic weights. The fitness of each chromosome (a complete set of weights) would be determined by how close the NN's output is to the training data output. The chromosomes with the highest fitness will, in theory, be the weights that give the output we require.

If I haven't explained this well please let me know and I will try and elaborate further.
Feeling #0000FF
Quote:Original post by sion5
Please don't slate me on this, as I have not done enough research into the implementation, but...

For my NN I was going to introduce a GA to establish the synaptic weights. The fitness of each chromosome (a complete set of weights) would be determined by how close the NN's output is to the training data output. The chromosomes with the highest fitness will, in theory, be the weights that give the output we require.

If I haven't explained this well please let me know and I will try and elaborate further.

That is the standard way of doing it, and it works (but inefficiently). What this solution does is make a number of copies of the neural net, evaluate all of them, and then randomly (with a preference for more fit nets) choose nets/chromosomes to "mate". Which genes get switched between two chromosomes is random; you don't know how different parts of the chromosome affect other parts (and by extension, the fitness of the entire chromosome), so you can't make a reliable statement about which parts of the chromosome should remain stable.

It works, just don't expect it to learn anything significant in realtime.
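A toy version of that weights-via-GA setup: the "network" here is a single linear neuron, the chromosome is its weight and bias, and fitness is the negative mean squared error against made-up training data:

```python
import random

rng = random.Random(42)

# Invented training data for the toy target function y = 2x + 1.
XS = [i / 10.0 for i in range(-10, 11)]
YS = [2.0 * x + 1.0 for x in XS]

def predict(genome, x):
    # The "network" is one linear neuron; its two genes are weight and bias.
    w, b = genome
    return w * x + b

def fitness(genome):
    # Higher is better: negative mean squared error on the training data.
    return -sum((predict(genome, x) - y) ** 2 for x, y in zip(XS, YS)) / len(XS)

def evolve(pop_size=50, generations=200, elite=10, sigma=0.2):
    pop = [[rng.uniform(-3, 3), rng.uniform(-3, 3)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:elite]                 # elitism: keep the fittest as-is
        children = []
        while len(children) < pop_size - elite:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(0, 2)           # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0, sigma) for g in child]  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

Note how the crossover point is chosen blindly; that blindness is the "no structural knowledge" problem Hnefi describes, and it is why this takes thousands of network evaluations where backprop would need far fewer.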
-------------Please rate this post if it was useful.
I don't understand why you would want it to learn anything at runtime. You would have trained the NN offline; all you are doing at runtime is using it.
Feeling #0000FF
In that case, I don't really see the point. Why not simply use decision trees instead?
-------------Please rate this post if it was useful.

This topic is closed to new replies.
