Object recognition in games

Started by Insolence
6 comments, last by Gil Grissom 14 years, 10 months ago
I got a fun idea and I want to know how practical it is. http://img362.imageshack.us/img362/6781/hnecra521vr1.jpg

In that image, if you're familiar with D2, you'll see a waypoint, stairs, a stash, a building and a torch, among other things. Is there any reasonable way to identify those objects with unsupervised neural networks, for example? Every example of neural networks I see uses very small inputs and training sets, 16x16 images for example, but I'll be working at 800x600.

My idea is to capture video with FRAPS, just running in circles to grab footage of certain areas, and use that to train a neural network. Hopefully it will pick out defining features without me having to define them. I'd also use video of a similar but different area as "un-training" (I forget the term) to make the network stronger.

Then, once I have many of these areas, I can connect them together and form a whole layout of a small town, with user-defined walkable nodes connecting each area. It's a very basic idea right now, but it sounds like it might be a fun dive into machine vision.
I'm not sure why you want to use an unsupervised neural network specifically. Could you elaborate on that? Also, consider simpler alternatives to neural networks, such as discriminant analysis or logistic regression. They often perform well and are typically easier to implement, both in training and in deployment.

As far as the number of input variables goes, I suppose the notion of "small" is in the eye of the beholder. A 16x16 region represents 256 variables (3 times that if RGB color channels are included separately).

It is nearly always far more effective to derive features from a region of that size than to simply dump in the raw pixel values. Many features have been devised for exactly this sort of problem: statistical summaries of the pixel values (mean, standard deviation, skewness, etc.), gradient measures, neighbor differences, etc. See my article, Pixel Classification Project, to get an overview of a representative color texture classification project.
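To make that concrete, here is a rough Python sketch of the kind of features I mean, assuming each region arrives as a grayscale NumPy array (the particular statistics are illustrative, not a fixed recipe):

import numpy as np
from scipy import ndimage
from scipy.stats import skew

def patch_features(patch):
    """Summarize a 2-D grayscale patch with a handful of statistics."""
    px = patch.astype(float)

    # Brightness statistics: average level, spread, and asymmetry.
    mean, std, skewness = px.mean(), px.std(), skew(px.ravel())

    # Gradient measure: average edge strength from Sobel filters.
    gx, gy = ndimage.sobel(px, axis=1), ndimage.sobel(px, axis=0)
    grad_mean = np.hypot(gx, gy).mean()

    # Neighbor differences: mean absolute difference between adjacent pixels.
    diff_h = np.abs(np.diff(px, axis=1)).mean()
    diff_v = np.abs(np.diff(px, axis=0)).mean()

    return np.array([mean, std, skewness, grad_mean, diff_h, diff_v])

Applied to a 16x16 cell, this turns 256 raw pixel values into six descriptive ones; the same features can be computed per color channel if color matters.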

Deciding which input variables to keep can be handled by a number of systematic approaches: forward and backward selection are popular in the statistical field, but I have found genetic algorithms to be very effective.
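As a rough illustration of forward selection, assuming a feature matrix X with one row per training region and a label vector y (scikit-learn's logistic regression and cross-validation here just stand in for whatever model you actually train):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def forward_select(X, y, max_features=10):
    chosen, remaining, best_score = [], list(range(X.shape[1])), -np.inf
    while remaining and len(chosen) < max_features:
        # Score each candidate feature when added to the current set.
        scores = {f: cross_val_score(LogisticRegression(max_iter=1000),
                                     X[:, chosen + [f]], y, cv=5).mean()
                  for f in remaining}
        best_f = max(scores, key=scores.get)
        if scores[best_f] <= best_score:
            break  # no remaining feature improves the model; stop
        best_score = scores[best_f]
        chosen.append(best_f)
        remaining.remove(best_f)
    return chosen, best_score

Backward selection runs the same loop in reverse (start with everything, drop the least useful feature each pass), and a genetic algorithm simply searches over subsets more aggressively.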

Using this general approach, you should be able to handle areas considerably larger than 16x16 pixels.


Good luck,
Will Dwinnell
Data Mining in MATLAB
Quote:I'm not sure why you want to use an unsupervised neural network specifically. Could you elaborate on that?
I'm new to this neural network stuff, but all my research has said unsupervised training will lead to "discovering underlying patterns" that I may not see. So I was hoping if I ran that, it would find some patterns that I never noticed, because right now I don't notice any.

Thank you very much for your reply, I'll dig into all this stuff and post an update when I've decided what's next :)

EDIT #1
It seems the pictures in that blog post are gone now; I'd really like to see them!

EDIT #2
After reading your post, it seems like your approach is mostly color based (green leaves or white skin). You gave me some ideas, but I don't have enough knowledge/experience to apply them.

Using some of the metrics you mentioned (RGB color, hue, saturation, edges), could I really pick out objects? I understand how I could pick out leaves/skin in an image, but objects in D2 vary so widely. Mainly I just want to pick up landmarks in certain areas so I know where I am.

I think I'm on the wrong track. I just need a system that finds landmarks on the screen so it can steer the character around a town. There are torches, barrels, NPCs, etc. everywhere; I just need to isolate them from the background and record them.

[Edited by - Insolence on June 8, 2009 8:37:06 PM]
Quote:Original post by Insolence
Quote:I'm not sure why you want to use an unsupervised neural network specifically. Could you elaborate on that?
I'm new to this neural network stuff, but all my research has said unsupervised training will lead to "discovering underlying patterns" that I may not see. So I was hoping if I ran that, it would find some patterns that I never noticed, because right now I don't notice any.


To clarify: "unsupervised learning" refers to learning from a collection of attributes without reference to a "right answer". In contrast, "supervised learning" refers to learning to associate an appropriate response (an estimate or classification, for instance) with a set of attributes.

In your case, it is possible to provide the machine learning process with the correct class for the image region being viewed. You'd probably do best with some supervised learning technique: some neural networks work this way, or you could consider any of a number of alternatives, such as discriminant analysis, decision tree induction, naive Bayes, SVMs, etc.


-Will Dwinnell
Data Mining in MATLAB
From all my research, it looks like I need to use SURF or SIFT to pick out some pre-captured reference images. Not exactly what I wanted, but if I can implement it I'll certainly be happy :)
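For reference, here's roughly what I'm picturing with SIFT, assuming a recent OpenCV build and a saved reference image for each landmark (the file names and the ratio threshold are just placeholders):

import cv2

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher()

def count_landmark_matches(screenshot_path, landmark_path, ratio=0.75):
    scene = cv2.imread(screenshot_path, cv2.IMREAD_GRAYSCALE)
    landmark = cv2.imread(landmark_path, cv2.IMREAD_GRAYSCALE)

    _, des_scene = sift.detectAndCompute(scene, None)
    _, des_mark = sift.detectAndCompute(landmark, None)

    # Lowe's ratio test: keep matches clearly better than their runner-up.
    matches = matcher.knnMatch(des_mark, des_scene, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    return len(good)  # a high count suggests the landmark is on screen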
Quote:Original post by Insolence
After reading your post, it seems like your approach is mostly color based (green leaves or white skin). You gave me some ideas, but I don't have enough knowledge/experience to apply them.

Using some of the metrics you mentioned (RGB color, hue, saturation, edges), could I really pick out objects? I understand how I could pick out leaves/skin in an image, but objects in D2 vary so widely. Mainly I just want to pick up landmarks in certain areas so I know where I am.


How well this works will naturally depend on a number of factors. The fewer the classes of objects to be identified, the better. The more the classes differ from one another in appearance, the better. I think this idea is worth a shot, and I imagine that dividing the image into regions (distinct rectangular cells of perhaps 50 to 500 pixels) to be classified separately might work.

The basic process will be something like this:

1. Acquire training data from images: divide each image into cells, extract pixel data from same and assign to a known class (including "other" or "background")
2. Extract meaningful features from training examples (brightness, color, texture functions)
3. Train the model (neural network, discriminant, etc.) using a fraction of the data
4. Test the model on the remaining data

A key issue here is the creative development of features to use (step 2).
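A minimal sketch of steps 1 through 4, assuming 'cells' is a list of (patch, label) pairs you have labeled by hand, 'patch_features' is a feature extractor like the one sketched earlier, and scikit-learn's discriminant analysis stands in for whichever model you choose:

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

def train_cell_classifier(cells):
    # Step 2: turn each labeled cell into a feature vector.
    X = np.array([patch_features(patch) for patch, label in cells])
    y = np.array([label for patch, label in cells])

    # Steps 3 and 4: train on a fraction of the data, test on the rest.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
    model = LinearDiscriminantAnalysis().fit(X_train, y_train)
    print("held-out accuracy:", model.score(X_test, y_test))
    return model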


-Will Dwinnell
Data Mining in MATLAB
You would probably get results by doing a lot of preprocessing on the image and simplifying the resolution fed to the neural net. If that picture is typical, color combinations and the percentage of same/similar color coverage on each game grid 'tile' could be the major identifying factors. If you can align the inputs to the tile grid for sampling (which should be doable in a game like that), you can process each tile independently through a tile-recognition NN and then feed all of those results into a terrain-pattern NN (to guide movement, etc.).
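A rough sketch of that per-tile color idea in Python, assuming the frame is already aligned to the game's tile grid and each tile arrives as an (H, W, 3) array (the quantization level is arbitrary):

import numpy as np

def tile_color_features(tile_rgb, levels=4):
    # Quantize each channel to a few levels so similar colors fall together.
    quantized = (tile_rgb // (256 // levels)).reshape(-1, 3)
    colors, counts = np.unique(quantized, axis=0, return_counts=True)

    # Feature vector: the dominant quantized color and how much of the tile it covers.
    coverage = counts / counts.sum()
    dominant = np.argmax(coverage)
    return np.concatenate([colors[dominant].astype(float), [coverage[dominant]]])

Each tile then becomes a handful of numbers for the tile-recognition NN instead of thousands of raw pixels.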
Ratings are Opinion, not Fact
I'm wondering what the application of this would be. A computer player would normally get object identities directly from the game engine, so there is no need for this kind of recognition.

Regarding specific implementation techniques, here are a few points:
* When you say you'll be using 800x600 images, I assume that means the entire screen. That is not what you want to feed the neural network. Instead, cut out a small window of the image (e.g. 16x16 or something similar) and feed that to the network. The network will then say something like "empty", "building", "torch", etc. Repeat that for every window position in the image and you get a set of detected objects (a rough sliding-window sketch follows these points).
* You don't need to implement SIFT, there are implementations available (e.g. http://www.vlfeat.org/~vedaldi/code/siftpp.html).
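Here is a rough sketch of that window-scanning idea, assuming a trained 'model' and a 'patch_features' extractor like the ones discussed above (the 16-pixel window and step size are arbitrary):

import numpy as np

def detect_objects(frame, model, win=16, step=16):
    # frame: 2-D grayscale array of the whole screen; returns (x, y, label) hits.
    hits = []
    for y in range(0, frame.shape[0] - win + 1, step):
        for x in range(0, frame.shape[1] - win + 1, step):
            patch = frame[y:y + win, x:x + win]
            label = model.predict([patch_features(patch)])[0]
            if label != "background":
                hits.append((x, y, label))
    return hits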

