choffstein

Object Recognition


I have been trying to solve an issue of object recognition within my puny, naive brain for a few days now. My goal is to be able to identify 20 or so different objects within images -- and once the object is identified, identify a more specific object in the subgroup based on coloration, decals, et cetera. The majority of these objects are 'similar,' but have certain key features which distinguish them. They exist in a 3d environment (projected 2d), and therefore can be rotated in all directions as well as scaled and obscured (an object may exist in front of them).

Basically, I think I should start by breaking down what I think 'identifies' an object: color, shape, texture, and 'key features'. These are fairly self-explanatory -- color identifies an object's color. Shape defines the general outline of an object. Texture basically takes away the color and gray scales the object to find contrasts to identify major textural differences. Finally, 'key identifiers/features' like decals are used to sub-identify certain objects.

But then things seem to get really inefficient when it comes to teaching the system. Because these objects can occur at almost any orientation, it seems almost impossible to identify objects with any real efficiency -- you would need a chain of images taken at all orientations and then use some sort of similarity scale to get a 'best' fit. If you have several hundred objects, each taken at several hundred orientations -- well, the system doesn't scale too well.

Now, ignoring the above, my plan was basically for each orientation to create a basic 1000x1000 canvas, and continually feed the system images of an object at different scales (but the same orientation). I would scale images until the object was 1000x1000, then overlay the new image on the old canvas, using averaging algorithms to find similarities. 'Similar' areas would create a darker pixel, while dissimilar areas would lighten the pixels. At the end of the day, you would have a very blurry gray-scale identifier for the object.

Texture would be a similar process, but use color contrast techniques to pull textures out more strongly. Color would simply be identified by a chain of 'similar colors' identified by the same keyword. When it came to searching by color, it would be a 'closest match' sort of deal. Key Features is a bit more complex, but I figured I could write an algorithm to identify areas of unique dissimilarity -- different colors than the primary ones used on the object. These would be saved individually.

But then the whole thing sort of falls apart when it comes to the fact that the objects can be obscured and oriented in any direction. It seems like I would need to feed the system, quite literally, millions of photos.

Also, how do I identify an object within an image? For example, the object could be located anywhere within an image of any size -- at any scale and orientation. Would I have to check absolutely every pixel location?

I can't seem to make the mental leap from ASCII recognition to 3d object recognition... My system doesn't seem realistic for facial recognition technology, which would be far more complex than what I would want. I want to be able to identify things like backpacks and chairs -- all with pretty unique signatures.

Any ideas? I put this under AI because I believe learning techniques will definitely be needed. Thanks!
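
For reference, the two mechanical pieces of the plan above (averaging samples into one gray-scale identifier, then checking every pixel location) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: images are tiny gray-scale grids stored as nested lists with values in [0, 1], all samples already share one size and orientation, and the function names `average_template` and `best_match` are hypothetical. The score used here (mean absolute difference) is one simple choice among many, not something the post specifies.

```python
from typing import List, Tuple

Image = List[List[float]]  # gray-scale, row-major, values assumed in [0, 1]


def average_template(samples: List[Image]) -> Image:
    """Pixel-wise mean of equally sized gray-scale samples."""
    rows, cols = len(samples[0]), len(samples[0][0])
    return [
        [sum(s[r][c] for s in samples) / len(samples) for c in range(cols)]
        for r in range(rows)
    ]


def best_match(image: Image, template: Image) -> Tuple[int, int, float]:
    """Try the template at every position; return (row, col, score).

    The score is the mean absolute difference, so lower means more similar.
    """
    th, tw = len(template), len(template[0])
    best = (0, 0, float("inf"))
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            diff = sum(
                abs(image[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            score = diff / (th * tw)
            if score < best[2]:
                best = (r, c, score)
    return best


# Toy data (made up for illustration): two 2x2 samples of the same object.
samples = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.8, 0.2], [0.2, 0.8]],
]
template = average_template(samples)

# A 3x4 scene with the object's pattern placed at row 1, column 1.
scene = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.0],
]
row, col, score = best_match(scene, template)
print(row, col, score)
```

The search visits every position once per template, so its cost grows with image area times template area, and it has to be repeated for every scale and orientation that should be recognized. That multiplication is exactly the scaling concern raised in the post.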

I knew it wouldn't be as easy as I wanted. Thank you sir! I will be sure to check it out. All other opinions are welcome!

The key term you're looking for in your literature search is image understanding. Many computer vision resources spend too much time worrying about image capture and image processing and little on image understanding. Object classification from visual images falls into the last topic.

Here's something you'll want to keep an eye on: http://www.spectrum.ieee.org/apr07/4982

Although I don't think it's able to do what _you_ want it to yet (and I doubt anything else can come close), it probably won't be too long before it is able to do it, and more besides.

Hawkins also published a book in 2004 entitled "On Intelligence", which described the theory in detail.

It's a very exciting trail that Numenta's blazing.

[Edited by - Kring on March 31, 2007 4:53:39 PM]

For some reason that article is no longer available, so here's another one that's almost as good: http://www.wired.com/wired/archive/15.03/hawkins.html

Hopefully this one doesn't disappear.

[Edited by - Kring on April 1, 2007 11:07:28 PM]

I definitely had the feeling of "heard it all before" when I read that article about Hawkins' company and research. Not just the hype, but the technology as well. It's basically a hierarchical mixture of pattern classifiers with bidirectional weight updates. Each node essentially learns a Markov field model related to the set of 4x4 patterns it has been trained on, with dependency on the nodes above and below it in the hierarchy. The contribution of Hawkins will not be the technology, but rather an efficient implementation for training and use.

Hawkins' statement that the cortex implements "hierarchical temporal memory" is pure conjecture. His claim that the proof of this conjecture is in the doing is logically absurd. It amounts to saying: I solved problem X using method A; that system solves problem X as well, hence it must also use method A.

Anyway, if he makes more millions from this, it only reiterates that most of the world doesn't understand science and technology, but is only too willing to consume it when it's sold with the right packaging!

I'm quite new to all this HTM stuff myself and my computer doesn't have enough RAM to handle their Research Release, but from what I've read so far it looks quite promising.
Only time will tell how it turns out, I suppose.

I've skimmed through a few papers on object classification in my research, although I'm afraid I'm not that expert in the processes involved for your particular problem. My understanding is that most approaches in object classification use some form of feature detector algorithm to form an abstract representation of the image; the most common one in use today would be SIFT, the scale-invariant feature transform. In training, you would present the object on its own and build a database of its SIFT descriptors. Identification of the object would occur if you could find a minimum number of the descriptors in an acceptable configuration (i.e. similar to the object); this would help with occlusions and error, as you don't need to match the entire set. I'm not sure how they deal with multiple orientations; I assume they just build a database of the object in a series of orientations.
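
The matching step described above (accept the object once a minimum number of its stored descriptors are found) might look roughly like the sketch below. It assumes the descriptors have already been extracted by some detector such as SIFT and are plain lists of floats. The function names, the toy 3-dimensional vectors (standard SIFT descriptors have 128 dimensions), the `ratio` of 0.75, and the `min_matches` of 3 are all illustrative assumptions. The nearest-neighbour ratio test is a commonly used acceptance rule for SIFT matches and is swapped in here because the post does not name one. The check that matches sit in an "acceptable configuration" is deliberately left out.

```python
import math
from typing import List, Sequence

Descriptor = Sequence[float]


def count_matches(
    query: List[Descriptor],
    database: List[Descriptor],
    ratio: float = 0.75,
) -> int:
    """Count query descriptors that have a distinctive nearest neighbour.

    A query descriptor counts as matched when its closest database
    descriptor is clearly closer than the second closest (ratio test).
    """
    matches = 0
    for q in query:
        dists = sorted(math.dist(q, d) for d in database)
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            matches += 1
    return matches


def is_object_present(
    query: List[Descriptor],
    database: List[Descriptor],
    min_matches: int = 3,
    ratio: float = 0.75,
) -> bool:
    """Declare the object found once enough descriptors match."""
    return count_matches(query, database, ratio) >= min_matches


# Toy database for one object (made-up values, 3-D for readability).
database = [
    [0.0, 0.0, 0.0],
    [10.0, 0.0, 0.0],
    [0.0, 10.0, 0.0],
    [0.0, 0.0, 10.0],
    [10.0, 10.0, 10.0],
]

# A partly occluded view: three noisy object descriptors plus one from clutter.
query = [
    [0.1, 0.0, 0.0],
    [10.0, 0.2, 0.0],
    [0.0, 9.9, 0.0],
    [5.0, 5.0, 5.0],  # equally far from every database entry, so rejected
]

print(count_matches(query, database))
print(is_object_present(query, database))
```

Because only `min_matches` descriptors need to agree, two of the five stored descriptors can be hidden without the detection failing, which is the occlusion tolerance the post mentions.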

It's an active area of computer vision research, so there are plenty of papers being published in this area if you want to look at what the academics are doing. The main conferences in computer vision are ICCV (International Conference on Computer Vision), CVPR (Conference on Computer Vision and Pattern Recognition; traditionally a bit more practically oriented than some of the others), ECCV (European Conference on Computer Vision), as well as ACCV (Asian Conference on Computer Vision) and ICPR (International Conference on Pattern Recognition).
