
Archived

This topic is now archived and is closed to further replies.

CrunchDown

Robotic Imagery


Recommended Posts

Does anybody know of a program, or library of some sort, that lets me scan an image coming from a web cam in real time? Is there some way to interface with the TWAIN driver so the frames land directly in memory for analysis by my imagery program? I'm developing a program which, hopefully, can convert the 2d image of a 3d scene into 3d objects in the computer's brain (3d world), using motion for navigation by the robot.

quote:
Original post by strider44
just a comment, you'll need two cameras which are split by a known offset to get any sort of 3d derivation.


Not true. You can simply move one camera between frames to achieve the same offset. Even then, though, you'll just get a rough-looking heightmap from this method.
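The geometry is the same either way: moving one camera by a known baseline between two frames gives two offset views, and depth follows from the pixel disparity of a matched feature. A minimal numeric sketch (the focal length, baseline, and disparity here are made-up example values, not from any real camera):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulate depth (metres) from the pixel shift of a feature
    between two views separated by a known baseline: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Example: 800 px focal length, camera moved 0.1 m, feature shifted 20 px
print(depth_from_disparity(800, 0.1, 20))  # 4.0 (metres)
```

The hard part in practice is not this formula but finding the same feature reliably in both frames, which is where the "bad-looking heightmap" comes from.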

Assuming you're using Windows, you can use DirectShow to capture from the webcam, but it's a real pain to do.
You could also take a look at the ARToolKit source code. Grab the version 2.52 downloads that use DirectShow and Video for Windows. There's also a version for Linux.

Enigma

Yes, though I respect everyone's ideas and give them thought, to me it is out of the question to have two cameras for the job.

The human brain doesn't need two eyes; try it for yourself. Because of this, I don't believe any computer imaging system should require two offset cameras, because if it does, it uses a different method than our mind does.

The philosophy behind my system is this example (and I can elaborate hugely if you guys are interested): if there's a ZIP file on your computer, the zip file does not jump up and get running when you click on it; rather, Windows compares the extension with all known file extensions. It compares only what it knows to what it sees. Likewise, my imaging tech will only be able to distinguish what it's looking at if it has a predetermined model to compare against. The rest is just filling holes.
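The comparison step in that analogy can be sketched as nearest-template matching: the system stores known models and picks whichever is closest to what it observes. A toy version (the 4-element "signatures" and the model names are invented purely for illustration), using a sum-of-squared-differences distance:

```python
def ssd(a, b):
    """Sum of squared differences between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(observed, model_library):
    """Return the stored model whose signature best matches the observation."""
    return min(model_library, key=lambda name: ssd(observed, model_library[name]))

# Hypothetical model library: each known object has a stored signature
models = {"ball": [0, 1, 1, 0], "box": [1, 1, 1, 1]}
print(classify([0.1, 0.9, 1.0, 0.0], models))  # ball
```

Real systems compare richer descriptors than four numbers, but the principle is the same: match the observation against what is already known.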

The short and long of it is, computers aren't human and they cannot simply look at an object and make sense of it. They have to try to fit their own idea of what it is, and what the computer thinks it should be, onto it. This idea works its way down through the system which I'm abstracting on right now, and the way I work it, we should have already established a lot of the technology to do this. In fact, all the techniques can be built right on top of a lot of the functions available in OpenGL. (I'm not a DX fan; I've only learned GL and it seems to do the job.)

The goal of my research is to build a cheap, fast, independent program that needs no more than one camera to understand its surroundings and, with a glance and a little process comparison, can construct in its memory a 3d world which closely reflects the real world around it.

This kind of technology would enable us to build the robots that we have always wanted. If you guys want to discuss this, I'm all up for it. I just hope we're in the right forum!

Guest Anonymous Poster
Oh, and did I forget to say that I think the system should be able to look at any scene we are exposed to and resolve it? Not just a dinky little red-ball finder; no, this vision system would be told to find a path of adequate size in front of it, or a machine or object of a certain size.

I'm working on some concept photos of the imaging process right now, which I hope will help project my ideas better, so everybody can get a clear picture (I visualize everything) and challenge me all the better.

With that said, rip my little theory apart.

Have a look at DirectShow; that should do you for the image capture end of things. It's not the fastest, but it works under Windows. Under Linux you can use Video4Linux.

The image processing end of things has been done before, so my suggestion is to use a third-party library like OpenCV to give you the best shot at getting finished in a reasonable time. It comes with code and samples that deal with stereo vision, and some sophisticated tracking systems (Kalman and HMM).
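For a flavour of what the Kalman tracking in such libraries does, here is a one-dimensional predict/update cycle written from scratch (the noise constants q and r and the measurement values are arbitrary illustration numbers, not OpenCV's API):

```python
def kalman_step(x, p, z, q=1e-3, r=0.1):
    """One 1-D Kalman cycle: state estimate x, variance p, measurement z."""
    p = p + q                # predict: uncertainty grows by process noise q
    k = p / (p + r)          # Kalman gain: how much to trust the measurement
    x = x + k * (z - x)      # update the estimate toward the measurement
    p = (1 - k) * p          # updated uncertainty shrinks
    return x, p

x, p = 0.0, 1.0              # start with a poor guess and high uncertainty
for z in [1.2, 0.9, 1.1, 1.0]:   # noisy readings of a value near 1.0
    x, p = kalman_step(x, p, z)
print(round(x, 2))           # estimate has converged toward 1.0
```

A real tracker filters a multi-dimensional state (position and velocity of a feature across frames), but the predict/gain/update structure is the same.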

You might want to look into some of the papers around from Robocup (the robot soccer competition). They get the sony aibo dogs to do some really amazing 3d localisation with very minimal resources.

[edit]typo[/edit]

[edited by - XXX_Andrew_XXX on February 10, 2004 11:40:23 PM]

Guest Anonymous Poster
Thanks. I'll be sure to check that out.

No, you are incorrect when you say our human brain can do it with only one eye. You already have memories stored up in your mind of how objects look in three dimensions, and you use that data to make sense of what you see; you don't even decode it, because you really can't tell, you just make sense of it. I had a friend who only had one eye, and believe me... sometimes he couldn't tell if a moth was fluttering 3 feet away or a bird was flying 3 dozen feet away. That is just a simple example... let's not even get into the details of all the objects, shapes, colors, light intensities, and so on that might be present in a scene for a camera to decode.

