Started by May 02 2006 06:19 AM

12 replies to this topic

Posted 02 May 2006 - 06:19 AM

Hello everybody,
Recently a good friend of mine asked me about quite an interesting problem. I had no clue about the solution, so I decided to ask somebody who knows more about this :)
His question was whether it is possible to somehow reconstruct a 3D model (points only) of a human face from a picture or a series of pictures (each frame shot from a different angle). The pictures would be normal 32-bit pictures in high resolution. By the way, would a different way of capturing help to solve this problem?
Does anybody have any idea whether this can be solved somehow? If yes, how?
Thanks a lot for your answers

Posted 02 May 2006 - 06:22 AM

Yes. It is a very difficult problem, one requiring immense amounts of math. Here is the book about how to do it. (Here's another one.)

Posted 02 May 2006 - 06:31 AM

Quote:

Original post by Sneftel

Yes. It is a very difficult problem, one requiring immense amounts of math. Here is the book about how to do it. (Here's another one.)

How difficult this is really depends on what your assumptions are. I did this in college during a one-semester course in computer vision. If you want, you could create a 3D model, take two shots of it, and try to reconstruct it. It would teach you the basics and give you a very controllable sandbox. I wouldn't start with two real-world pictures...

Posted 02 May 2006 - 07:33 PM

There are essentially two competing methods for solving this problem.

The first is to have an *a priori* parametric model of the thing you expect to see in the image and then to find the best set of parameters for the model that explains the data you see. This method is known by a variety of names, but if you look for 'image understanding', you'll find information on this method.

The second method is to infer the 3D position of each point on the surface of the object in the 2D image from the sequence of images and then construct a 2D surface embedded in 3-space from those inferred points. This is the method of surface reconstruction.

The results of the latter are generally less optimal, since they are sensitive to noise in the 2D image. However, the latter method makes no assumption as to the underlying object structure in the image, thus making it more widely applicable.
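
To make the first method a bit more concrete, here is a minimal sketch of fitting a parametric model to image data. Everything in it is an assumption made for illustration: the model is a hypothetical linear shape model (a mean shape plus a weighted sum of basis shapes, both random here), the camera is a frontal orthographic projection, and the "image data" are noisy 2D landmark positions. A real face model and camera would be considerably more involved.

```python
import numpy as np

rng = np.random.default_rng(1)
n_points, n_params = 30, 4

# Hypothetical a priori model: random stand-ins for a learned mean and basis.
mean_shape = rng.normal(size=(n_points, 3))
basis = rng.normal(size=(n_params, n_points, 3))

def model_points(params):
    """3D landmark positions generated by the model for given parameters."""
    return mean_shape + np.tensordot(params, basis, axes=1)

def project(points3d):
    """Assumed camera: orthographic, frontal view (simply drop the depth)."""
    return points3d[:, :2]

def fit_params(observed2d):
    """Least-squares parameters that best explain the observed 2D landmarks."""
    A = basis[:, :, :2].reshape(n_params, -1).T    # (2 * n_points, n_params)
    b = (observed2d - project(mean_shape)).ravel() # (2 * n_points,)
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params

# Simulate an observation from known parameters plus a little image noise.
true_params = np.array([0.5, -1.0, 0.25, 2.0])
noise = rng.normal(scale=0.01, size=(n_points, 2))
observed = project(model_points(true_params)) + noise

estimated = fit_params(observed)
reconstructed = model_points(estimated)  # full 3D shape, depth included
```

Because the model ties depth to the same parameters as the visible 2D positions, the fit returns a full 3D shape even from a single view, and `estimated` is the low-dimensional parameter vector in which classification could be done.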

The actual method you choose would largely depend on what you want to do with the reconstructed 3D object. If you were, for example, trying to do facial recognition, then the first method would give better results than the second since you can perform the classification in the model parameter space (of course, there are good ways of doing facial classification using just the 2D image). If you simply wanted to create a 3D 'head' of someone's face as seen in a camera, so you could display it on a computer screen, then the latter method would be sufficient.

There is a huge amount of literature out there on this problem so you shouldn't be left in the dark if you choose to follow up on the solution methods.

Cheers,

Timkin

Posted 03 May 2006 - 04:49 AM

I've a friend who is specifically working in this domain. The results of his team are pretty amazing. I'll try to get in touch with him and show him this thread.

Posted 03 May 2006 - 07:46 AM

I work and do research in this domain; it's quite interesting :)

From one picture, the only serious class of method I can think of is called "Shape from shading". You use the variation in shadows on the face to compute the shape...

For two or more images, if you know the exact change in camera position and orientation between your shots, you can use a simple technique generally called "stereo". Basically, you find a per-pixel correspondence between a pair of images. Knowing the camera motion parameters, the magnitude of each pixel's displacement (its disparity) will be inversely proportional to the distance of that point from the camera.

Of course, this supposes that the images are taken simultaneously. If they are not, even the smallest motion of the person will screw up your result. You also need high precision and image resolution.
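
As a rough sketch of the stereo relation, assuming a rectified pair (parallel optical axes) with a known focal length in pixels and a known baseline between the cameras: the usual pinhole relation is Z = f * B / d, so a larger pixel displacement d means a closer point. The function name and the numbers below are made up for illustration.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair.

    Pixels with zero or negative disparity have no valid match
    and are reported as infinitely far away.
    """
    d = np.asarray(disparity_px, dtype=float)
    depth = np.full(d.shape, np.inf)
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth

# Hypothetical setup: 800 px focal length, cameras 10 cm apart.
disparities = np.array([40.0, 20.0, 10.0, 0.0])
depths = depth_from_disparity(disparities, focal_px=800.0, baseline_m=0.10)
# Halving the disparity doubles the estimated depth.
```

If the baseline is unknown, the same formula still gives depth up to a constant scale factor.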

If you generalize this class of techniques, it is called "shape from motion" (also commonly known as "structure from motion").

You can PM me if you want the equations...

Posted 05 May 2006 - 03:19 AM

Check out my website, under Projects. I have a few screen shots from an application I wrote which extracts 3D information from image pairs.

http://www.nentari.com/stereo_image_processing.htm

and again at

http://www.nentari.com/3d_shadow_scanner.htm

Both pages are a work in progress, but I promise I'll update them with more information one of these days. :)

Essentially, what Steadtler said, only I've never done anything quite so precise as knowing the camera's position. :)

Will

------------------http://www.nentari.com

Posted 05 May 2006 - 03:50 AM

Nice work, Rpgeezus.

I see that for stereo you use cameras at different angles; the problem is MUCH simpler if you use cameras with parallel optical axes. Camera rotation yields no information about depth anyway.

edit: knowing the camera relative positions is not so important... you can still get results up to a constant factor (as you have done, I presume)

About that shape from shadow on your site (Bouguet and Perona), that's old stuff. If you want something much more evolved, check out moiré patterns. Instead of projecting a band of shadow, you can project one or several grid patterns on the object. I am currently measuring objects with precision in the micron range... that's 1/1000 of a millimeter :P

Posted 05 May 2006 - 02:37 PM

You can find a lot of information about 3D reconstruction from 2D images on the robotvis project website :

http://www-sop.inria.fr/robotvis/

Unfortunately, the project has ended, but there is still a lot of valuable information on their pages.

As for what you're specifically looking for (3D reconstruction of human faces), have a look at this page:

http://www-sop.inria.fr/robotvis/demo/diffprop/

More recent work on this subject is available here:

http://www-rocq.inria.fr/~gouet/Recherche/These/recherche_anglais.html

Posted 12 May 2006 - 08:07 PM

I'm working in a related area, augmented reality: 3D registration of markers (though there are also markerless registration methods).

What you are looking for is a very common task in 3D registration. It's not very difficult mathematically: it requires some stuff from linear algebra, namely understanding eigenvalues and eigenvectors, and no more.

http://en.wikipedia.org/wiki/Eigenvalue

http://homepages.inf.ed.ac.uk/cgi/rbf/CVONLINE/entries.pl?TAG54

However, the process itself is not quite simple.

3D reconstruction from multiple (usually two) pictures usually goes like this:

Identify feature points (points of interest) in the pictures, and find the corresponding points in both pictures.

http://homepages.inf.ed.ac.uk/rbf/CVonline/feature.htm

Build the fundamental matrix

http://homepages.inf.ed.ac.uk/cgi/rbf/CVONLINE/entries.pl?TAG82

Solve the epipolar constraint

http://homepages.inf.ed.ac.uk/cgi/rbf/CVONLINE/entries.pl?TAG91

with singular value decomposition, throwing away the zero singular values.

http://homepages.inf.ed.ac.uk/cgi/rbf/CVONLINE/entries.pl?TAG61

and you have your 3D reconstruction: a complete model, up to some error.

Those are the basic steps. You can google each term; there are a lot of articles on the web.

There are also other, more arcane methods, but this is kind of the standard method.
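
The middle steps above can be sketched with the normalized eight-point algorithm, which is one common way (not necessarily the one meant in this post) to estimate the fundamental matrix from point correspondences using SVD. The sketch assumes the correspondences are already given, so feature detection and matching are skipped, and it tests itself on synthetic points seen by two made-up cameras. Function names and camera parameters are illustrative only. Note that it stops at the fundamental matrix; getting actual 3D points still requires camera matrices and triangulation.

```python
import numpy as np

def normalize(pts):
    """Centre the points and scale their mean distance from the origin to sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - centroid, axis=1).mean()
    T = np.array([[scale, 0, -scale * centroid[0]],
                  [0, scale, -scale * centroid[1]],
                  [0, 0, 1]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def fundamental_matrix(pts1, pts2):
    """Estimate F such that x2^T F x1 = 0, from N >= 8 matches (N x 2 arrays)."""
    x1, T1 = normalize(pts1)
    x2, T2 = normalize(pts2)
    # Each match gives one linear equation in the 9 entries of F (row-major).
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(len(x1)),
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value of F.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0
    F = U @ np.diag(S) @ Vt
    F = T2.T @ F @ T1  # undo the normalization
    return F / np.linalg.norm(F)

def project(P, X):
    """Project N x 3 world points with a 3 x 4 camera matrix to N x 2 pixels."""
    x = (P @ np.column_stack([X, np.ones(len(X))]).T).T
    return x[:, :2] / x[:, 2:]

# Synthetic test scene: 20 random points in front of two hypothetical cameras.
rng = np.random.default_rng(0)
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1.0]])
a = 0.1  # second camera rotated 0.1 rad about the vertical axis
R = np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])
t = np.array([-0.5, 0.0, 0.1])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t[:, None]])
pts1, pts2 = project(P1, X), project(P2, X)

F = fundamental_matrix(pts1, pts2)
h1 = np.column_stack([pts1, np.ones(len(pts1))])
h2 = np.column_stack([pts2, np.ones(len(pts2))])
residuals = np.abs(np.sum(h2 * (F @ h1.T).T, axis=1))  # |x2^T F x1| per match
```

With noise-free synthetic matches the residuals should be close to zero; with real, noisy matches you would normally wrap the estimate in a robust scheme such as RANSAC.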

Posted 13 May 2006 - 02:11 PM

Lol, what a coincidence. One of my new friends actually did a project on this. He used Tensor Algebra and Differential Geometry to create meshes based on images of faces. It is correct that his project requires an immense amount of mathematics, and he is extremely brilliant in mathematics. He won big at the Intel International Science and Engineering Fair recently. He has a paper on what he did... maybe it will be of interest to you? I'll try to find it.

Posted 13 May 2006 - 06:12 PM

Quote:

Original post by Sagar_Indurkhya

Lol, what a coincidence. One of my new friends actually did a project on this. He used Tensor Algebra and Differential Geometry to create meshes based on images of faces. It is correct that his project requires an immense amount of mathematics, and he is extremely brilliant in mathematics. He won big at the Intel International Science and Engineering Fair recently. He has a paper on what he did... maybe it will be of interest to you? I'll try to find it.

Hey, I'm interested too :)

Posted 14 May 2006 - 02:50 PM

Quote:

Original post by Sagar_Indurkhya

He used Tensor Algebra and Differential Geometry to create meshes based on images of faces.

I've got a paper (I forgot who authored it, though) on this type of approach applied to brain models and the MRI registration problem. I could hunt for it if anyone is particularly interested.