CProgrammer

Has this ever been done? Image processing...


Hi guys, does anyone know what the current research status is on taking camera input and removing any objects that don't move from the scene? I.e. if there is a motionless room, removing everything but the person walking in it, keeping him/her visible when standing still, of course. Has anybody been working in this area or been confronted with it? -CProgrammer

I'm not familiar with any particular work in this area, but that doesn't mean much as I've spent virtually zero time thinking about machine vision [wink]

Just removing non-motion objects is pretty trivial: compare two images; if a given pixel value in the first image is within a few percent of its value in the second image, then you mark the pixel as non-movement and draw it as black or whatever.
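A rough sketch of that per-pixel comparison might look like this (assuming 8-bit grayscale frames in a flat buffer; the Frame struct and names are just for illustration):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

// One 8-bit grayscale frame, row-major, width * height pixels.
struct Frame {
    int width = 0, height = 0;
    std::vector<unsigned char> pixels;
};

// Blacks out every pixel whose value in the current frame is within
// 'tolerancePercent' of its value in the previous frame; everything
// else (i.e. anything that changed) is kept as-is.
Frame maskStaticPixels(const Frame& prev, const Frame& cur, double tolerancePercent)
{
    Frame out = cur;
    for (std::size_t i = 0; i < cur.pixels.size(); ++i)
    {
        const int a = prev.pixels[i];
        const int b = cur.pixels[i];
        // Small absolute floor so near-black pixels don't register as motion
        // from the slightest sensor noise.
        const double allowed = std::max(2.0, a * tolerancePercent / 100.0);
        if (std::abs(a - b) <= allowed)
            out.pixels[i] = 0;   // no movement here -> draw as black
    }
    return out;
}
```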

The tricky part would be to keep a person visible after they stop moving. This would require some actual object-recognition and AFAIK that technology is still fairly limited.

You might be able to get away with some kind of hack, though: if a pixel was marked as "moving" at any time in the last, say, 5 frames, and suddenly it goes to "non-moving", then you keep it visible for a while and assume that it was something moving that is now standing still. Resolution and accuracy may be problems, but there should be ways around that (reveal a circle around each displayed pixel instead of just that pixel; adjust the threshold function to handle noise in the video stream; etc. etc.).
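A sketch of that "recently moved" hack, assuming a per-pixel motion mask is already available and pixels are held visible for a fixed number of frames (all names illustrative):

```cpp
#include <cstddef>
#include <vector>

// Per-pixel "recently moved" timer: motion resets the counter to holdFrames,
// no motion lets it tick down. A pixel stays visible while its counter is
// positive, so something that stops moving remains on screen for a while.
struct MotionHold {
    int holdFrames;
    std::vector<int> age;   // one counter per pixel

    MotionHold(std::size_t pixelCount, int hold)
        : holdFrames(hold), age(pixelCount, 0) {}

    // movingMask[i] != 0 means pixel i was flagged as moving this frame.
    // Returns a mask of pixels that should currently be displayed.
    std::vector<unsigned char> update(const std::vector<unsigned char>& movingMask)
    {
        std::vector<unsigned char> visible(age.size(), 0);
        for (std::size_t i = 0; i < age.size(); ++i)
        {
            if (movingMask[i])
                age[i] = holdFrames;   // fresh motion: reset the timer
            else if (age[i] > 0)
                --age[i];              // no motion: let the timer run down
            visible[i] = (age[i] > 0) ? 255 : 0;
        }
        return visible;
    }
};
```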

You can do as ApochPiQ stated, or you can compare each frame to the original frame without the person/actor and remove all the similar pixels. This will retain the actor, whether they move or not; just don't let the actor wear anything colored similarly to the background. This is the simplest way to perform digital/pixel green screening.
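A minimal sketch of that reference-frame subtraction, assuming 8-bit grayscale frames and a fixed tolerance (names are illustrative only):

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Compares each new frame against a reference frame captured with nobody in
// the scene. Pixels that stay close to the reference are background and get
// cleared; everything else (the actor) is kept, moving or not.
std::vector<unsigned char> subtractBackground(
    const std::vector<unsigned char>& reference,  // empty-scene frame
    const std::vector<unsigned char>& frame,      // current frame
    int tolerance)                                // allowed per-pixel difference
{
    std::vector<unsigned char> out(frame.size(), 0);
    for (std::size_t i = 0; i < frame.size(); ++i)
    {
        // Different enough from the reference -> keep the actor's pixel;
        // otherwise leave it black (background removed).
        if (std::abs(int(frame[i]) - int(reference[i])) > tolerance)
            out[i] = frame[i];
    }
    return out;
}
```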

Quote:
Original post by ApochPiQ
Just removing non-motion objects is pretty trivial: compare two images; if a given pixel value in the first image is within a few percent of its value in the second image, then you mark the pixel as non-movement and draw it as black or whatever.


This works only for trivial synthetic images. As soon as any kind of transformation is applied, it fails. Just the noise introduced by the RGB-YUV conversion, let alone video compression, makes this next to impossible, since compression can distort individual values by 10% or more (an 8-bit channel will show variations of 20 or more, mostly due to quantization of the Cr and Cb channels).

Usually it's simpler to work in feature space. Extract some useful features, such as edges or corner points, and compare those. This way you build a space determined by those features, which makes the comparison simpler. For simpler algorithms, various statistical techniques are then used to detect changes.
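For example, a crude edge-strength feature map (Sobel gradient magnitude) might be extracted like this, assuming an 8-bit grayscale image in a flat buffer:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Builds a crude edge-strength "feature map" (Sobel gradient magnitude) from
// an 8-bit grayscale image. Comparing these maps between frames is less
// sensitive to uniform brightness shifts and compression noise than comparing
// raw pixel values directly.
std::vector<float> edgeFeatures(const std::vector<unsigned char>& img,
                                int width, int height)
{
    std::vector<float> mag(std::size_t(width) * height, 0.0f);
    for (int y = 1; y < height - 1; ++y)
    {
        for (int x = 1; x < width - 1; ++x)
        {
            auto p = [&](int dx, int dy) {
                return int(img[std::size_t(y + dy) * width + (x + dx)]);
            };
            // Standard 3x3 Sobel kernels for horizontal/vertical gradients.
            int gx = -p(-1, -1) - 2 * p(-1, 0) - p(-1, 1)
                     + p(1, -1) + 2 * p(1, 0) + p(1, 1);
            int gy = -p(-1, -1) - 2 * p(0, -1) - p(1, -1)
                     + p(-1, 1) + 2 * p(0, 1) + p(1, 1);
            mag[std::size_t(y) * width + x] = std::sqrt(float(gx * gx + gy * gy));
        }
    }
    return mag;
}
```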

Various MPEG algorithms tackle this problem in a generic, but not necessarily semantically useful, manner. You get either motion vectors or a general transformation, but they cannot be directly classified.


Quote:
Resolution and accuracy may be problems, but there should be ways around that (reveal a circle around each displayed pixel instead of just that pixel; adjust the threshold function to handle noise in the video stream; etc. etc.).


I've tried correlating sections of images using an RMS comparison, but for non-ideal streams it fails miserably. Another problem with real video is that you will need sub-pixel accuracy, or jitter becomes problematic.
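For reference, an RMS comparison of one image block against the corresponding block of another frame could be sketched like this (grayscale buffers assumed; names are illustrative):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Root-mean-square difference between corresponding blocks of two frames;
// a crude way to correlate sections of images. As noted above, this tends
// to fail on noisy, compressed streams unless the blocks are aligned with
// sub-pixel accuracy.
double blockRms(const std::vector<unsigned char>& a,
                const std::vector<unsigned char>& b,
                int width, int blockX, int blockY, int blockSize)
{
    double sum = 0.0;
    for (int y = 0; y < blockSize; ++y)
    {
        for (int x = 0; x < blockSize; ++x)
        {
            const std::size_t idx = std::size_t(blockY + y) * width + (blockX + x);
            const double d = double(a[idx]) - double(b[idx]);
            sum += d * d;
        }
    }
    return std::sqrt(sum / (double(blockSize) * blockSize));
}
```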

Once you have that, you can work out affine transforms and, if you are daring enough, use them to reconstruct the camera parameters (reversing the view transform). But the math here gets complicated fast: the solutions will not be exact and need to be fitted, and they may also contain incorrect readings. When it works, though, you get a full "model space" representation of the area.

IMHO, the easiest way is to find some trivial convolution filter for edge detection, build reference points, correlate those, and use them to split the image into polygons. Then do intra-polygon comparisons. This is quick and simple; nVidia was even demoing some GPU-accelerated classification at some point, and IIRC it can even be done in real time. This works better with video, since you can use optimistic algorithms and expect that two consecutive frames will be very similar.

Once you have that, you can then start classifying different regions.

But it's been several years since I've dealt with the topic, so I imagine a lot of new things have popped up. I actually used it for scene reconstruction: shoot the scene with a camera, and out pops a 3D model. Last I checked, a lot of progress had been made in that area as well.

Edit: Come to think of it, I last worked with this about 8 years ago. How time flies; most of the above is therefore very dated advice.

Quote:
Original post by CProgrammer
does anyone know what the current research status is on taking camera input and removing any objects that don't move from the scene? I.e. if there is a motionless room, removing everything but the person walking in it, keeping him/her visible when standing still, of course.


I'm probably skirting the edge of NDA breakage, but yes this is doable with a live stream of data.

Granted, I can't say how, but consider this a push in the direction of 'it isn't impossible on hardware around today' [smile]

There was a video of people using some sort of entropy-encoding technology (the same tech people use to encode video) to do video painting, resurfacing, and erasing in real time (and it tracked movement too); that's probably the state of the art. I couldn't find the video again, but it's on YouTube.

It relies on tracking pixels in a higher-dimensional space and some sort of energy-minimizing function, but I could be wrong; there was also a link to the paper.

Good Luck!

-ddn

Quote:
Just removing non-motion objects is pretty trivial: compare two images; if a given pixel value in the first image is within a few percent of its value in the second image, then you mark the pixel as non-movement and draw it as black or whatever.


The first (reference) image needs to be an image without anything in it (i.e. a blank wall). This way you always see the motion of the person.

Also, using the blank wall, you need to compare a series of images to find the noise level of the camera, and apply this value to ignore noise 'spikes'. Everything above the noise level is probably a pixel of a moving object.
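A sketch of that noise estimation, assuming a handful of frames of the empty scene stored as grayscale buffers; the result is a per-pixel standard deviation that the motion threshold can then be scaled from:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Estimates the camera's noise from a series of frames of the empty scene:
// for each pixel, take the standard deviation of its value across the frames.
// The motion threshold can then be set a few standard deviations above this,
// so ordinary sensor noise is ignored.
std::vector<float> estimateNoise(const std::vector<std::vector<unsigned char>>& blankFrames)
{
    const std::size_t pixels = blankFrames.front().size();
    const double n = double(blankFrames.size());
    std::vector<float> stddev(pixels, 0.0f);

    for (std::size_t i = 0; i < pixels; ++i)
    {
        double mean = 0.0;
        for (const auto& f : blankFrames)
            mean += f[i];
        mean /= n;

        double var = 0.0;
        for (const auto& f : blankFrames)
        {
            const double d = f[i] - mean;
            var += d * d;
        }
        stddev[i] = float(std::sqrt(var / n));
    }
    return stddev;
}
```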

Changes in lighting are a tough issue that you will have to deal with. For example, if the wall is near a window, the sun could mess up the capture as its position changes, compared to the reference image. The same goes for lights turned on/off in the room.

Quote:
Original post by cdoty
The first (reference) image needs to be an image without anything in it (i.e. a blank wall). This way you always see the motion of the person.

Also, using the blank wall, you need to compare a series of images to find the noise level of the camera, and apply this value to ignore noise 'spikes'. Everything above the noise level is probably a pixel of a moving object.

Changes in lighting are a tough issue that you will have to deal with. For example, if the wall is near a window, the sun could mess up the capture as its position changes, compared to the reference image. The same goes for lights turned on/off in the room.



I specifically avoided the idea of a static "empty" reference image precisely because it would be trivial to break the algorithm by simply changing the light levels in the scene.

However, as Antheus noted, my suggestion has plenty of other problems, so... [smile]

Doesn't the Xbox 360 game "You're in the Movies" do something like that? I've never played it, but from what I've read it seems to do what you described.

Ok, that doesn't help you in any way when it comes to how to do it, but I thought I should mention it.

Scale the image down, apply a Gaussian convolution filter, and compare pixel values. Try to track moving pixels, calculate both their speed and their acceleration, and keep a 'calculated new position' for each group of pixels, so you can see whether an object is moving.
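A sketch of the downscale-and-blur step, assuming an 8-bit grayscale frame (the tracking and prediction part is left out; names are illustrative):

```cpp
#include <cstddef>
#include <vector>

// Halves the resolution (2x2 box average) and applies a 3x3 Gaussian blur.
// Comparing these small, blurred frames is cheaper and far less sensitive to
// per-pixel noise than comparing full-resolution images.
std::vector<unsigned char> downscaleAndBlur(const std::vector<unsigned char>& img,
                                            int width, int height,
                                            int& outW, int& outH)
{
    outW = width / 2;
    outH = height / 2;

    // 2x2 average downscale.
    std::vector<unsigned char> half(std::size_t(outW) * outH);
    for (int y = 0; y < outH; ++y)
        for (int x = 0; x < outW; ++x)
        {
            const int sum = img[std::size_t(2 * y) * width + 2 * x]
                          + img[std::size_t(2 * y) * width + 2 * x + 1]
                          + img[std::size_t(2 * y + 1) * width + 2 * x]
                          + img[std::size_t(2 * y + 1) * width + 2 * x + 1];
            half[std::size_t(y) * outW + x] = (unsigned char)(sum / 4);
        }

    // 3x3 Gaussian kernel (1 2 1 / 2 4 2 / 1 2 1), normalized by 16.
    std::vector<unsigned char> blurred = half;
    static const int k[3][3] = { {1, 2, 1}, {2, 4, 2}, {1, 2, 1} };
    for (int y = 1; y < outH - 1; ++y)
        for (int x = 1; x < outW - 1; ++x)
        {
            int sum = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    sum += k[dy + 1][dx + 1] * half[std::size_t(y + dy) * outW + (x + dx)];
            blurred[std::size_t(y) * outW + x] = (unsigned char)(sum / 16);
        }
    return blurred;
}
```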

Quote:
Original post by cdoty
The first (reference) image needs to be an image without anything in it (i.e. a blank wall). This way you always see the motion of the person.

Also, using the blank wall, you need to compare a series of images to find the noise level of the camera, and apply this value to ignore noise 'spikes'. Everything above the noise level is probably a pixel of a moving object.

Changes in lighting are a tough issue that you will have to deal with. For example, if the wall is near a window, the sun could mess up the capture as its position changes, compared to the reference image. The same goes for lights turned on/off in the room.
Apple's iChat uses this technique to place custom backgrounds in video chats. It works fairly well, as long as your camera is solidly fixed in place (even gentle vibrations will mess up the image matching), and the lighting doesn't change. However, video chats tend to be short enough to side-step the light problem.

Great replies, very inspiring, thanks.
It is a tough issue, mainly due to the noise problems that have been mentioned. Another one is the fact that the camera may readjust when someone comes closer, which changes the background. I doubt that a simple pixel-by-pixel comparison between the live feed and a static image would give good results, even when scaling down and applying blur-type filters, because one would then lose detail and the silhouette would include parts of the background.
I will be checking out iChat and the Xbox game, which should shed some light on the current state of the technology, although a research paper may be more informative. I suppose there must be a reason why the film industry uses single-colored screens, but the question is how much of a quality difference there is; movies want to get every last bit of detail.

Quote:
Original post by CProgrammer
I suppose there must be a reason why the film industry uses single-colored screens, but the question is how much of a quality difference there is; movies want to get every last bit of detail.
One reason may be legacy - the film/TV industry has been using bluescreen techniques just about forever.

Hello, just a 'reverse view' on the matter.
In singling out an object/actor in a sequence of images, one can assume that the focus is on the object/actor at all times, which makes him/her/it less noisy/more constant than the background.
As mentioned, the background may change due to focusing, but this focusing is intended to keep the object constant, which can be exploited.
Depending on how smooth you want the edges to be, this is basically an edge detection problem, with some added information in the form of multiple frames.
You'll most probably end up with some edge detection algorithm assisted by proper filters (simple blurring, e.g.), threshold values, etc.
Apply this to every frame separately, plus an evaluator to determine how well the fits match between frames.

Maybe you can try this as a reference:
http://ocw.mit.edu/OcwWeb/Electrical-Engineering-and-Computer-Science/6-801Fall-2004/CourseHome/index.htm

