CProgrammer

Has this ever been done? Image processing...


Hi guys, does anyone know the current state of research on taking camera input and removing any objects that don't move from the scene? I.e., given an otherwise motionless room, remove everything but the person walking through it, keeping him/her visible when standing still, of course. Has anybody worked in this area or been confronted with it? -CProgrammer

I'm not familiar with any particular work in this area, but that doesn't mean much, as I've spent virtually zero time thinking about machine vision [wink]

Just removing non-moving objects is pretty trivial: compare two images; if a given pixel's value in the first image is within a few percent of its value in the second, mark the pixel as non-movement and draw it as black or whatever.

The tricky part would be to keep a person visible after they stop moving. This would require some actual object-recognition and AFAIK that technology is still fairly limited.

You might be able to get away with some kind of hack, though: if a pixel was marked as "moving" at any time in the last, say, 5 frames, and suddenly it goes to "non-moving", then you keep it visible for a while and assume that it was something moving that is now standing still. Resolution and accuracy may be problems, but there should be ways around that (reveal a circle around each displayed pixel instead of just that pixel; adjust the threshold function to handle noise in the video stream; etc. etc.).
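A minimal sketch of the two ideas above (per-pixel differencing plus a short persistence window for pixels that recently moved). Frames are plain lists of grayscale rows; the threshold and `hold_frames` values are illustrative, not tuned.

```python
def motion_mask(prev, curr, threshold=10):
    """Mark a pixel as moving if its value changed by more than threshold."""
    return [[abs(c - p) > threshold for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

def apply_persistence(mask, age, hold_frames=5):
    """Keep a pixel visible for hold_frames frames after it last moved.

    age holds, per pixel, the number of frames since that pixel last
    moved; it is updated in place on every call.
    """
    visible = []
    for y, row in enumerate(mask):
        vis_row = []
        for x, moving in enumerate(row):
            age[y][x] = 0 if moving else age[y][x] + 1
            vis_row.append(age[y][x] < hold_frames)
        visible.append(vis_row)
    return visible
```

A person who stops walking stays visible for `hold_frames` frames before fading out, which is exactly the "assume it's something that is now standing still" hack, with all the resolution and noise caveats mentioned above.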

You can do as ApochPiQ stated, or you can compare each frame to an original frame without the person/actor in it, and remove all the similar pixels. This retains the actor whether they move or not; just don't let the actor wear anything colored similarly to the background. This is the simplest way to perform digital/pixel green screening.
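A minimal sketch of that reference-frame subtraction, assuming grayscale frames stored as lists of rows; the tolerance value is illustrative.

```python
def subtract_background(reference, frame, tolerance=12):
    """Blank out (set to 0) every pixel close to the empty reference frame,
    keeping only pixels that differ from it by more than tolerance."""
    return [[f if abs(f - r) > tolerance else 0
             for r, f in zip(rrow, frow)]
            for rrow, frow in zip(reference, frame)]
```

Anything near the background value is removed even if it belongs to the actor, which is why similarly colored clothing breaks this approach.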

Quote:
Original post by ApochPiQ
Just removing non-moving objects is pretty trivial: compare two images; if a given pixel's value in the first image is within a few percent of its value in the second, mark the pixel as non-movement and draw it as black or whatever.


This works only for trivial synthetic images. As soon as any kind of transformation is applied, it fails. Just the noise introduced by the RGB-YUV conversion, let alone video compression, makes this next to impossible, since compression can distort individual values by 10% or more (an 8-bit channel will show variations of 20 or more, mostly due to quantization of the Cr and Cb channels).

Usually it's simpler to work in feature space. Extract some useful features, such as edges or corner points, and compare those. This way you build a space determined by those features, which makes comparison simpler. For trivial algorithms, various statistical techniques are then used to detect changes.
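As one very simple instance of such a statistical comparison, you could compare per-block mean intensities instead of raw pixels; averaging over a block cancels out much of the per-pixel compression noise described above. The block size and threshold below are illustrative.

```python
def block_means(img, block=2):
    """Average intensity of each non-overlapping block x block tile."""
    h, w = len(img), len(img[0])
    means = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            vals = [img[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            row.append(sum(vals) / len(vals))
        means.append(row)
    return means

def changed_blocks(a, b, block=2, threshold=8.0):
    """Flag blocks whose mean intensity shifted by more than threshold."""
    ma, mb = block_means(a, block), block_means(b, block)
    return [[abs(x - y) > threshold for x, y in zip(ra, rb)]
            for ra, rb in zip(ma, mb)]
```

Symmetric noise of a few values per pixel averages out inside a block, so it does not trigger a change, while a genuinely different region still does.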

Various MPEG algorithms tackle this problem in a generic, but not necessarily semantically useful, manner. You get either motion vectors or a general transformation, but they cannot be directly classified.


Quote:
Resolution and accuracy may be problems, but there should be ways around that (reveal a circle around each displayed pixel instead of just that pixel; adjust the threshold function to handle noise in the video stream; etc. etc.).


I've tried doing correlation using RMS comparison by correlating sections of images, but for non-ideal streams it fails miserably. Another problem with real video is that you will need sub-pixel accuracy, or jitter becomes problematic.
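For reference, the RMS (root-mean-square) comparison of two image sections mentioned above can be sketched like this; the function name is illustrative.

```python
import math

def rms_difference(patch_a, patch_b):
    """Root-mean-square pixel difference between two equal-sized patches.
    0.0 means identical; larger values mean the patches differ more."""
    diffs = [(a - b) ** 2
             for ra, rb in zip(patch_a, patch_b)
             for a, b in zip(ra, rb)]
    return math.sqrt(sum(diffs) / len(diffs))
```

On a real, noisy stream this single number per patch is exactly what fails without sub-pixel alignment: a one-pixel jitter between frames can produce a large RMS value even when nothing in the scene moved.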

Once you have that, you can work out affine transforms, and if you are daring enough, use them to reconstruct camera parameters (reverse view transform). But the math here gets complicated fast: the solutions will not be exact and need to be fitted, and they may also contain incorrect readings. But when it works, you get a full "model space" representation of the area.

IMHO, the easiest way is to find some trivial convolution filter for edge detection, build reference points, correlate those, and use them to split the image into polygons. Then do intra-polygon comparisons. This is quick and simple; nVidia was even demoing some GPU-accelerated classification at some point, and IIRC it can even be done in real time. This works better with video, since you can use optimistic algorithms and expect that two consecutive frames will be very similar.
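A sketch of one such trivial convolution filter for edge detection (a Sobel-style horizontal-gradient kernel). This is only an illustration of the kernel idea, not what nVidia demoed; image borders are skipped for brevity.

```python
# Sobel-style kernel responding to vertical edges (horizontal gradient).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def convolve3x3(img, kernel):
    """Apply a 3x3 kernel to the interior pixels of a grayscale image;
    output is (h-2) x (w-2)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = sum(kernel[ky][kx] * img[y - 1 + ky][x - 1 + kx]
                      for ky in range(3) for kx in range(3))
            row.append(acc)
        out.append(row)
    return out
```

A sharp vertical intensity step produces a strong response along that edge, and those high-response pixels are the raw material for the reference points mentioned above.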

Once you have that, you can then start classifying different regions.

But it's been several years since I've dealt with the topic, so I imagine a lot of new things have popped up. I actually used it for scene reconstruction: shoot the scene with a camera, and out pops a 3D model. Last I checked, a lot of progress had been made in that area as well.

Edit: Come to think of it, I last worked with this about 8 years ago. How time flies, most of the above is therefore very dated advice.

Quote:
Original post by CProgrammer
does anyone know the current state of research on taking camera input and removing any objects that don't move from the scene, i.e. given an otherwise motionless room, removing everything but the person walking through it, keeping him/her visible when standing still, of course.


I'm probably skirting the edge of NDA breakage, but yes this is doable with a live stream of data.

Granted, I can't say how, but consider this a push in the direction of 'it isn't impossible on hardware around today' [smile]

There was a video of people using some sort of entropy-coding technology (the same tech used to encode video) to do video painting, resurfacing, and erasing in real time (and it tracked movement too); that's probably the state of the art. I couldn't find the video again, but it's on YouTube.

It relies on tracking pixels in a higher-dimensional space and some sort of energy-minimizing function, but I could be wrong; there was also a link to the paper.

Good Luck!

-ddn

Quote:
Just removing non-moving objects is pretty trivial: compare two images; if a given pixel's value in the first image is within a few percent of its value in the second, mark the pixel as non-movement and draw it as black or whatever.


The first (reference) image needs to be an image without anything in it (i.e., a blank wall). This way you always see the motion of the person.

Also, using the blank wall, you need to compare a series of images to find the camera's noise level, and use that value to ignore noise 'spikes'. Anything above the noise level is probably a pixel of a moving object.

Changes in light are a tough issue you will have to deal with. For example, if the wall is near a window, the sun could mess up the capture as its position changes relative to the reference image. The same goes for lights being turned on or off in the room.
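The noise-calibration step described above could be sketched like this: estimate a per-pixel threshold from several frames of the empty scene, then flag only pixels that exceed it. The names and the safety margin are illustrative.

```python
def noise_threshold(blank_frames, margin=2.0):
    """Per-pixel threshold: the largest deviation from the mean seen
    across the blank frames, scaled by a safety margin."""
    h, w = len(blank_frames[0]), len(blank_frames[0][0])
    thresh = []
    for y in range(h):
        row = []
        for x in range(w):
            samples = [f[y][x] for f in blank_frames]
            mean = sum(samples) / len(samples)
            row.append(margin * max(abs(s - mean) for s in samples))
        thresh.append(row)
    return thresh

def foreground(reference, frame, thresh):
    """Pixels that differ from the reference by more than the noise level."""
    return [[abs(f - r) > t for r, f, t in zip(rr, fr, tr)]
            for rr, fr, tr in zip(reference, frame, thresh)]
```

Note that this calibration assumes the lighting stays fixed; the window/sunlight problem above invalidates the thresholds and would force a re-calibration.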

Quote:
Original post by cdoty
The first (reference) image needs to be an image without anything in it (i.e., a blank wall). This way you always see the motion of the person.

Also, using the blank wall, you need to compare a series of images to find the camera's noise level, and use that value to ignore noise 'spikes'. Anything above the noise level is probably a pixel of a moving object.

Changes in light are a tough issue you will have to deal with. For example, if the wall is near a window, the sun could mess up the capture as its position changes relative to the reference image. The same goes for lights being turned on or off in the room.



I specifically avoided the idea of a static "empty" reference image precisely because it would be trivial to break the algorithm by simply changing the light levels in the scene.

However, as Antheus noted, my suggestion has plenty of other problems, so... [smile]

Doesn't the Xbox 360 game "You're in the Movies" do something like that? I've never played it, but from what I've read it seems to do what you described.

Ok, that doesn't help you in any way when it comes to how to do it, but I thought I should mention it.

Scale the image down, apply a Gaussian convolution filter, and compare pixel values. Try to track moving pixels, calculate both their speed and their acceleration, and maintain a 'predicted new position' for each group of pixels, so you can see whether an object is moving.
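A sketch of the downscale-and-smooth preprocessing suggested above (the speed/acceleration tracking per pixel group is omitted for brevity; names and kernel are illustrative).

```python
def downscale2x(img):
    """Shrink a grayscale image by averaging each 2x2 block into one pixel."""
    return [[(img[y][x] + img[y][x + 1] +
              img[y + 1][x] + img[y + 1][x + 1]) / 4
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]

def smooth(img):
    """3x3 Gaussian-style blur (kernel 1-2-1 / 2-4-2 / 1-2-1, sum 16),
    applied to interior pixels only."""
    k = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
    h, w = len(img), len(img[0])
    return [[sum(k[ky][kx] * img[y - 1 + ky][x - 1 + kx]
                 for ky in range(3) for kx in range(3)) / 16
             for x in range(1, w - 1)]
            for y in range(1, h - 1)]
```

Both steps suppress single-pixel noise before any frame-to-frame comparison, so fewer spurious "moving" pixels survive to the tracking stage.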
