Tracking Motion in Scene

8 comments, last by Black Knight 16 years, 2 months ago
Suppose I have a video and I want to track motion in it. How would I do this? I've been thinking about this for a while and I'm a bit stumped. For a little more background, let's say the video is a static scene, but there is an object that moves across the screen and I want to track that object's position. I was thinking that I could take the difference between each pixel in the previous frame and the current one, and then find the largest area of pixels with the largest difference to locate the object. Then I thought that this wouldn't work for some reason. Any insights? Thanks.
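A minimal sketch of the frame-differencing idea described above, in plain Python. It assumes each frame is a grayscale image stored as a list of rows of 0-255 integers; the function names `diff_mask` and `bounding_box` and the threshold of 30 are made up for illustration.

```python
def diff_mask(prev, curr, threshold=30):
    """Return a grid of booleans: True where a pixel changed by more than threshold."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def bounding_box(mask):
    """Return (left, top, right, bottom) around all True cells, or None if there are none."""
    xs = [x for row in mask for x, changed in enumerate(row) if changed]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


# Toy 8x6 scene: a bright 2x2 object moves from columns 1-2 to columns 4-5.
prev = [[0] * 8 for _ in range(6)]
curr = [[0] * 8 for _ in range(6)]
for y in (2, 3):
    for x in (1, 2):
        prev[y][x] = 200
    for x in (4, 5):
        curr[y][x] = 200

box = bounding_box(diff_mask(prev, curr))
print(box)  # (1, 2, 5, 3)
```

Note that the box covers both the spot the object left and the spot it moved to, because both changed between the two frames. That may be one reason plain frame-to-frame differencing feels like it "wouldn't work" for pinning down the exact position.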
That sounds pretty much how I'd do it. Are you trying to track any motion, or do you want to isolate separate things moving in the scene (e.g. two people walking around different areas, and possibly crossing paths)? If it's just for running a security program off a webcam feed, for example, then I'd take a snapshot from the webcam every second (or whatever interval is desirable) and check a spread of pixels against the last image. Checking every single pixel is a bit overkill, so say every 5th or 10th or whatever. You probably need to take slight colour variations into account, so you have to decide on a tolerance level before the program decides that there is in fact something different in the scene compared to last time. Then it just becomes a matter of expanding out from the pixels that are different to find the entire area of difference, if you are really interested in tracking what's changed and not just saving an image for log purposes. If it is for security purposes then I'd probably take a snapshot and also save another file with the same filename but a different extension, which you use to store your 'motion tracking' or rather 'difference tracking' data. That way you can see a clean picture, or you can have your program overlay colour or a bounding box or however you track the difference.
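Here is a rough sketch of the "check every Nth pixel, then expand out" idea, assuming grayscale frames stored as lists of rows of integers. The names `find_changed_region` and `grow_region`, the step of 5 and the tolerance of 25 are illustrative choices, not anything from a real library.

```python
from collections import deque


def changed(prev, curr, x, y, tolerance):
    """True if the pixel at (x, y) differs by more than the tolerance."""
    return abs(curr[y][x] - prev[y][x]) > tolerance


def grow_region(prev, curr, start_x, start_y, tolerance):
    """Flood-fill outwards from one changed pixel; return the region's bounding box."""
    height, width = len(curr), len(curr[0])
    seen = {(start_x, start_y)}
    queue = deque([(start_x, start_y)])
    left = right = start_x
    top = bottom = start_y
    while queue:
        x, y = queue.popleft()
        left, right = min(left, x), max(right, x)
        top, bottom = min(top, y), max(bottom, y)
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and (nx, ny) not in seen and changed(prev, curr, nx, ny, tolerance):
                seen.add((nx, ny))
                queue.append((nx, ny))
    return left, top, right, bottom


def find_changed_region(prev, curr, step=5, tolerance=25):
    """Probe every step-th pixel; on the first hit, grow the full changed region."""
    for y in range(0, len(curr), step):
        for x in range(0, len(curr[0]), step):
            if changed(prev, curr, x, y, tolerance):
                return grow_region(prev, curr, x, y, tolerance)
    return None


# Toy 20x20 scene: slight noise at (0, 0) plus a 5x5 object.
prev = [[0] * 20 for _ in range(20)]
curr = [row[:] for row in prev]
curr[0][0] = 10  # below the tolerance, so it is ignored
for y in range(8, 13):
    for x in range(9, 14):
        curr[y][x] = 200

region = find_changed_region(prev, curr)
print(region)  # (9, 8, 13, 12)
```

The trade-off is that anything smaller than the probe spacing can slip between the sampled pixels, so the step should be chosen relative to the smallest object worth detecting.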

Further to that, you can also set up areas of the scene to be ignored, such as a blinking light, bugs around a light, a TV screen, etc.
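One possible way to express ignored areas is a list of rectangles that are skipped during the comparison. This is only a sketch: the `(left, top, right, bottom)` rectangle format and the names `in_any_rect` and `changed_pixels` are assumptions made for the example.

```python
def in_any_rect(x, y, rects):
    """True if (x, y) lies inside any (left, top, right, bottom) rectangle, edges included."""
    return any(l <= x <= r and t <= y <= b for (l, t, r, b) in rects)


def changed_pixels(prev, curr, tolerance=25, ignore_rects=()):
    """List the (x, y) pixels that changed by more than tolerance, skipping ignored areas."""
    return [
        (x, y)
        for y, (prev_row, curr_row) in enumerate(zip(prev, curr))
        for x, (p, c) in enumerate(zip(prev_row, curr_row))
        if abs(c - p) > tolerance and not in_any_rect(x, y, ignore_rects)
    ]


# Toy 8x5 scene: a blinking light at (1, 1) and a real change at (5, 3).
prev = [[50] * 8 for _ in range(5)]
curr = [row[:] for row in prev]
curr[1][1] = 255
curr[3][5] = 200

everything = changed_pixels(prev, curr)
filtered = changed_pixels(prev, curr, ignore_rects=[(0, 0, 2, 2)])
print(everything)  # [(1, 1), (5, 3)]
print(filtered)    # [(5, 3)]
```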

Give a bit more info on what your end goal is, because how cool and complicated your code gets depends on what you are trying to accomplish (e.g. controlling a turret and accurately predicting where to shoot to hit a moving target based on the video feed).
This is my term project at school. Well, it was supposed to be detecting pedestrians in videos with neural nets, but I couldn't manage to do it with a neural network or detect any pedestrians :)
But I can detect motion.
I do it by extracting the background and then comparing it with the incoming frames from the camera.
To extract the background I compare two consecutive frames and look at the difference. If the difference is very small I add that pixel to the background; if there is an intensity change I don't add that pixel. After some time you get a background of the scene without the moving objects in it.
Then I compare it with the actual frame to see the differences; this marks the moving objects.
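The background extraction described above could look roughly like the following. This is a Python sketch written from the description, not the actual C# application; the thresholds and the names `update_background` and `foreground_mask` are assumptions, and frames are grayscale lists of rows.

```python
def update_background(background, prev, curr, stable_threshold=5):
    """Copy into the background every pixel that barely changed between two frames."""
    for y, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            if abs(c - p) <= stable_threshold:
                background[y][x] = c


def foreground_mask(background, frame, detect_threshold=30):
    """True where the frame differs clearly from an already learned background pixel."""
    return [
        [b is not None and abs(f - b) > detect_threshold for b, f in zip(bg_row, row)]
        for bg_row, row in zip(background, frame)
    ]


def make_frame(obj_x, width=6, height=3):
    """Toy scene: uniform background of 50 with one bright pixel in the middle row."""
    frame = [[50] * width for _ in range(height)]
    frame[1][obj_x] = 200
    return frame


# The object moves one pixel to the right in each frame.
frames = [make_frame(x) for x in range(4)]
background = [[None] * 6 for _ in range(3)]  # None means "not learned yet"
for prev, curr in zip(frames, frames[1:]):
    update_background(background, prev, curr)

mask = foreground_mask(background, frames[-1])
moving = [(x, y) for y, row in enumerate(mask) for x, hit in enumerate(row) if hit]
print(moving)  # [(3, 1)]
```

Because the object never stays on one pixel for two consecutive frames, it is never copied into the background, so the learned background ends up containing only the static scene.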

Here are some screenshots from my application:

[Screenshot: Background]

[Screenshot: Detection (background hasn't fully formed, so detection is failing)]

[Screenshot: Detection (after the background has formed)]

If a moving person, car or whatever moves to a spot and waits there long enough, it will become part of the background and will no longer be detected.
Also, the detection threshold needs tweaking: if it is too small, everything will be detected, including leaves of trees and small objects. The block width/height is the size of the detection rectangle which is scanned over the image. History comparison, on the other hand, is done per pixel.
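As an illustration of how a block-sized detection rectangle and a threshold might interact, here is a sketch that averages the difference inside each block and flags the blocks above the threshold. The averaging rule is a guess, since the post does not say how the application scores a block, and `detect_blocks` and its parameters are hypothetical names.

```python
def detect_blocks(background, frame, block_w=4, block_h=4, threshold=20.0):
    """Return (left, top, right, bottom) for each block whose mean absolute difference exceeds threshold."""
    height, width = len(frame), len(frame[0])
    hits = []
    for top in range(0, height, block_h):
        for left in range(0, width, block_w):
            bottom = min(top + block_h, height)  # exclusive edge, clipped to the image
            right = min(left + block_w, width)
            total = sum(
                abs(frame[y][x] - background[y][x])
                for y in range(top, bottom)
                for x in range(left, right)
            )
            mean_diff = total / ((bottom - top) * (right - left))
            if mean_diff > threshold:
                hits.append((left, top, right - 1, bottom - 1))
    return hits


# Toy 8x8 scene: one flickering "leaf" pixel and a 4x4 "car".
background = [[50] * 8 for _ in range(8)]
frame = [row[:] for row in background]
frame[1][1] = 250
for y in range(4, 8):
    for x in range(4, 8):
        frame[y][x] = 200

strict = detect_blocks(background, frame, threshold=20.0)
loose = detect_blocks(background, frame, threshold=10.0)
print(strict)  # [(4, 4, 7, 7)]
print(loose)   # [(0, 0, 3, 3), (4, 4, 7, 7)]
```

With the lower threshold the single leaf pixel is enough to flag its block, which matches the observation that too small a threshold picks up leaves and other small objects.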
Well, basically that's it. Hope that helps a little.
I may post the source code of the application later; it's written in C#. It can work on a local webcam or a live cam (MJPEG).
Wow, thanks for that information. Black Knight, your application does what I want to do, but my situation is much simpler than a crowd of people. Let's say that I have a camera pointed at a single-lane road and a car comes by every now and then. I want to track the car's position in the frame. Also, what did you use for video playback? What did you do to put those red outlines over the video?

Thanks for the replies.
My guess, judging from the Stuttgart banner at the top, is that he is just pulling a webcam image from the interwebs. If you know how to write code to download a file from the net, or have components that can do it for you, then you are halfway there. Aren't most webcam feeds like this just a static image that gets rewritten each time the webcam takes a snapshot (with the webpage you normally view it on using a timer to reload the page)? How you go about getting the image really depends on how you want to set up your webcam (assuming that's what you are using). If you set it up to just write a file periodically, then all you have to do is load that file onto a canvas and read the pixel data as normal. If you set it up to stream video, it becomes a bit different and more black magic, as you need to know more about how to access the webcam device's stream (and more stuff I have no info about). If you know how to do that, then I'm fairly sure it's exactly the same method to get a TV tuner feed, if you have a card installed on your system and want to play with that area too.
Black Knight: from re-reading what you wrote I get the impression that your program does NOT get affected by the sunlight changing. Is that correct? Is the intention of your program, as it is, to track someone/something that has come into the scene but possibly stopped moving? E.g. a person walks to the middle of the scene and gets tracked the entire time, and even if s/he stops moving completely in the middle of the scene they still get tracked for some time, unless they stay motionless for long enough, at which point they are considered part of the scene/background?
My question would probably be more about what you used to display the video and read pixel values. Did you use built-in C# classes, DirectShow...?
Ah... in Delphi a whole range of stuff has a TCanvas, inherited from higher-level components, which provides the ability to read/write individual pixels on said canvas. Standard components like a TImage have one, the generic Windows form with nothing on it has a Canvas property, and I am fairly sure a lot of the other generic components provide the ability to do your own drawing on top of the component. Accessing the webcam feed is where it starts to get interesting. I would imagine it is probably just a case of passing a reference to the canvas property of whatever is intended to receive the image to a function related to the webcam device. When the webcam wants to draw a new frame it simply accesses the canvas you passed the pointer to and draws away, not caring about anything other than having somewhere to draw. But this is all Delphi talk and somewhat guessing at best. I think even a TPanel has a canvas property, which I wouldn't have thought to exploit.

In terms of C++ I haven't the faintest idea. I don't even know if your IDE provides simple Windows 'things' like buttons, string grids, checkboxes, etc., but if it does, look for something along the lines of an Image component, or just read up on accessing the form's canvas itself. Maybe C++ works completely differently, I do not know.

-edit- Have a look here anyway: http://www.relisoft.com/win32/canvas.html
nb: You are right. If a person comes into the scene he is tracked, but if he stays motionless long enough he will become part of the background and will not be detected.
Sunlight changes are not detected because the light changes gradually, very slowly over time, so the changes are always added to the background.

To get the camera images from the web, I found some code for grabbing images from a live cam. It's an MJPEG camera class which grabs a stream from the web, looks for the end delimiter, then copies the frame. There are different live cams around: some of them just send static images, some work on streams. MJPEG was a little more complex than the static image senders.
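For the "look for the end delimiter, then copy the frame" step, one common approach is to scan the incoming bytes for the JPEG start-of-image (FF D8) and end-of-image (FF D9) markers. The sketch below assumes that approach; it is not the class mentioned above, and `extract_frames` is a made-up name.

```python
JPEG_START = b"\xff\xd8"  # JPEG start-of-image marker
JPEG_END = b"\xff\xd9"    # JPEG end-of-image marker


def extract_frames(buffer):
    """Split complete JPEG frames out of buffer; return (frames, leftover bytes)."""
    frames = []
    while True:
        start = buffer.find(JPEG_START)
        if start == -1:
            # Keep the last byte in case a start marker is split across two reads.
            return frames, buffer[-1:]
        end = buffer.find(JPEG_END, start + 2)
        if end == -1:
            # Frame is still incomplete: keep it for the next read.
            return frames, buffer[start:]
        frames.append(buffer[start:end + 2])
        buffer = buffer[end + 2:]


# Simulated stream: one complete frame, then the beginning of a second one.
chunk = b"--boundary\r\n" + b"\xff\xd8AAA\xff\xd9" + b"\r\n--boundary\r\n" + b"\xff\xd8BB"
frames, leftover = extract_frames(chunk)
print(len(frames), leftover)

# The next read completes the second frame.
more_frames, leftover2 = extract_frames(leftover + b"B\xff\xd9")
print(len(more_frames), leftover2)
```

In a real reader the leftover bytes would be prepended to each new chunk read from the network. A plain marker scan like this can be fooled by JPEGs that embed a thumbnail, so splitting on the stream's multipart boundary is likely more robust where the server provides one.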

This topic is closed to new replies.
