Sony's E3 motion tracking

Started by
8 comments, last by swiftcoder 14 years, 10 months ago
I was quite impressed with Sony's motion tracking demo at E3 (especially interesting are the sword fight and archery sections). I would be very interested to figure out how they are doing this, and how hard it would be to replicate on a PC. As far as I know, the camera is pretty much a standard (visible spectrum) webcam. Since the 'wand' has a coloured ball on one end, tracking the wand in two dimensions is entirely trivial. I can also see how one could detect the 3D attitude/angle of the wand, by detecting the length and angle in 2D of the black handle in relation to the coloured tip, and applying a little trigonometry. However, they definitely appear to be performing true 3D tracking, and given that they only use a single camera, I don't see how they are measuring depth. Is it possible that the coloured ball is clearly enough defined that they can determine the visible radius, and use the camera's focal length to determine the distance? Or are there cleverer methods to accomplish this?
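For what it's worth, the radius idea reduces to the similar-triangles relation of a pinhole camera: if you know the ball's real diameter and the camera's focal length in pixels, the depth falls out directly. A minimal sketch, with made-up numbers for the focal length and ball size (not the Eye's real specs):

```python
def distance_from_radius(focal_px, real_diameter_m, pixel_diameter):
    """Pinhole-camera depth estimate: distance = f * D / d,
    by similar triangles between the real ball and its image."""
    return focal_px * real_diameter_m / pixel_diameter

# e.g. a 4 cm ball imaged 40 px wide by a camera with f = 600 px
print(distance_from_radius(600, 0.04, 40))  # 0.6 (metres)
```

The accuracy question is exactly the one raised below: at webcam resolution a one-pixel error in the measured diameter shifts the estimate noticeably, and the error grows with distance.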

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Quote:Original post by swiftcoder
I was quite impressed with Sony's motion tracking demo at E3 (especially interesting are the sword fight and archery sections). I would be very interested to figure out how they are doing this, and how hard it would be to replicate on a PC.
There's a lot of tracking software on PC, so I'm not that impressed, aside from the entertaining presentation.

Quote:
As far as I know, the camera is pretty much a standard (visible spectrum) webcam. Since the 'wand' has a coloured ball on one end, tracking the wand in 2-dimensions is entirely trivial.
I agree.

Quote:
I can also see how one could detect the 3D attitude/angle of the wand, by detecting the length and angle in 2D of the black handle, in relation to the coloured tip, and applying a little trigonometry.
I disagree; I think they said the handle has some hardware to measure that, and that would indeed be the simplest and most bug-free solution, as they have gyroscopes in the pads anyway. (edit: http://www.gametrailers.com/video/e3-09-playstation-3/50623 ; right at the beginning he claims the device has internal motion sensors).

Quote:
However, they definitely appear to be performing true 3D tracking, and given that they only use a single camera, I don't see how they are measuring depth.

The PlayStation Eye has "auto focus", so they have software access to the lens, and just like every camera nowadays, they can adjust it to keep the focus on the orange ball. From this it's probably quite easy to calculate (or look up) the distance.

Quote:
Is it possible that the coloured ball is clearly enough defined, that they can determine the visible radius, and use the camera's focal length to determine the distance? Or are there cleverer methods to accomplish this?

Although the cam seems to be quite good, I'm not sure the resolution is high enough. Even just for position tracking you usually have very high resolution cameras (for mocap). Calculating the radius would be possible, but I'm not sure it would be accurate enough.


but that's all just a guess :)

[Edited by - Krypt0n on June 5, 2009 5:13:41 AM]
Quote:...
Quote:
However, they definitely appear to be performing true 3D tracking, and given that they only use a single camera, I don't see how they are measuring depth.

The PlayStation Eye has "auto focus", so they have software access to the lens, and just like every camera nowadays, they can adjust it to keep the focus on the orange ball. From this it's probably quite easy to calculate (or look up) the distance.
(edit2: if you watch the 2nd stage, with the blocks, it seems to me like the blocks he holds on the sticks are constantly moving slightly back and forth, although he also tries to move them so fast that you barely notice it. Either that's a precision problem, or it's caused by the progressive lens adjustment.)
....
Sorry, accidentally pressed quote instead of edit :/
Quote:Original post by Krypt0n
Quote:Original post by swiftcoder
I was quite impressed with Sony's motion tracking demo at E3 (especially interesting are the sword fight and archery sections). I would be very interested to figure out how they are doing this, and how hard it would be to replicate on a PC.
There's a lot of tracking software on PC, so I'm not that impressed, aside from the entertaining presentation.
I haven't seen much in the way of true 3D tracking on PCs, at least not outside of dedicated mocap setups.
Quote:
Quote:I can also see how one could detect the 3D attitude/angle of the wand, by detecting the length and angle in 2D of the black handle, in relation to the coloured tip, and applying a little trigonometry.
I disagree; I think they said the handle has some hardware to measure that, and that would indeed be the simplest and most bug-free solution, as they have gyroscopes in the pads anyway. (edit: http://www.gametrailers.com/video/e3-09-playstation-3/50623 ; right at the beginning he claims the device has internal motion sensors).
Good catch, I missed that part of the video the first time through. Yes, that does make sense.
Quote:
Quote:However, they definitely appear to be performing true 3D tracking, and given that they only use a single camera, I don't see how they are measuring depth.
The PlayStation Eye has "auto focus", so they have software access to the lens, and just like every camera nowadays, they can adjust it to keep the focus on the orange ball. From this it's probably quite easy to calculate (or look up) the distance.
That would have been my first guess as well, but it doesn't seem to have any trouble tracking both wands - not sure if the focus trick would work well enough if the wands are separated by more than a couple of feet, but I could be wrong.

The Playstation Eye also appears to be fixed focus, from all the information I can find (chiefly wikipedia).

edit: the newly-minted wikipedia page on the motion controller does describe the radius method, but I have no idea of the provenance of their information.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Quote:Original post by swiftcoder

Quote:
Quote:However, they definitely appear to be performing true 3D tracking, and given that they only use a single camera, I don't see how they are measuring depth.
The PlayStation Eye has "auto focus", so they have software access to the lens, and just like every camera nowadays, they can adjust it to keep the focus on the orange ball. From this it's probably quite easy to calculate (or look up) the distance.
That would have been my first guess as well, but it doesn't seem to have any trouble tracking both wands - not sure if the focus trick would work well enough if the wands are separated by more than a couple of feet, but I could be wrong.
I didn't think about this, so yeah, the only thing left is probably calculating the area of orange pixels and projecting that to the distance.
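The area version is the same similar-triangles relation, just recovering the apparent radius from the orange pixel count first. A rough sketch (the focal length and ball radius here are placeholder values, not the Eye's real specs):

```python
import math

def distance_from_area(focal_px, real_radius_m, orange_pixel_count):
    # Treat the blob as a disc: area = pi * r_px^2, so recover the
    # apparent radius, then apply distance = f * R / r_px.
    r_px = math.sqrt(orange_pixel_count / math.pi)
    return focal_px * real_radius_m / r_px
```

Summing the area instead of measuring the diameter directly averages out the jagged blob edge, which should make the estimate a bit more stable at low resolution.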

Quote:
The Playstation Eye also appears to be fixed focus, from all the information I can find (chiefly wikipedia).

OK, that's moot now (as it wouldn't work with two wands), but I looked up my info from
Quote:http://uk.ps3.ign.com/articles/822/822743p1.html
Aside from its microphone, the PlayStation Eye bests the EyeToy in that it has two levels of zoom and utilizes an auto-focus feature so that you don't have to manually twist the lens in order to get a clear picture


Quote:Original post by Krypt0n
Quote:Original post by swiftcoder
Quote:
Quote:However, they definitely appear to be performing true 3D tracking, and given that they only use a single camera, I don't see how they are measuring depth.
The PlayStation Eye has "auto focus", so they have software access to the lens, and just like every camera nowadays, they can adjust it to keep the focus on the orange ball. From this it's probably quite easy to calculate (or look up) the distance.
That would have been my first guess as well, but it doesn't seem to have any trouble tracking both wands - not sure if the focus trick would work well enough if the wands are separated by more than a couple of feet, but I could be wrong.
I didn't think about this, so yeah, the only thing left is probably calculating the area of orange pixels and projecting that to the distance.
I am thinking of mocking up a prototype with OpenCV and a couple of coloured ping-pong balls, so the next question is whether there are any shortcuts we can take in the actual vision algorithm, given that we are looking for perfect spheres of known colour and diameter?

The obvious starting point is to use the provided blob-tracking, run some sort of histogram test to detect the correct colour, and some sort of heuristic to check the blob is (roughly) circular. I have a feeling though, that there may be a more efficient approach possible, given the pre-existing knowledge of the targets.

(keep in mind that while I have experimented with computer vision, I am pretty much a novice in this area)

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Quote:Original post by swiftcoder
Quote:Original post by Krypt0n
Quote:Original post by swiftcoder
Quote:
Quote:However, they definitely appear to be performing true 3D tracking, and given that they only use a single camera, I don't see how they are measuring depth.
The PlayStation Eye has "auto focus", so they have software access to the lens, and just like every camera nowadays, they can adjust it to keep the focus on the orange ball. From this it's probably quite easy to calculate (or look up) the distance.
That would have been my first guess as well, but it doesn't seem to have any trouble tracking both wands - not sure if the focus trick would work well enough if the wands are separated by more than a couple of feet, but I could be wrong.
I didn't think about this, so yeah, the only thing left is probably calculating the area of orange pixels and projecting that to the distance.
I am thinking of mocking up a prototype with OpenCV and and a couple of coloured ping-pong balls, so the next question is whether there are any shortcuts we can take with the actual vision algorithm, given that we are looking for perfect spheres of known colour and diameter?
You know the last frame's position and distance, so just like with motion estimation you can search nearby. The higher the framerate of the cam, the better the tracking will probably be, as the search window is determined by the distance moved between frames, and the window area scales with n*n.
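That windowed search is easy to sketch. This is just the general idea, not anything Sony-specific: scan only a small box around last frame's hit instead of the whole frame, falling back to a full scan if the target was lost:

```python
def find_in_window(frame, last_xy, radius, match):
    """Scan a (2*radius+1)^2 window centred on last frame's hit.
    `frame` is a row-major 2D grid of pixels; `match` is a predicate
    on a pixel value. Returns the first matching (x, y) or None."""
    h, w = len(frame), len(frame[0])
    lx, ly = last_xy
    for y in range(max(0, ly - radius), min(h, ly + radius + 1)):
        for x in range(max(0, lx - radius), min(w, lx + radius + 1)):
            if match(frame[y][x]):
                return (x, y)  # a real tracker would take the blob centroid
    return None
```

At 60 fps the ball moves far fewer pixels per frame than at 30 fps, so the window (and the per-frame cost) shrinks quadratically, which is exactly the n*n point above.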

Quote:
The obvious starting point is to use the provided blob-tracking, run some sort of histogram test to detect the correct colour, and some sort of heuristic to check the blob is (roughly) circular. I have a feeling though, that there may be a more efficient approach possible, given the pre-existing knowledge of the targets.

I think it's not really that hard. You can make:
- one pass where you calculate the distance of every pixel to your reference colour
- one threshold pass
- one median filter pass to reduce noise
- then find the bounding box of the remaining pixels

That should work pretty well, and those passes are kinda primitive and easy to implement.
The key is to have a colour on screen that is different from every other colour. If I were Sony, I'd have the ball change colours at 60 Hz (just like mocap 'balls'), and that way it would be easier to find.
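Those four passes are simple enough to sketch end to end. This is only an illustration of the pass structure on a toy frame of RGB tuples; the reference colour, the cheap L1 colour distance, and the threshold are arbitrary choices:

```python
REF = (255, 128, 0)   # reference "orange" (arbitrary)
THRESH = 60           # colour-distance threshold (arbitrary)

def colour_distance(px):
    # pass 1: cheap L1 distance to the reference colour
    return sum(abs(a - b) for a, b in zip(px, REF))

def track(frame):
    h, w = len(frame), len(frame[0])
    # pass 2: threshold the distances into a binary mask
    mask = [[colour_distance(frame[y][x]) < THRESH for x in range(w)]
            for y in range(h)]
    # pass 3: 3x3 median (majority vote) to knock out lone noise pixels
    def median(y, x):
        vals = [mask[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))]
        return sum(vals) * 2 > len(vals)
    clean = [[median(y, x) for x in range(w)] for y in range(h)]
    # pass 4: bounding box of the surviving pixels
    pts = [(x, y) for y in range(h) for x in range(w) if clean[y][x]]
    if not pts:
        return None
    xs, ys = zip(*pts)
    return (min(xs), min(ys), max(xs), max(ys))
```

Note the median pass also erodes the blob's corners slightly; for the distance-from-area trick you would count pixels before that pass, or use a proper circle fit.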


That MS demonstration: do you notice he's wearing glasses and also a bright orange pullover with some markers (black lines) near his hands and around his neck?
The girl later on always stands with her limbs spread; that's not that much different from EyeToy, where they just make a diff against the background. But you have to keep moving at least a bit to not be marked as background.

I think those techs are not that different.
Quote:Original post by swiftcoder
The obvious starting point is to use the provided blob-tracking, run some sort of histogram test to detect the correct colour, and some sort of heuristic to check the blob is (roughly) circular. I have a feeling though, that there may be a more efficient approach possible, given the pre-existing knowledge of the targets.

(keep in mind that while I have experimented with computer vision, I am pretty much a novice in this area)


You could also look at the Circular Hough Transform - it identifies circles in images provided you give it an approximate radius.
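For reference, the circular Hough transform for a known radius is only a few lines: every foreground pixel votes for all the centres of circles it could lie on, and the true centre collects the most votes. A toy pure-Python sketch (a real implementation would run on an edge image; OpenCV ships one as cv2.HoughCircles):

```python
import math

def hough_circle(points, radius, width, height, n_angles=64):
    """Vote for circle centres at a known radius.
    `points` are (x, y) foreground pixels; returns the hottest cell."""
    acc = {}
    for (x, y) in points:
        for k in range(n_angles):
            t = 2 * math.pi * k / n_angles
            # each point votes for every centre `radius` away from it
            cx = round(x - radius * math.cos(t))
            cy = round(y - radius * math.sin(t))
            if 0 <= cx < width and 0 <= cy < height:
                acc[(cx, cy)] = acc.get((cx, cy), 0) + 1
    return max(acc, key=acc.get)
```

Since the ball's radius varies with depth, a practical version would vote over a small range of radii, which conveniently yields the radius estimate needed for the distance calculation at the same time.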
Thanks for those hints! I also ran across this interesting project while doing a little research.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

This topic is closed to new replies.
