Kinect Programming: Foreword
I have been planning for some time to do some writing about my adventures working with the Kinect. It has been a long time coming, but I have finally been able to start making some progress with this effort. While I haven't completed enough work for a full article, I thought I would share some initial results along with my comments about what I am seeing and where I will be heading. So in book terms, this would be a foreword...
Last time around, I mentioned some of my motivations for doing some Kinect programming. I refrain from using the term 'hacking', as it seems that all mainstream media think that when you develop against the Kinect you are a 'hacker' - I much prefer the term developer, or if we take ourselves very seriously then perhaps computer scientist... In either case, among developers being a 'hack' is not a compliment. Lots and lots of math and many complicated algorithms go into computer vision (which I consider Kinect a part of), so please let your fellow man know that we aren't all hacks!
Anyway, I had managed to get both the color and depth buffers read into D3D11 resources in my first effort. With this accomplished, I moved on to using those resources to reconstruct the 3D scene seen in the depth image. My first shot at this simply treated the depth image as a height field and rendered the result as a wireframe mesh. However, the sensor is set up to provide a fairly large depth range (4-12 feet), so it quickly became obvious that the wireframe mesh could only display the rough shapes present in the scene.
After realizing this was the case, I switched my visualization over to a solid fill mode. This doesn't work on its own, since there isn't enough contrast between neighboring depth values. So I decided to generate normal vectors from the depth data, and then do a false lighting calculation to indicate where the surface shape varies. Here are some of the results:
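The math behind the false lighting is roughly this (my own CPU-side sketch; the actual rendering does the equivalent per pixel on the GPU): estimate a normal from the central-difference depth gradients, then shade with a simple diffuse N·L term.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3 Normalize(Vec3 v)
{
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Central-difference normal from the depth image. The surface tangents are
// (1, 0, dz/dx) and (0, 1, dz/dy); their cross product is (-dz/dx, -dz/dy, 1).
Vec3 DepthNormal(const std::vector<float>& depth, int width, int height,
                 int col, int row)
{
    auto at = [&](int c, int r) {
        c = std::max(0, std::min(width - 1, c));   // clamp at the image border
        r = std::max(0, std::min(height - 1, r));
        return depth[r * width + c];
    };
    float dzdx = (at(col + 1, row) - at(col - 1, row)) * 0.5f;
    float dzdy = (at(col, row + 1) - at(col, row - 1)) * 0.5f;
    return Normalize({-dzdx, -dzdy, 1.0f});
}

// "False lighting": brightness is just the clamped dot product with a
// made-up light direction, enough to reveal where the surface shape changes.
float FalseLight(Vec3 n, Vec3 lightDir)
{
    Vec3 l = Normalize(lightDir);
    float ndotl = n.x * l.x + n.y * l.y + n.z * l.z;
    return std::max(0.0f, ndotl);
}
```

A flat patch of depth values produces a normal pointing straight at the camera, so any variation in brightness comes purely from changes in surface shape, which is exactly the contrast the solid fill mode was missing.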
There is some pretty interesting information to be seen in these images. Here is a quick roundup of what I noticed:
- When the depth image pixels remain valid (instead of flickering to a no-reading state), they are remarkably steady. The areas of the scene that don't move and are made of suitable materials are rock solid.
- Some materials are not suitable for use with the Kinect - take a look at the metallic door handle in the color image, then see if you can find it in the depth or 3D reconstruction... The window also produces a strange effect, with patches of it producing invalid data... Glass and shiny metal are not good candidates!
- I was also surprised at the precision of the output depth. There is lots of detailed information in that reconstruction...
These results provide a fairly good basis to build on in the future. The current rendering techniques are all performed live, and hence don't cache any information or build any models of the scene. I'm planning to move the processing of the data to the compute shader, where I will be able to do more sophisticated computations. The results of that processing can then be rendered after the fact, reused in future frames, or stored to disk. The coming weeks will be quite interesting indeed, and I look forward to sharing more progress as it comes along.