Kinect usermap smoothing using hq4x

Hi, right now I'm doing post-processing on the Kinect's depth map. This is for a game, so everything needs to run in real time.
The game itself will be in HD and, as you can guess, the problem is that the Kinect's depth map has a low resolution, which won't look good. So I decided to use hq4x to smooth the Kinect's depth map, since this filter works very well on binary images.

Here are my steps (a rough code sketch of the pipeline follows the list):

  1. My depth map resolution is only 320x240 (a sufficient resolution for gesture tracking; higher resolutions may cause performance problems)
  2. Segment the user's body only (this is very easy using the user mask from OpenNI)
  3. I'm rendering into a texture buffer, and for speed the buffer should be a power of two, which is 512x512 (I don't want to crop my depth map, so 512 is chosen as the next power of two above 320x240)
  4. The 320x240 depth image is downsampled to 80x60, so after hq4x the resolution is back to 320x240 and it fits in the 512x512 texture buffer
  5. Apply hq4x upsampling on the 80x60 image
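
For what it's worth, here's roughly what that pipeline looks like in code. This is a minimal sketch in Python/numpy, assuming the user mask arrives as a 240x320 binary array; `prepare_and_upscale` is a hypothetical name and the `hq4x` callable stands in for whatever hq4x binding is actually in use.

```python
import numpy as np

def prepare_and_upscale(user_mask, hq4x):
    """user_mask: 240x320 binary array (0 = background, 1 = user)."""
    h, w = user_mask.shape                            # 240, 320
    # Step 4: box-downsample to 60x80 so a 4x upscale lands back at 320x240.
    small = user_mask.reshape(h // 4, 4, w // 4, 4).mean(axis=(1, 3))
    small = (small > 0.5).astype(np.uint8)            # keep the mask binary for hqx

    big = hq4x(small)                                 # 60x80 -> 240x320, via some binding

    # Step 3: pad into the 512x512 power-of-two texture buffer.
    tex = np.zeros((512, 512), dtype=np.uint8)
    tex[:h, :w] = big
    return tex
```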

Here's my input:
in.png

And this is my output:
out.png

As you can see, hq4x does pretty well and the result is very good, but there's an issue: the resulting image still has sharp corners along the edges.

What I want looks like this:
goal.png

Currently I'm stuck on this problem and still thinking about how to improve the result. Actually, I have an idea to try morphological erosion (or dilation) first, but I need to do more research :)

Perhaps someone here has another idea :)

Thanks !
I'm not all that familiar with the hqx upsamplers, but from my limited reading it seems it's simply a matter of changing the interpolation tables. The default implementation obviously prefers to keep sharp features where possible, which is something you don't seem to want. Whether it's possible, or how easy it would be, to change the interpolation tables to prefer smoothness and still get the intended result, I have no idea, but that seems to be your main issue (you're not using it for pixel-art upscaling as it was intended).

Otherwise, depending on what kind of quality you want, nearest-neighbour upscaling followed by blurring and then thresholding to a black/white image yields quite similar results, although the output is obviously a lot more "round". The following is a quick and dirty test in Photoshop with a 4x Gaussian blur (if you upsample with bilinear rather than nearest, as I did, you get slightly better and less wobbly results).
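
In code the whole thing is only a few lines. Here's a rough sketch using OpenCV's Python bindings; `blur_threshold_upscale` is just an illustrative name, and the `scale` and `sigma` values are guesses to tune to taste:

```python
import cv2

def blur_threshold_upscale(mask, scale=4, sigma=4.0):
    """Nearest-neighbour upscale, Gaussian blur, then threshold back to a
    hard black/white silhouette. mask: uint8 image with values 0 or 255."""
    h, w = mask.shape
    big = cv2.resize(mask, (w * scale, h * scale),
                     interpolation=cv2.INTER_NEAREST)  # INTER_LINEAR wobbles less
    blurred = cv2.GaussianBlur(big, (0, 0), sigma)     # kernel size derived from sigma
    _, out = cv2.threshold(blurred, 127, 255, cv2.THRESH_BINARY)
    return out
```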

67653822.png


The technique in this paper might be of use to you: http://research.micr.../kopf/pixelart/
Here's the comparison with hq4x: http://research.micr...rison_hq4x.html
However, it's pretty expensive...

I like the blur and threshold idea from Syranide; it would be very efficient. You could even apply an anti-aliasing filter such as FXAA before the upsampling step to get smoother results.

I actually thought about that specific Microsoft article too, but I imagine it would be too jittery/quirky/erratic for real-time use, since even tiny variations in the input could introduce major changes in the output (it is mind-numbingly cool, though!). If I'm not mistaken, they even mention somewhere that it has some issues with animation.

This is not my area at all, but it seems to me that some kind of "blurring algorithm" needs to be used to keep it fluid and consistent between frames; anything that too intelligently decides on individual pixels seems like it would just cause erratic behaviour in real time.

Running FXAA before the upsampling actually seems like a really good idea, I have to say; if it works well it would remove the "wobbly and jagged" look and could actually end up looking really good...

I was going to suggest some basic algorithm for just filling various edges and gaps with grey pixels as a way to smooth out the original image and minimize the wobbly look, but it seems FXAA should just be better in every way.


Thanks for the replies; it had been almost 2 weeks without any.

I tried hqx, but it doesn't seem to work very well when the image has noise, and the Microsoft paper's method isn't real-time.

Anyway, I (almost) solved the problem using a combination of edge detection and spline curve fitting.
Here's the result:

img_962.png

The result is really good, but the frame rate dropped to 20-30 fps. The problem is that I need a method that runs fast enough for real-time use.
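
Roughly, the approach looks like this. This is only a sketch using OpenCV contours and SciPy smoothing splines, not my exact code; the `smoothing` value is just an example and needs tuning:

```python
import cv2
import numpy as np
from scipy.interpolate import splprep, splev

def spline_smooth_silhouette(mask, smoothing=500.0):
    """Trace the silhouette outline and refit it with a periodic smoothing
    B-spline, then rasterize the smoothed outline back into a filled mask."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    out = np.zeros_like(mask)
    for c in contours:
        if len(c) < 16:                       # skip tiny specks
            continue
        pts = c[:, 0, :].astype(np.float64)   # Nx2 contour points
        # Larger 'smoothing' trades silhouette detail for smoothness.
        tck, _ = splprep([pts[:, 0], pts[:, 1]], s=smoothing, per=True)
        x, y = splev(np.linspace(0, 1, len(pts)), tck)
        poly = np.stack([x, y], axis=1).astype(np.int32)
        cv2.fillPoly(out, [poly], 255)
    return out
```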

Another idea is using morphological opening (I'm still thinking about how to implement this in a shader):

img_179.png
but this one only removes a little noise around the edges.
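
On the CPU side, opening is nearly a one-liner with OpenCV; here's a sketch (the radius is a guess). As for the shader side: erosion is a min over the neighbourhood and dilation a max, so opening maps to two ping-pong passes, and with a box-shaped structuring element each pass can even be split into a horizontal and a vertical pass.

```python
import cv2

def open_mask(mask, radius=2):
    """Morphological opening: erosion followed by dilation. Removes specks
    and thin edge noise while roughly preserving the overall body shape."""
    k = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                  (2 * radius + 1, 2 * radius + 1))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, k)
```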

Actually, I don't need a super noise-removal algorithm; what I want is just an approximation of the shape, and losing detail in several parts is fine with me, since the goal is making a silhouette like this one:

silhouette.jpg

That's weird, I found your topic second from the top... I must've accidentally been on another page.

Anyway, I did a quick Photoshop blur again on your "original" image with a radius of 2 and it turned out pretty good, I think; a lot better than your second image, I'd say, which seems to remove a lot of features while not really fixing the jaggedness.

Untitled_4awdawd.png

Again, I'm not really read up on this, but it seems to me that involving any significant decision-making in the processing would ruin the real-time quality, making features appear/disappear and behave erratically, whereas blurring and similar solutions would have a more consistent and fluid look (although not as high quality when looking at individual frames). You could also get the result cheaply anti-aliased that way (if not using FXAA).


Just read your update: if you want that "continuous shape" look, it seems to me you have to give up the smaller features entirely and just fit some very rough curves over it all (though that could make it very "blobby" instead if you don't tune it carefully), or possibly just use more blur. I'm curious, though: looking at that image, I would think they don't use the buffer itself, but rather interpret the positions of the body parts and then render a human model instead.

