Single Pass DPMs under DX9 - Results and Artifacts

6 comments, last by patw 15 years, 4 months ago
I put some more time into doing dual-paraboloid maps (DPMs) in a single scene pass under DirectX 9. This builds on the work of Jason Zink (http://www.gamedev.net/columns/hardcore/dualparaboloid/).

Here is the idea: use the vertex shader to transform verts into paraboloid space, and then assign each vertex to the 'front' or 'back' area of the map. Both sides of the map are in the same texture, side by side.

Vertex shader:

   float L = length(OUT.hpos.xyz);
   
   // Flip the z in the back case
   bool isBack = (OUT.hpos.z < 0.0);
   
   OUT.isBack = (isBack ? -1.0 : 1.0);
   if(isBack)
      OUT.hpos.z = -OUT.hpos.z;
   
   OUT.hpos /= L;
The vertex is assigned to the 'back' map if it is behind the camera. 'OUT.isBack' is a float: it is '-1.0' for a vertex assigned to the back map, and '1.0' for a vertex assigned to the front map.
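To make the front/back assignment concrete, here is a small CPU-side model of the fragment above, written in Python rather than HLSL so it can be run on its own. The function name `classify_vertex` is made up for illustration, it only models the xyz components, and it assumes the input position is not the zero vector.

```python
import math

def classify_vertex(x, y, z):
    """Model of the shader fragment: pick a side, flip z, normalize xyz."""
    length = math.sqrt(x * x + y * y + z * z)  # L = length(OUT.hpos.xyz)
    is_back = z < 0.0                          # behind the camera
    side = -1.0 if is_back else 1.0            # value stored in OUT.isBack
    if is_back:
        z = -z                                 # flip the z in the back case
    return (x / length, y / length, z / length), side
```

For example, `classify_vertex(0.0, 3.0, -4.0)` lands on the back map with a positive, normalized z.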

   // Pass unmodified to pixel shader to allow it to clip properly
   OUT.pos = OUT.hpos.xyz;
   
   // Scale and offset so it shows up in the atlas properly
   OUT.hpos.xy = OUT.hpos.xy * atlasScale.xy;
   OUT.hpos.x += (isBack ? 0.5 : -0.5);
   
   return OUT;
The full xyz screen space position is passed to the pixel shader before the POSITION output is adjusted for atlasing. In this case 'atlasScale.xy' is (0.5, 1.0) because I am storing the maps side by side in the texture. The x co-ordinate is then shifted so that the triangle is drawn in the proper half of the atlas: the front map ends up in x from -1 to 0, and the back map in x from 0 to 1.

Now in the pixel shader:

   Fragout OUT;

   clip(abs(IN.isBack) - 0.999);
   clip(1.0 - length(IN.pos.xy));
   
   OUT.col = encodeShadowMap(IN.pos.z); 

   return OUT;
So, what's going on here? The value that was assigned in the vertex shader, 'OUT.isBack', is interpolated across the triangle. If all verts of the triangle are on the front map, or all on the back map, the pixel will pass because abs(IN.isBack) will be 1.0, and subtracting 0.999 still leaves a positive value. If the triangle straddles both maps, the interpolated value falls between -1.0 and 1.0 over almost all of its area, so those pixels are clipped. The second clip discards pixels that fall outside the unit circle of the paraboloid.

Here are some comparison shots of a single scene draw vs 2x scene draws creating a dual-paraboloid shadow map: http://flickr.com/photos/killerbunny/sets/72157610684857219/

Two passes: Dual Render
One pass: Single Render

You can see right away that the single-pass implementation exhibits the behavior described. If the triangle is *entirely* on the front or back map, it will render properly; otherwise it is clipped. This behavior can actually be OK for shadow maps. It depends where the light is, and what kind of things are going to come into its radius.

So those are the results so far. I have a few thoughts on improving the quality of the single-pass maps.

1. The maps could be rotated so that the "floor" area on both maps points towards the middle of the map. The shader code could then be adjusted to allow the front and back to stretch triangles to each other along that axis. This would eliminate or reduce the artifacts along one seam of the maps.

2. Hardware N-Patch. This feature seems to have gone the way of the dodo, which is too bad because I think it would actually be useful for this case. IIRC the Xbox 360 has hardware support for N-Patch tessellation, so that could actually work.
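The atlas offset and the two clip tests from this post can be modelled on the CPU. This is only a sketch in Python, not the shader itself: the helper names are hypothetical, and it uses plain barycentric weights, which ignores perspective-correct interpolation.

```python
def atlas_x(x, is_back, scale_x=0.5):
    """Scale a paraboloid-space x in [-1, 1] and shift it into its half."""
    return x * scale_x + (0.5 if is_back else -0.5)

def interpolate(values, weights):
    """Barycentric interpolation of a per-vertex value."""
    return sum(v * w for v, w in zip(values, weights))

def pixel_survives(is_back_values, weights, pos_xy):
    """Apply both clip() tests; clip(v) discards the pixel when v < 0."""
    side = interpolate(is_back_values, weights)
    if abs(side) - 0.999 < 0.0:
        return False                  # triangle straddles front and back
    x, y = pos_xy
    return 1.0 - (x * x + y * y) ** 0.5 >= 0.0  # inside the unit circle
```

With weights (0.25, 0.25, 0.5), a triangle whose verts are all on the front map survives, while one with two front verts and one back vert interpolates to 0.0 and is discarded.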
(The first post is the one where I look like I know what I'm talking about. The second post is the one where I don't [totally])

The third idea for improving the quality is kind of harebrained. What if the same vertex stream is bound twice and the second binding is offset by 1? This would give the vertex shader some information about other verts in the triangle being drawn. With this information, the output XYZ and W co-ordinates could be adjusted so that the vertex position interpolation is also being used for our own sinister plans.

Maybe I have been looking at the wrong diagrams, but I'm not entirely sure where the index buffer comes into play. I think that by the time the vertex hits the vertex shader, indexing has been performed, because otherwise we'd be executing vertex shaders on degenerate verts. This leads me to believe that if a vertex stream was bound multiple times (if that's even allowable; the docs don't say) and offset, some kind of adjacency information could be available to the vertex shader.
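As a thought experiment on the offset-stream idea, here is a tiny Python model of what each vertex-shader invocation would read. It rests on an assumption that is not confirmed anywhere in this thread: that one index is used to fetch from every bound stream, so a stream offset by one vertex returns the next vertex in the buffer. The `fetch` helper and the data are made up for illustration.

```python
def fetch(vertices, indices, offset=0):
    """Return what each invocation reads from a stream bound with an
    offset of `offset` vertices (None when it would read past the end)."""
    out = []
    for i in indices:
        j = i + offset
        out.append(vertices[j] if j < len(vertices) else None)
    return out

vertices = ["A", "B", "C", "D"]
indices = [0, 1, 2, 2, 1, 3]           # two triangles: ABC and CBD

primary = fetch(vertices, indices)      # A B C C B D
shifted = fetch(vertices, indices, 1)   # B C D D C None
```

Under that assumption, the offset binding yields the next vertex in buffer order rather than a neighbour in the triangle: vertex C of triangle ABC would see D, which is not part of that triangle.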
What I would do to reduce the artefacts along the seam is tweak the projection a little bit so that you actually render more than you need into the maps. They then overlap and you just pick the right part :-) ... it is a hack but it might work when everything is in world space.

It is always worth comparing this to cube maps. Here are a few interesting facts:
- the number of drawcalls is the same ... so you do not save on this front
- you lose memory bandwidth with cube maps because in the worst case you render everything into six maps that are probably bigger than 256x256 ... in reality you won't render six times and therefore have fewer drawcalls than dual-paraboloid maps
- the quality is much better for cube maps
- the speed difference is not that huge because dual paraboloid maps use things like texkill or alpha test to pick the right map and therefore rendering is pretty slow without Hierarchical Z.

I think both techniques are equivalent for environment maps. For shadows you might prefer cube maps; if you want to save memory, dual-paraboloid maps are the only way to go.

Tessellation unit: well, if your only platform is the 360 then this might make sense. I believe PC ATI graphics cards also give you a tessellator unit, but from here you are on your own. All the DX8 tessellation stuff was buried even before the high time of DX8 was over.
Quote:Original post by wolf
What I would do to reduce the artefacts along the seam is tweak the projection a little bit so that you actually render more than you need into the maps. They then overlap and you just pick the right part :-) ... it is a hack but it might work when everything is in world space.

That is a good suggestion - usually just biasing the clipping plane a little bit does the trick in the same fashion.

Quote:Original post by wolf
It is always worth comparing this to cube maps. Here are a few interesting facts:
- the number of drawcalls is the same ... so you do not save on this front
- you lose memory bandwidth with cube maps because in the worst case you render everything into six maps that are probably bigger than 256x256 ... in reality you won't render six times and therefore have fewer drawcalls than dual-paraboloid maps

How can you possibly have fewer draw calls for DPMs? In the worst case, there will be two draw calls per object for DPMs, and worst case for CMs is six per object - best case for both is one call per object. How could it be better for cube maps?

Quote:Original post by wolf
- the quality is much better for cube maps

If you use the same amount of memory for both, the quality difference will be pretty close (although cube maps will still be better). To make a fair comparison, you need to make sure that the two paraboloid maps use the same memory as six cube map faces...
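To put a number on the equal-memory comparison, here is a back-of-the-envelope calculation in Python. It assumes square maps and the same texel format for both techniques; the function name is made up for illustration.

```python
import math

def equal_memory_paraboloid_size(cube_face_size):
    """Edge length of each of two square paraboloid maps that together
    hold as many texels as six square cube-map faces."""
    cube_texels = 6 * cube_face_size ** 2
    return math.sqrt(cube_texels / 2)

size = equal_memory_paraboloid_size(256)  # about 443.4 texels per side
```

So a cube map with 256x256 faces holds as many texels as two paraboloid maps of roughly 443x443, that is, sqrt(3) times the cube face edge length.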

Quote:Original post by wolf
- the speed difference is not that huge because dual paraboloid maps use things like texkill or alpha test to pick the right map and therefore rendering is pretty slow without Hierarchical Z.

I think this is probably dependent on the scene that you are rendering. If there isn't a lot of overdraw then losing hi-z won't make that big of a difference. If there is a lot of overdraw, then it probably will make a big difference.

Quote:Original post by wolf
I think both techniques are equivalent for environment maps. For shadows you might prefer cube maps; if you want to save memory, dual-paraboloid maps are the only way to go.

Can you elaborate on the differences? From my perspective, the quality differences would be the same between EM and SM. If anything, I would say that EM would be better on cube maps due to the need to prevent weird texture warping from the paraboloid projection. Do you have any insight into why it would be better for SMs?
Quote:How can you possibly have fewer draw calls for DPMs? In the worst case, there will be two draw calls per object for DPMs, and worst case for CMs is six per object - best case for both is one call per object. How could it be better for cube maps?
I assume proper culling is going on ... but you are right if there are a lot of objects that need to be drawn into more than one map it might be worse with the cube map.

Quote:Can you elaborate on the differences? From my perspective, the quality differences would be the same between EM and SM. If anything, I would say that EM would be better on cube maps due to the need to prevent weird texture warping from the paraboloid projection. Do you have any insight into why it would be better for SMs?

From observing the quality differences, I just had the impression that with an environment map you can more easily overlook the error, while with a shadow it is pronounced most of the time due to the strong contrast between shadowed and lit areas.
Put some more thinking into this.

Instead of atlasing the front and back map, I switched to using an R16G16B16A16F texture, and storing front in RG, and back in BA. So the pixel shader now looks like:
   Fragout OUT;

   OUT.col = float4(IN.pos.z, IN.pos.z * IN.pos.z, 1.0, 1.0);

   if(IN.isBack < 0.0)
      OUT.col = OUT.col.zwxy;

   return OUT;

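Drawing front and back to the same area causes problems with the z-buffer, so I disabled it and set the blend operation (D3DRS_BLENDOP) to D3DBLENDOP_MIN. Because the unused channel pair is written as 1.0, the per-channel minimum keeps the nearest depth for each side independently, as long as the stored depths do not exceed 1.0. This lets the alpha blend do the work of the z-buffer (this seems somewhat scandalous).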
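A small Python model shows why a per-channel minimum can stand in for the depth test here. It is a sketch under assumptions the post does not state: depths lie in [0, 1] and the render target is cleared to 1.0 in every channel. The function names are invented for the example.

```python
def encode(z, is_back):
    """Pixel-shader output: (z, z*z) in RG for front, in BA for back."""
    col = (z, z * z, 1.0, 1.0)
    # The .zwxy swizzle moves the depth pair into BA for back fragments.
    return (col[2], col[3], col[0], col[1]) if is_back else col

def blend_min(dst, src):
    """Per-channel minimum, as a MIN blend operation would compute."""
    return tuple(min(d, s) for d, s in zip(dst, src))

def render(fragments, clear=(1.0, 1.0, 1.0, 1.0)):
    """Blend (z, is_back) fragments into one texel, in the given order."""
    texel = clear
    for z, is_back in fragments:
        texel = blend_min(texel, encode(z, is_back))
    return texel
```

Feeding the same fragments in any order gives the same texel, with the nearest front depth in RG and the nearest back depth in BA, which is the property the z-buffer would otherwise provide.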

This is a 256x256 shadow map for a light with a large radius.

Shadow Map: [image: Front and Back Omni Shadow Map]

Front: [image: Front (RG)]

Back: [image: Back (BA)]

Shadows cast by red omni light (radius < 2x size of scene): [image: Shadows cast by red omni light]

This implementation does suffer a bit more from poorly tessellated geometry, but seems to have some promise.

Apologies for the bad test art. I really need to get some good geometry that can test some more corner cases.

[Edited by - patw on December 7, 2008 5:13:10 PM]
I'd like to see this working in a somewhat simpler, clearer scene. Something like a small room with some objects like a table and chairs or whatever.

However, it looks very promising to me. I can't agree with Wolf that the same number of draw calls would be issued for both the cubemap approach and single-pass DP. What about the common case when a mesh appears in more than one face of the cubemap? This happens a lot, particularly when you are using large batches of polygons.

Also, in a cubemap you have to switch and clear render targets six times per light, and this is not cheap.

However, it would also be nice to see some simple performance and quality comparisons between your method and standard cubemaps. Are you actually getting more bang for the buck?
Quality is not up to snuff with regular DPMs right now. The seams are wider and show increased errors. Looking closely at the maps and values, it seems like polygons are curving too early; while they are still on the front map, they are curving like they are on the back map. The most likely cause for this (as far as I have come up with) is the clip step, or lack thereof.

IIRC, when a triangle crosses a clip plane, the clipper creates new triangles with verts on the plane so that the non-clipped part shows up properly. When drawing the front and back maps separately, the clip step is allowed to function properly. With this implementation, triangles are never clipped at the seam; the z-values of their back verts get wrapped around instead, producing different geometry than would be generated by rendering front and back individually.
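The difference between clipping and z-flipping can be seen on a single edge. The Python sketch below clips a segment against the plane z = 0, which is a simplification of what the hardware clipper does in homogeneous clip space; `clip_edge` is a hypothetical helper.

```python
def clip_edge(a, b):
    """Return the point where segment a-b crosses z = 0, or None."""
    za, zb = a[2], b[2]
    if (za < 0.0) == (zb < 0.0):
        return None                  # both endpoints on the same side
    t = za / (za - zb)               # parametric position of the crossing
    return tuple(pa + t * (pb - pa) for pa, pb in zip(a, b))

front = (0.0, 0.0, 1.0)
back = (2.0, 0.0, -1.0)
crossing = clip_edge(front, back)            # (1.0, 0.0, 0.0)
flipped = (back[0], back[1], -back[2])       # what the z flip produces
no_crossing = clip_edge(front, flipped)      # None
```

A real clip step would insert a vertex at (1, 0, 0), on the seam. After the z flip, the same edge never reaches z = 0, so the rasterized geometry differs from what the two-pass version draws.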

My idea is to warp the X and Y co-ordinates of vertices near the seams and push them to the edge of the paraboloid, which should come close to simulating the effect of a proper clipping step.

