
## Recommended Posts

Hi, I have been wondering about implementing parallel-split shadow mapping. Do most people apply all the shadow slices in a single pass, or use one pass for each slice? Suppose we do it in one pass. Then we have several shadow projection matrices. Usually I would transform a vertex with this matrix in the vertex shader and use the result in the pixel shader. Should I now do this for every matrix, or should I pass the vertex position to the pixel shader and transform it with the right projection matrix there? On the one hand, I then have to do a transformation for every pixel; on the other hand, I have to pass less data between the vertex and pixel shader. What is the common solution to this?

##### Share on other sites
For rendering the shadow map I do it with 1 pass per split. The only way I know of to do multiple splits per pass is to use a geometry shader to duplicate geometry that intersects multiple splits, similar to the CubeMapGS sample in the DX SDK. However, I'm not sure it's really worth the effort, since for cube maps that technique tends to be a wash performance-wise due to slow GS performance for amplification.

For generating the actual shadow occlusion, I do it with 1 pass. I just figure out the view-space depth of a fragment, then branch on that and choose the correct shadow projection matrix to apply to the fragment position so I can do the shadow map comparison.

##### Share on other sites
Hi,
I also plan to generate the shadowmap 1 pass per split.
I believe you do the shadow occlusion the same way I tried to describe. Thanks for the info.

##### Share on other sites
Quote:
 Original post by MJP: For generating the actual shadow occlusion, I do it with 1 pass. I just figure out the view-space depth of a fragment, then branch on that and choose the correct shadow projection matrix to apply to the fragment position so I can do the shadow map comparison.

I do the same thing, but from looking at ps asm output, the HLSL compiler will not generate any branch instructions in SM2. In SM3 you can get branching if the sampling of the shadow map is not part of the branch (you cannot do texture sampling inside a conditional). Also, the sampler index must be a literal number, not a loop variable. So this means that for a 4-split setup, you will be sampling shadow depth from all 4 shadow maps for every fragment, 3 results are thrown out, the relevant one is used. This is the best method I have seen so far - any way to do this more efficiently?

##### Share on other sites
Quote:
 Original post by Scoob Droolins: In SM3 you can get branching if the sampling of the shadow map is not part of the branch (you cannot do texture sampling inside a conditional).

Actually this isn't true. What you can't do is have the hardware calculate the gradients for mipmapping inside a branch, since adjacent pixels in a 2x2 quad may not take the same branch. You can get around it by explicitly calculating your gradients outside the branch using ddx() and ddy(), and then passing those in to tex2Dgrad. Or, if you're not using mipmaps (which is usually the case for sampling a shadow map), you can just specify that you want the top mip level with tex2Dlod, and then you don't need the gradients.
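A minimal sketch of both workarounds, assuming SM3-style HLSL. The sampler name `ShadowMapSampler`, the function names, and the `bInSplit` flag are hypothetical and only there for illustration; they do not come from the thread.

```hlsl
// Hypothetical sampler bound to the shadow map (name is an assumption).
sampler2D ShadowMapSampler;

// Option 1: compute the gradients outside the branch, then pass them to tex2Dgrad.
float SampleWithExplicitGradients(float2 vTexCoord, bool bInSplit)
{
    // ddx/ddy are evaluated here, where every pixel of the 2x2 quad executes.
    float2 vDdx = ddx(vTexCoord);
    float2 vDdy = ddy(vTexCoord);

    float fDepth = 1.0f;
    [branch]
    if (bInSplit)
    {
        fDepth = tex2Dgrad(ShadowMapSampler, vTexCoord, vDdx, vDdy).r;
    }
    return fDepth;
}

// Option 2: no mipmaps are needed, so request mip level 0 explicitly.
float SampleTopMip(float2 vTexCoord, bool bInSplit)
{
    float fDepth = 1.0f;
    [branch]
    if (bInSplit)
    {
        // tex2Dlod takes a float4: xy = texture coordinate, w = mip level.
        fDepth = tex2Dlod(ShadowMapSampler, float4(vTexCoord, 0.0f, 0.0f)).r;
    }
    return fDepth;
}
```

For a shadow map without a mip chain, the second form is probably the simpler choice, since it avoids the extra gradient instructions.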

Personally I don't even use dynamic branching for selecting the right shadow matrix...I just flatten the branch because this is quicker on my target platform (Xbox 360).

Quote:
 Original post by Scoob Droolins: Also, the sampler index must be a literal number, not a loop variable. So this means that for a 4-split setup, you will be sampling shadow depth from all 4 shadow maps for every fragment, 3 results are thrown out, the relevant one is used. This is the best method I have seen so far - any way to do this more efficiently?

This is true, but you don't need to have your splits as 4 separate textures. If you put them all on one large shadowmap texture, then instead of a sampler index you just need an offset for your texture coordinate. My HLSL code looks like this:

    // Determine which shadow projection matrix to use.
    // Split 0 is presumably the default, set before this loop.
    [unroll(NUM_SPLITS - 1)]
    for (int i = 1; i < NUM_SPLITS; i++)
    {
        [flatten]
        if (vPositionVS.z <= g_vClipPlanes[i].x && vPositionVS.z > g_vClipPlanes[i].y)
        {
            matLightViewProj = g_matLightViewProj[i];
            fOffset = i / (float)NUM_SPLITS;
            vColor = vSplitColors[i];
            iCurrentSplit = i;
        }
    }

    // Determine the depth of the pixel with respect to the light
    float4x4 matViewToLightViewProj = mul(g_matInvView, matLightViewProj);
    float4 vPositionLightCS = mul(vPositionVS, matViewToLightViewProj);
    float fLightDepth = vPositionLightCS.z / vPositionLightCS.w;

    // Transform from light space to shadow map texture space.
    float2 vShadowTexCoord = 0.5 * vPositionLightCS.xy / vPositionLightCS.w + float2(0.5f, 0.5f);
    vShadowTexCoord.x = vShadowTexCoord.x / NUM_SPLITS + fOffset;
    vShadowTexCoord.y = 1.0f - vShadowTexCoord.y;

    // Offset the coordinate by half a texel so we sample it correctly
    vShadowTexCoord += (0.5f / g_vShadowMapSize);

So then vShadowTexCoord would be the texture coordinate I use for sampling the shadow map, and fLightDepth would be the depth value I compare against what I sample from the shadow map.
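The thread does not show the comparison itself, so here is one way it could look, as a sketch only. The sampler name, the `g_fDepthBias` constant, and the convention that the shadow map stores post-projection depth in its red channel are all assumptions.

```hlsl
sampler2D ShadowMapSampler;   // hypothetical: atlas holding all splits side by side
float g_fDepthBias;           // hypothetical: small bias to reduce shadow acne

// Returns 1 when the pixel is lit and 0 when it is in shadow.
float ComputeShadowTerm(float2 vShadowTexCoord, float fLightDepth)
{
    // Shadow maps usually have no mip chain, so sample level 0 directly.
    float fStoredDepth = tex2Dlod(ShadowMapSampler, float4(vShadowTexCoord, 0.0f, 0.0f)).r;

    // Lit if the pixel is not farther from the light than the stored occluder.
    return (fLightDepth - g_fDepthBias <= fStoredDepth) ? 1.0f : 0.0f;
}
```

A single hard comparison like this gives aliased shadow edges; filtering such as PCF would average several of these comparisons around `vShadowTexCoord`.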

##### Share on other sites
MJP: Do you know of a way to "just" calculate the split index in the pixel shader? Maybe I should try your method, although I don't know what [flatten] does.
Also I don't really understand the way you compare vPositionVS.z (which is the pixel depth with respect to the viewer?) with the clip plane?
Could you explain the meaning of this?
if (vPositionVS.z <= g_vClipPlanes.x && vPositionVS.z > g_vClipPlanes.y)
Why won't it be true for every split that is "deeper" than the current pixel?

I saw something about the Frostbite engine, where they explain that they generate the shadow map for all splits in one pass using a texture array and geometry shaders, although I did not understand what exactly they do. Edit: I guess this is what you referred to in your first reply.

Another thing they do is use bounding-box splitting (or something like that) instead of planes. I did not really understand that either.

On a more general note: How many splits/shadowmap-size do you use? I suppose it makes sense to decide on a number of splits and a certain size and then just try to make that look good.

##### Share on other sites
Quote:
 Original post by B_old: MJP: Do you know of a way to "just" calculate the split index in the pixel shader?

Well I do calculate the index of the split in my code...I stick it in "iCurrentSplit".
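For a more direct calculation, one alternative (not the method from the code posted earlier) is to count how many split far planes the pixel lies at or beyond. This is a sketch under the assumption that the clip planes are stored as view-space Z values in a right-handed system, so they are negative and "farther" means "more negative". The function name `ComputeSplitIndex` is hypothetical.

```hlsl
#define NUM_SPLITS 4

// x = near Z, y = far Z of each split in view space (negative, right-handed).
float2 g_vClipPlanes[NUM_SPLITS];

// Counts how many split far planes the pixel lies at or beyond.
int ComputeSplitIndex(float fViewZ)
{
    int iSplit = 0;
    [unroll]
    for (int i = 0; i < NUM_SPLITS - 1; i++)
    {
        // Each comparison adds 1 once the pixel is past the far plane of split i.
        iSplit += (fViewZ <= g_vClipPlanes[i].y) ? 1 : 0;
    }
    return iSplit;
}
```

Because the last far plane is never tested, pixels beyond the final split fall into the last split instead of producing an out-of-range index.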

Quote:
 Original post by B_old: Maybe I should try your method, although I don't know what [flatten] does.

[flatten] is an HLSL attribute that tells the compiler that it shouldn't use dynamic branching. It's just a performance enhancement, it should have no effect on the result.
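To illustrate, the two hypothetical helpers below should return the same value and differ only in the code the compiler is asked to generate. The function names are made up for this sketch, and whether `[branch]` is honored depends on the shader model and compiler.

```hlsl
// Both functions compute the same result; only the requested code generation differs.
float SelectFlattened(bool bCondition, float fA, float fB)
{
    float fResult = fB;
    [flatten]   // evaluate both sides, then select; no jump instruction
    if (bCondition)
    {
        fResult = fA;
    }
    return fResult;
}

float SelectBranched(bool bCondition, float fA, float fB)
{
    float fResult = fB;
    [branch]    // request a real dynamic branch; needs hardware support (e.g. SM3)
    if (bCondition)
    {
        fResult = fA;
    }
    return fResult;
}
```

Flattening tends to pay off when the body of the `if` is cheap, as in the split selection, where it is only a few assignments.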

Quote:
 Original post by B_old: Also I don't really understand the way you compare vPositionVS.z (which is the pixel depth with respect to the viewer?) with the clip plane? Could you explain the meaning of this? if (vPositionVS.z <= g_vClipPlanes.x && vPositionVS.z > g_vClipPlanes.y) Why won't it be true for every split that is "deeper" than the current pixel?

Sorry, I probably should have explained that. What I do is I have an array of float2's in my shader, with the same number of elements as there are splits in my cascade. Like this:

    float2 g_vClipPlanes[NUM_SPLITS];

Then for each split, I set the Z value (with respect to the camera) where the split starts, and where the split ends. So the array looks something like this:

    g_vClipPlanes[0] = {-1, -25};
    g_vClipPlanes[1] = {-25, -75};
    g_vClipPlanes[2] = {-75, -200};
    g_vClipPlanes[3] = {-200, -500};

They're negative because I'm using right-handed coordinates. This is also probably why the inequality looks backwards.

Quote:
 Original post by B_old: I saw something about the Frostbite engine, where they explain that they generate the shadow map for all splits in one pass using a texture array and geometry shaders, although I did not understand what exactly they do. Edit: I guess this is what you referred to in your first reply.

Yup, that's exactly what I was talking about. Nice if you're targeting D3D10, but you can't do it in D3D9.

Quote:
 Original post by B_old: Another thing they do is use bounding-box splitting (or something like that) instead of planes. I did not really understand that either.

Hmm...I'm not totally sure either. What I do is I calculate the parallel splits, then for each split I come up with a bounding box that's aligned with the light source. Then I use the bounding box to come up with an orthographic projection. This is pretty standard though, so I'd imagine they're doing something fancier.

Quote:
 Original post by B_old: On a more general note: How many splits/shadowmap-size do you use? I suppose it makes sense to decide on a number of splits and a certain size and then just try to make that look good.

For the game I'm working on I use 4 splits, each sized 1024x1024.

Also I did make a sample a little while ago that you could look at. The only problem is that it's an extension of an earlier sample and I never got around to doing a new write-up, so the write-up in the zip is for the older sample that doesn't use PSSM. The code all works though. The other issues are that it uses XNA (which I'm not sure if you're using), and it uses deferred shadow maps which changes things a little bit. You should be able to figure out what's going on though regardless, it's pretty straightforward.

##### Share on other sites
First of all: Thanks for the input!

Right now I am determining the split by a series of if else statements (if the fragment is not in the closest split it might be in the next one and so on...). But I will try your method. Maybe it improves performance to get rid of the elses.

I can't run your sample because I am missing the XNA stuff, but that deferred shadowmapping part got me thinking.
On the one hand I really like the way it cleans up the lighting shaders. Right now I am comparing hardware shadow maps (using the depth-stencil buffer) with variance shadow maps, and the effect files are already getting ugly. On the other hand, once (if) I decide on one definite shadow technique, the deferred shadow mapping is just another overhead, isn't it?

##### Share on other sites
Quote:
 Original post by B_old: On the other hand, once (if) I decide on one definite shadow technique, the deferred shadow mapping is just another overhead, isn't it?

No, not necessarily. It can actually be a pretty big performance win, depending on what you're doing. This is because you get better pixel shader performance when you do all of the shadow mapping in one full-screen pass, because you're always utilizing all 4 pixel shader invocations in a 2x2 quad. This doesn't happen when you render your scene geometry, since triangles won't always cover all 4 pixels. It actually gets worse as your polygon count goes up and triangles get smaller and smaller. So in the end it's about whether the increased efficiency balances out the overhead of separating shadowing and lighting into two parts. For me it was a noticeable improvement on the 360, which is why I started using it in the first place. Simplifying my shaders was a nice side benefit though...it let me use PCF for the 360 and leave the option for VSMs on the PC.
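As a rough illustration of what such a full-screen pass might look like, here is a single-split sketch. It is not the sample mentioned in the thread. It assumes a depth texture that stores linear view-space depth normalized to the far plane, a per-pixel view-space ray to the far plane interpolated from the quad's corners, and hypothetical names for every sampler and constant.

```hlsl
sampler2D DepthSampler;             // hypothetical: linear depth, 0 at the eye, 1 at the far plane
sampler2D ShadowMapSampler;         // hypothetical: shadow map for a single split
float4x4 g_matViewToLightViewProj;  // hypothetical: view space to light clip space
float g_fDepthBias;                 // hypothetical: small bias to reduce shadow acne

// Pixel shader for a full-screen quad; writes the shadow term to a screen-sized target.
float4 DeferredShadowPS(float2 vTexCoord : TEXCOORD0,
                        float3 vFrustumRayVS : TEXCOORD1) : COLOR0
{
    // Reconstruct the view-space position from the stored depth.
    float fDepth = tex2D(DepthSampler, vTexCoord).r;
    float4 vPositionVS = float4(vFrustumRayVS * fDepth, 1.0f);

    // Project into the light's clip space and get the depth to compare.
    float4 vPositionLightCS = mul(vPositionVS, g_matViewToLightViewProj);
    float fLightDepth = vPositionLightCS.z / vPositionLightCS.w;

    // Clip space [-1, 1] to texture space [0, 1], flipping Y.
    float2 vShadowTexCoord = 0.5f * vPositionLightCS.xy / vPositionLightCS.w + float2(0.5f, 0.5f);
    vShadowTexCoord.y = 1.0f - vShadowTexCoord.y;

    float fStoredDepth = tex2Dlod(ShadowMapSampler, float4(vShadowTexCoord, 0.0f, 0.0f)).r;
    float fShadow = (fLightDepth - g_fDepthBias <= fStoredDepth) ? 1.0f : 0.0f;
    return float4(fShadow, fShadow, fShadow, 1.0f);
}
```

The lighting shaders then only need to sample the resulting screen-space texture, which is what keeps them independent of the shadow technique in use.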

##### Share on other sites
Sounds intriguing.
Do you use the depthstencil buffer or do you write the z-values into a separate texture?
