[CSM] Cascaded Shadow Maps split selection

6 comments, last by csisy 9 years, 9 months ago

Hi,

I'm working on CSM and I don't know which approach to choose. I'm using a geometry prepass which gives me a depth map from the camera's view, and I compute the shadows in a separate fullscreen pass (so I can't use clip planes as a solution).

1)

- create [numSplits] render targets, render each shadow map to its own buffer

- switch to shadow calculation pass

- bind every texture to the shader

- in the shader, use dynamic branching, like

if (dist < split1) { texture2D(shadowmap1, texcoord); ... }

2)

- create only one render target and draw the shadow maps as a texture atlas (left-top is the first split, right-top is the second, etc...)

- switch to shadow calculation pass

- bind the single texture

- in the shader, use dynamic branching, which calculates the texcoords where the shadow map should be sampled.

And here are my problems with both approaches. The target platform is OpenGL 2.0 (think SM2).

1)

As far as I know, dynamic branching in a shader below SM3 is a "fake" solution: the hardware computes every branch and selects the result afterwards. Computing the shadows for every split and making the decision later won't be fast, especially since I'm using PCF and the instruction count in SM2 is limited. :)

2)

With 4 splits and 1024x1024 shadow maps, the atlas would be 2048x2048, and that may be the best case. Imagine 2048x2048 shadow maps, which would need a 4096x4096 texture.
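The size arithmetic above can be checked with a small sketch. It assumes the splits are packed into the smallest square grid of equally sized square tiles; the function names `tiles_per_side` and `atlas_side` are made up for illustration.

```c
/* Smallest number of tiles per side so that a square grid holds numSplits tiles. */
int tiles_per_side(int numSplits)
{
    int n = 1;
    while (n * n < numSplits)
        ++n;
    return n;
}

/* Side length, in texels, of a square atlas built from square tiles
   (assumption: every split uses the same tile size). */
int atlas_side(int tileSize, int numSplits)
{
    return tileSize * tiles_per_side(numSplits);
}
```

For example, `atlas_side(1024, 4)` gives 2048 and `atlas_side(2048, 4)` gives 4096, matching the numbers in the post.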

However, the second solution still looks more viable. I'm also not sure about texture arrays in OpenGL 2: are they available?

Thanks,

csisy

sorry for my bad english

It's been a little while since I've done CSM, but I did the texture atlas approach and then to avoid the branch I would pass in the tile size as a uniform and divide the tex coord by the tile width and height to determine what tile to sample from. Then I would have a uniform that determined the depth range between the two cascades that I would want to lerp between, sample from both maps and lerp based on the distance to the split.

I'm a little hazy on the details, so I apologize for the vague wording, but I hope you get the gist of what I'm saying.
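A rough CPU-side sketch of the lerp between two cascades described above, written in C so it can be tested outside a shader. The placement of the blend band (ending exactly at the split distance) and all names here are assumptions, not the poster's actual code; `clampf` and `mixf` stand in for GLSL `clamp()` and `mix()`.

```c
/* Clamp x to [lo, hi], like GLSL clamp(). */
float clampf(float x, float lo, float hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Linear interpolation, like GLSL mix(). */
float mixf(float a, float b, float t)
{
    return a + (b - a) * t;
}

/* Weight of the farther cascade: 0 before the blend band,
   rising linearly to 1 at the split distance.
   Assumption: the band covers [splitDist - blendRange, splitDist]. */
float cascade_blend_weight(float dist, float splitDist, float blendRange)
{
    return clampf((dist - (splitDist - blendRange)) / blendRange, 0.0f, 1.0f);
}

/* Sample both cascades (values passed in here) and lerp between them. */
float blended_shadow(float nearShadow, float farShadow,
                     float dist, float splitDist, float blendRange)
{
    float t = cascade_blend_weight(dist, splitDist, blendRange);
    return mixf(nearShadow, farShadow, t);
}
```

In a shader the two shadow values would come from two samples of the atlas, so this trades the branch for an extra texture fetch per pixel.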

Perception is when one imagination clashes with another

Then it seems the atlas approach can work. Good news :)

I think in this case the branching wouldn't be so bad, because I only have to compute the texcoord offsets. Or I could compute just an index and use a uniform array of texcoord offsets, since the atlas coordinate can be calculated as texcoord * 0.5 + offset.
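As a minimal sketch of the texcoord * 0.5 + offset idea, here is a C model of the remapping for a 2x2 atlas. The offset table is an assumption: it places the first split top-left and the second top-right, with a bottom-left texture origin as in OpenGL; `Vec2`, `kOffsets` and `atlas_texcoord` are illustrative names.

```c
typedef struct { float x, y; } Vec2;

/* Tile offsets in a 2x2 atlas (assumed layout: split 0 top-left,
   split 1 top-right, split 2 bottom-left, split 3 bottom-right;
   texture origin at the bottom-left corner). */
const Vec2 kOffsets[4] = {
    { 0.0f, 0.5f },
    { 0.5f, 0.5f },
    { 0.0f, 0.0f },
    { 0.5f, 0.0f }
};

/* Map a [0,1] shadow-map coordinate into the tile of the given split. */
Vec2 atlas_texcoord(Vec2 texcoord, int split)
{
    Vec2 r;
    r.x = texcoord.x * 0.5f + kOffsets[split].x;
    r.y = texcoord.y * 0.5f + kOffsets[split].y;
    return r;
}
```

With this table, the centre of split 1's shadow map, (0.5, 0.5), lands at (0.75, 0.75) in the atlas.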

Thanks for your reply, I'll try this way. :)


Now it works fine. I pass the texture offsets, matrices and split distances as uniform arrays. In the fragment shader, I get the proper id (with an if check in a for loop), then use this id to select the proper values from the arrays.

Something like this:


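#define NumSplits 4

// Far distance of each split, in the same units as farPlane
uniform     float       distances[NumSplits];
// Shadow matrix of each split
uniform     mat4        shadowMats[NumSplits];
// Offset of each split's tile inside the atlas
uniform     vec2        offsets[NumSplits];

// Scale the stored depth back to a distance
float dist = depthVal * farPlane;
int id;

// Find the first split whose far distance lies beyond this pixel
for (id = 0; id < NumSplits; ++id)
{
    if (dist < distances[id])
        break;
}

[...]

// Move the coordinate into the selected tile of the atlas
shadow.xy += offsets[id];

float shadowVal = 0.0;
// step(edge, x) returns 1.0 when x >= edge
shadowVal += step(shadow.z, texture2D(textShadowMap, shadow.xy).r);
[...]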

And the result:

[image: 27xl1t4.jpg]

Thanks to Katarina (by Riot) :D

[EDIT]
However, this solution works only if the max shadow distance == camera far plane distance. If not, the loop can finish without a match and the shader will try to access the arrays with an invalid id. It can be avoided with an extra if, like:


if (id == NumSplits)
{
    gl_FragData[0] = vec4(1.0);
    return;
}
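
To see where the invalid id comes from, the selection loop can be mirrored on the CPU. This is only a test model in C, not shader code; `select_split` and `NUM_SPLITS` are illustrative names.

```c
#define NUM_SPLITS 4

/* Returns the index of the first split whose far distance is greater
   than dist, or NUM_SPLITS when dist lies beyond the last split
   (the case the extra if has to catch). */
int select_split(float dist, const float distances[NUM_SPLITS])
{
    int id;
    for (id = 0; id < NUM_SPLITS; ++id) {
        if (dist < distances[id])
            break;
    }
    return id;
}
```

With split distances of, say, 10, 30, 80 and 200, a pixel at distance 250 yields 4, which is one past the last valid array index.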

I find that approach restrictive.

I use 2 render targets to handle CSM: one for the current split (say 4096x4096) and a screen-size buffer in which I accumulate all shadows.

for_each_split
{
     draw_to_shadow_rt
     shadows_buffer += shadow_rt
}

This way I can have an arbitrary number of splits/cascades.

4 splits: [image: splits4.jpg]

9 splits: [image: splits9.jpg]

And how do you do the following part? Simply alpha blending?

shadows_buffer += shadow_rt

That was my first idea too, but I'm not sure how I can "merge" the calculated shadows. Blending has a high cost, doesn't it? By the way, I'm fine with 4 splits and 1024x1024 textures for now, but I agree with you that this approach is restrictive.


I use additive blending for accumulation (srcblend = one, destblend = one). I also reuse the same shadow_rt for the rest of the lights (spot, point), and I do the lighting in the same pass (deferred rendering).
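
A small C model of what one/one blending does to a single channel, as a sketch only. In OpenGL the same state would be set with `glBlendFunc(GL_ONE, GL_ONE)`; the clamp to 1.0 assumes a normalized fixed-point render target, and the function names are made up.

```c
/* One channel of blending with source factor ONE and destination
   factor ONE: result = src + dst, clamped to 1.0 (assumption: the
   target is a normalized fixed-point format). */
float blend_additive(float src, float dst)
{
    float sum = src + dst;
    return sum > 1.0f ? 1.0f : sum;
}

/* Accumulate the contribution of every split pass into one pixel,
   starting from a cleared (0.0) buffer. */
float accumulate_splits(const float *contributions, int numSplits)
{
    float pixel = 0.0f;
    int i;
    for (i = 0; i < numSplits; ++i)
        pixel = blend_additive(contributions[i], pixel);
    return pixel;
}
```

If each pixel is drawn by exactly one split pass, the accumulated value is simply that pass's result, which is why the number of splits is not limited by the shader.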

Edit, to avoid confusion: I pass shadow_rt as a texture when I draw to shadows_buffer, which merges the shadows and does the lighting.

Edit 2: it might be worth mentioning that in the CSM pass I use the clip function to get correct results where the splits overlap. It might also be faster than branching?

// at the top of the pixel shader
float4 posVS = tex2D(gbPosition_samp, IN.TexCoord0); // view space position

clip(posVS.z - SplitDist.x);
clip(SplitDist.y - posVS.z);

I'll try this solution and check which one is faster for 4 splits. The clip() function is not available in GLSL, but it can be emulated with an if check that calls discard when the condition is true.
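
The two clip() calls can be modelled in C to check which depths survive. This is a sketch of the test only: in GLSL each call would become something like `if (x < 0.0) discard;`. The names `clip_discards` and `pixel_in_split` are illustrative, and `splitNear`/`splitFar` stand for SplitDist.x and SplitDist.y.

```c
/* HLSL clip(x) discards the pixel when x is less than zero. */
int clip_discards(float x)
{
    return x < 0.0f;
}

/* Returns 1 when a pixel at view-space depth z survives both clip()
   calls, i.e. when splitNear <= z <= splitFar. */
int pixel_in_split(float z, float splitNear, float splitFar)
{
    return !clip_discards(z - splitNear) && !clip_discards(splitFar - z);
}
```

Note that both boundaries are inclusive here, so a pixel lying exactly on a shared split distance would pass in two adjacent splits.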

I'm also using deferred shading. However, I won't compute shadows and lighting in one pass, since the SM2 instruction count is limited and the combined shading and shadow calculation would hit that limit.
