### #1csisy  Members


Posted 25 July 2014 - 11:14 AM

Hi,

I'm working on CSM (cascaded shadow maps) and I don't know which approach I should choose. I'm using a geometry prepass which gives me a depth map from the camera's view, and I compute the shadows in a separate fullscreen pass (so I can't use clip planes as a solution).

1)

- create [numSplits] render targets and render each shadow map into its own buffer

- switch to shadow calculation pass

- bind every texture to the shader

- in the shader, use dynamic branching, like

if (dist < split1) { texture2D(shadowmap1, texcoord); ... }

2)

- create only one render target and draw the shadow maps into it as a texture atlas (top-left is the first split, top-right is the second, etc.)

- switch to shadow calculation pass

- bind the single atlas texture

- in the shader, use dynamic branching, which calculates the texcoords where the shadow map should be sampled.

Here are my problems with both approaches. The target platform is OpenGL 2.0 (think SM2).

1)

As far as I know, dynamic branching in shaders below SM3 is a "fake" solution: every branch is computed and the result is selected afterwards. Computing the shadows for each split and making the decision later won't be fast, especially since I'm using PCF and the instruction count in SM2 is limited.

2)

With 4 splits and 1024x1024 shadow maps, the atlas texture would be 2048x2048, and that may be the best case. With 2048x2048 shadow maps the atlas would be 4096x4096.
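
The arithmetic above can be checked with a small sketch. It is written in Python rather than GLSL so it can be run on the CPU, and it assumes a square atlas of square tiles; the function names and the 4 bytes per texel are assumptions for illustration, not something from the original post.

```python
import math


def atlas_layout(num_splits, map_size):
    """Return (tiles_per_side, atlas_size) for a square atlas of square tiles."""
    # Smallest grid side whose square holds num_splits tiles (integer-only ceil(sqrt(n))).
    tiles_per_side = math.isqrt(num_splits - 1) + 1
    return tiles_per_side, tiles_per_side * map_size


def atlas_bytes(atlas_size, bytes_per_texel=4):
    """Approximate memory footprint; 4 bytes per texel is an assumed format."""
    return atlas_size * atlas_size * bytes_per_texel


if __name__ == "__main__":
    for size in (1024, 2048):
        tiles, atlas = atlas_layout(4, size)
        print(size, "->", atlas, "x", atlas, "=", atlas_bytes(atlas), "bytes")
```

Note that a split count that is not a perfect square (say 5) leaves unused tiles in the atlas under this layout.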

However, the 2nd solution still looks more viable. But I'm not sure about texture arrays in OpenGL 2: are they available?

Thanks,

csisy

Edited by csisy, 25 July 2014 - 11:15 AM.

### #2Seabolt  Members


Posted 25 July 2014 - 03:35 PM

It's been a little while since I've done CSM, but I did the texture atlas approach and then to avoid the branch I would pass in the tile size as a uniform and divide the tex coord by the tile width and height to determine what tile to sample from. Then I would have a uniform that determined the depth range between the two cascades that I would want to lerp between, sample from both maps and lerp based on the distance to the split.

I'm a little hazy on the details, so I apologize for the vague wording, but I hope you get the gist of what I'm saying.
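
One way to read the cascade blending described above, as a rough Python sketch of the per-fragment math. The parameter names (`split_far`, `blend_range`) and the linear ramp are assumptions for illustration; the post does not say exactly how the blend range was defined.

```python
def blend_weight(dist, split_far, blend_range):
    """0.0 means only the current cascade, 1.0 means only the next one.

    The weight ramps linearly over the last `blend_range` units before the
    split boundary at `split_far`. blend_range must be greater than zero.
    """
    start = split_far - blend_range
    t = (dist - start) / blend_range
    return min(max(t, 0.0), 1.0)  # clamp, like GLSL clamp(t, 0.0, 1.0)


def lerp(a, b, t):
    """Linear interpolation, equivalent to GLSL mix(a, b, t)."""
    return a + (b - a) * t


def blended_shadow(shadow_current, shadow_next, dist, split_far, blend_range):
    """Blend the shadow terms sampled from two neighbouring cascades."""
    return lerp(shadow_current, shadow_next,
                blend_weight(dist, split_far, blend_range))
```

The cost is a second shadow map lookup for every fragment, which matters with PCF on hardware with a tight instruction budget.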


### #3csisy  Members


Posted 26 July 2014 - 01:32 AM

> It's been a little while since I've done CSM, but I did the texture atlas approach and then to avoid the branch I would pass in the tile size as a uniform and divide the tex coord by the tile width and height to determine what tile to sample from. Then I would have a uniform that determined the depth range between the two cascades that I would want to lerp between, sample from both maps and lerp based on the distance to the split.
>
> I'm a little hazy on the details, so I apologize for the vague wording, but I hope you get the gist of what I'm saying.

Then it seems the atlas approach can work. Good news.

I think in this case the branching wouldn't be so bad, because I only have to compute the texcoord offsets. Or I could compute just an index and use a uniform array of texcoord offsets, since the atlas coordinate can be calculated as: texcoord * 0.5 + offset
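
A minimal sketch of that remapping, in Python so it can be checked on the CPU. The helper names are hypothetical, and the row-major tile order is an assumption: which corner counts as "first" depends on the texture origin convention in use.

```python
def tile_offsets(tiles_per_side=2):
    """Offsets of each tile's origin in atlas UV space, in row-major order."""
    scale = 1.0 / tiles_per_side
    return [(col * scale, row * scale)
            for row in range(tiles_per_side)
            for col in range(tiles_per_side)]


def atlas_uv(uv, split_id, tiles_per_side=2):
    """Map a [0,1] shadow map coordinate into the tile of the given split.

    For a 2x2 atlas this is exactly texcoord * 0.5 + offset.
    """
    scale = 1.0 / tiles_per_side
    ox, oy = tile_offsets(tiles_per_side)[split_id]
    return (uv[0] * scale + ox, uv[1] * scale + oy)


if __name__ == "__main__":
    for split_id in range(4):
        print(split_id, atlas_uv((0.5, 0.5), split_id))
```

In the shader the offsets would be uploaded once as a uniform array, so the per-fragment work is one multiply-add per coordinate. PCF taps near a tile edge can read from the neighbouring tile, so some clamping or padding is usually needed.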

### #4csisy  Members


Posted 26 July 2014 - 06:17 AM

Now it works fine. I pass the texture offsets, matrices and split distances as uniform arrays. In the fragment shader, I get the proper id (with an if check in a for loop), then use this id to select the proper values from the arrays.

Something like this:

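    #define NumSplits 4
    uniform float distances[NumSplits]; // far distance of each split, in ascending order
    uniform vec2  offsets[NumSplits];   // atlas texcoord offset of each split

    // view distance reconstructed from the normalized depth value
    float dist = depthVal * farPlane;
    int id;

    // find the first split whose far distance lies beyond this fragment
    for (id = 0; id < NumSplits; ++id)
    {
        if (dist < distances[id])
            break;
    }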

[...]

[...]

And the result:

Thanks to Katarina (by Riot)

[EDIT]
However, this solution works only if the max shadow distance == camera far plane distance. If not, the loop can finish without a match and the shader will try to access the arrays with an invalid id. This can be avoided with an extra if, like:

    if (id == NumSplits)
    {
        gl_FragData[0] = vec4(1.0);
        return;
    }
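
The selection loop and the out-of-range guard can be mirrored on the CPU to check the boundary cases. This is only a Python sketch of the same logic; `select_split` and the sample distances are made up for illustration.

```python
def select_split(dist, distances):
    """Return the index of the first split whose far distance exceeds dist.

    Returns None when dist lies beyond the last split, which corresponds to
    id == NumSplits after the loop in the shader.
    """
    for split_id, far in enumerate(distances):
        if dist < far:
            return split_id
    return None


def shadow_term(dist, distances, shadow_per_split):
    """Fully lit (1.0) beyond the shadow distance, else the split's value."""
    split_id = select_split(dist, distances)
    if split_id is None:
        return 1.0  # same effect as writing vec4(1.0) and returning early
    return shadow_per_split[split_id]


if __name__ == "__main__":
    distances = [10.0, 30.0, 80.0, 200.0]  # hypothetical split far distances
    for d in (5.0, 10.0, 199.9, 250.0):
        print(d, select_split(d, distances))
```

A fragment exactly on a split boundary falls into the farther split, because the comparison is a strict less-than.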

Edited by csisy, 26 July 2014 - 06:23 AM.

### #5belfegor  Members


Posted 26 July 2014 - 10:12 AM

I find that approach restrictive.

I use 2 render targets to handle CSM: one for the current split's shadow map (say 4096x4096) and a screen-size buffer in which I accumulate all shadows.

for_each_split
{
}


This way I can have an arbitrary number of splits/cascades.

[Screenshots: 4 splits, 9 splits]

### #6csisy  Members


Posted 26 July 2014 - 10:32 AM

And how do you do the following part? Simply alpha blending?

I ask because this was my first idea too, but I wasn't sure how I could "merge" the calculated shadows. Blending has a high cost, doesn't it? By the way, I'm okay with 4 splits and 1024x1024 textures for now, but I agree with you that this approach is restrictive.

### #7belfegor  Members


Posted 26 July 2014 - 11:01 AM

I use additive blending for accumulation (srcblend = one, destblend = one). I also reuse the same shadow_rt for the rest of the lights (spot, point), and I do the lighting in the same pass (deferred rendering).

edit: to avoid confusion

> And how do you do the following part? Simply alpha blending?

I pass shadow_rt as a texture when I draw to shadow_buffer, to merge the shadows and do the lighting.

edit2: might be worth mentioning that in the CSM pass I use the clip function to get a correct result where splits overlap. It might also be faster than branching?

    // at the top of the pixel shader
    float4 posVS = tex2D(gbPosition_samp, IN.TexCoord0); // view space position

    // discard pixels outside the current split's depth range
    clip(posVS.z - SplitDist.x);
    clip(SplitDist.y - posVS.z);


Edited by belfegor, 26 July 2014 - 12:12 PM.

### #8csisy  Members


Posted 26 July 2014 - 03:25 PM

I'll try this solution and check which one is faster for 4 splits. The clip() function is not available in GLSL, but it can be emulated with an if check that calls discard when the argument is negative.
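
To make the clip-then-accumulate idea concrete, here is a CPU-side Python sketch of one pixel going through the per-split passes. The function names and sample ranges are assumptions for illustration; in GLSL the check would be an `if (value < 0.0) discard;` inside the fragment shader.

```python
def clip_discards(value):
    """HLSL clip(x) discards the pixel when x is negative."""
    return value < 0.0


def split_pass(z, split_near, split_far, shadow_value):
    """One per-split pass for one pixel: None means the pixel was discarded."""
    if clip_discards(z - split_near) or clip_discards(split_far - z):
        return None
    return shadow_value


def accumulate(z, splits, shadow_values):
    """Sum the surviving pass outputs, like additive blending (ONE, ONE)."""
    total = 0.0
    for (near, far), value in zip(splits, shadow_values):
        out = split_pass(z, near, far, value)
        if out is not None:
            total += out
    return total


if __name__ == "__main__":
    splits = [(0.0, 10.0), (10.0, 30.0), (30.0, 80.0)]  # hypothetical ranges
    values = [0.25, 0.5, 0.75]
    for z in (5.0, 20.0, 10.0, 100.0):
        print(z, accumulate(z, splits, values))
```

With these inclusive tests a pixel exactly on a shared boundary (z = 10.0 here) survives both neighbouring passes and is added twice, so one of the two comparisons would need to be made strict if that matters.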

I'm using deferred shading as well. However, I won't compute shadows and lighting in one pass: the SM2 instruction count is limited, and shading plus shadow calculation together would hit that limit.