OpenGL [CSM] Cascaded Shadow Maps split selection

Hi,

 

I'm working on CSM and I don't know which approach to choose. I'm using a geometry prepass that gives me a depth map from the camera's view, so I compute the shadows in a separate fullscreen pass (which means I can't use clip planes as a solution).

 

1)

- create [numSplits] render targets and render each shadow map into its own buffer

- switch to shadow calculation pass

- bind every texture to the shader

- in the shader, use dynamic branching, like

if (dist < split1) { texture2D(shadowmap1, texcoord); ... }

 

2)

- create only one render target and draw the shadow maps into it as a texture atlas (top-left is the first split, top-right is the second, etc.)

- switch to shadow calculation pass

- bind that single texture

- in the shader, use dynamic branching to calculate the texcoords at which the shadow map should be sampled

 

And here are my problems with both approaches. The target platform is OpenGL 2.0 (think SM2).

 

1)

As far as I know, dynamic branching in a shader below SM3 is a "fake" solution: every branch is computed and the result is selected afterwards. Computing the shadows for each split and only then making the decision won't be fast, especially since I'm using PCF and the instruction count in SM2 is limited. :)

 

2)

With 4 splits and 1024x1024 shadow maps, the atlas would be 2048x2048, and that may be the best case: with 2048x2048 shadow maps it would need a 4096x4096 texture.

 

Still, the second solution looks more viable. I'm not sure about texture arrays in OpenGL 2, though: are they available?

 

Thanks,

csisy

Edited by csisy


It's been a little while since I've done CSM, but I used the texture atlas approach. To avoid the branch, I passed in the tile size as a uniform and divided the tex coord by the tile width and height to determine which tile to sample from. I also had a uniform holding the depth range between two cascades that I wanted to lerp across; I sampled from both maps and lerped based on the distance to the split.

I'm a little hazy on the details, so I apologize for the vague wording, but I hope you get the gist of what I'm saying.
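
The cross-cascade blend described above could look roughly like the following. This is only a CPU-side sketch in Python to illustrate the arithmetic; in practice it would live in the fragment shader. The names blend_weight, blend_range and blended_shadow are hypothetical, and the linear ramp ending exactly at the split distance is an assumption, not necessarily what the poster used.

```python
def blend_weight(dist, split_dist, blend_range):
    """Weight of the *next* cascade (assumed linear ramp ending at the split).

    Returns 0.0 before the blend band starts and 1.0 at or beyond the split.
    """
    if blend_range <= 0.0:
        # No blend band: hard switch at the split distance.
        return 0.0 if dist < split_dist else 1.0
    t = (dist - (split_dist - blend_range)) / blend_range
    return min(max(t, 0.0), 1.0)  # clamp to [0, 1]


def lerp(a, b, t):
    """Linear interpolation, same meaning as GLSL mix(a, b, t)."""
    return a + (b - a) * t


def blended_shadow(shadow_near, shadow_far, dist, split_dist, blend_range):
    """Blend the shadow terms sampled from two neighbouring cascades."""
    return lerp(shadow_near, shadow_far,
                blend_weight(dist, split_dist, blend_range))
```

Both cascades have to be sampled inside the blend band, so the band width trades a smoother transition against extra texture fetches.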


Then it seems the atlas approach can work. Good news :)

 

I think branching wouldn't be so bad in this case, because I only have to compute the texcoord offsets. Or I could compute just an index and use a uniform array of texcoord offsets, since the atlas coordinate can be calculated as: texcoord * 0.5 + offset

 

Thanks for your reply, I'll try this way. :)
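
For reference, the atlas arithmetic from this thread (4 splits of 1024 giving a 2048x2048 atlas, and texcoord * 0.5 + offset) can be worked through on the CPU. The Python below is only an illustrative sketch: atlas_layout and to_atlas are hypothetical helper names, the square grid is an assumption, and so is the tile order (tile 0 at offset (0, 0), filled row by row), which may not match the "top-left first" layout in an actual renderer.

```python
import math


def atlas_layout(num_splits, tile_size):
    """Return (atlas_size, scale, offsets) for a square grid of tiles.

    Assumes tiles are laid out row by row starting at texcoord (0, 0).
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    tiles_per_side = math.isqrt(num_splits - 1) + 1  # ceil(sqrt(num_splits))
    scale = 1.0 / tiles_per_side
    offsets = [((i % tiles_per_side) * scale, (i // tiles_per_side) * scale)
               for i in range(num_splits)]
    return tiles_per_side * tile_size, scale, offsets


def to_atlas(texcoord, scale, offset):
    """Map a [0, 1] shadow-map texcoord into its tile: texcoord * scale + offset."""
    return (texcoord[0] * scale + offset[0],
            texcoord[1] * scale + offset[1])
```

With 4 splits the scale comes out as 0.5, which is the texcoord * 0.5 + offset form above; the offsets list is what would be uploaded as the uniform array.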


Now it works fine. I pass the texture offsets, matrices and split distances as uniform arrays. In the fragment shader, I get the proper id (with an if check in a for loop), then use this id to select the proper values from the arrays.

 

Something like this:

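#define NumSplits 4
uniform     float       distances[NumSplits];   // split distance of each split
uniform     mat4        shadowMats[NumSplits];  // matrix of each split
uniform     vec2        offsets[NumSplits];     // atlas texcoord offset of each split


float dist = depthVal * farPlane;
int id;


// find the first split whose distance lies beyond this fragment
for (id = 0; id < NumSplits; ++id)
{
    if (dist < distances[id])
        break;
}


[...]


// move the texcoord into the atlas tile of the selected split
shadow.xy += offsets[id];


float shadowVal = 0.0;
shadowVal += step(shadow.z, texture2D(textShadowMap, shadow.xy).r);
[...]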

And the result:

[image: 27xl1t4.jpg]

Thanks to Katarina (by Riot) :D

 

[EDIT]
However, this solution only works if the max shadow distance equals the camera far plane distance. Otherwise, for fragments beyond the last split the loop ends with id == NumSplits, and the shader will try to access the arrays with an invalid id. This can be avoided with an extra if, like:

if (id == NumSplits)
{
    gl_FragData[0] = vec4(1.0);
    return;
}
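
The selection loop plus the out-of-range check can be mirrored on the CPU to test a set of split distances. This is a Python sketch only: select_split is a hypothetical name, and returning None stands in for the early return that writes vec4(1.0) (fully lit) in the shader.

```python
def select_split(dist, distances):
    """Return the index of the first split whose distance exceeds dist.

    Mirrors the shader loop: the comparison is strict, so a fragment lying
    exactly on a split distance falls into the next split. Returns None when
    dist is beyond the last split (the id == NumSplits case).
    """
    for split_id, split_dist in enumerate(distances):
        if dist < split_dist:
            return split_id
    return None
```

For example, with distances [10.0, 30.0, 80.0, 200.0], a fragment at distance 250.0 gets None and should be treated as unshadowed rather than used as an array index.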
Edited by csisy


I find that approach restrictive.

I use 2 render targets to handle CSM: one for the current split (say 4096x4096) and a screen-size buffer in which I accumulate all shadows.

for_each_split
{
     draw_to_shadow_rt
     shadows_buffer += shadow_rt
}

 

This way I can have an arbitrary number of splits/cascades.

 

4 splits: [image: splits4.jpg]

9 splits: [image: splits9.jpg]


And how do you do the following part? Simply alpha blending?

shadows_buffer += shadow_rt

 

This was my first idea too, but I wasn't sure how to "merge" the calculated shadows. Blending has a high cost, doesn't it? By the way, I'm fine with 4 splits and 1024 textures for now, but I agree with you that this approach is restrictive.


I use additive blending for accumulation (srcblend = one & destblend = one). I also reuse the same shadow_rt for the rest of the lights (spot, point), and I do the lighting in the same pass (deferred rendering).

 

Edit, to avoid confusion about the "shadows_buffer += shadow_rt" step: I pass shadow_rt as a texture when I draw to the shadows buffer, to merge and do the lighting.

 

Edit 2: it might be worth mentioning that in the CSM pass I use the clip function to get a correct result despite the overlapping. It might also be faster than branching?

// at the top of the pixel shader
float4 posVS = tex2D(gbPosition_samp, IN.TexCoord0); // view space position

// clip() discards the pixel if its argument is negative,
// so only pixels with SplitDist.x <= posVS.z <= SplitDist.y survive
clip(posVS.z - SplitDist.x);
clip(SplitDist.y - posVS.z);
Edited by belfegor


I'll try this solution and check which one is faster for 4 splits. The clip() function is not available in GLSL, but it can be emulated with an if check that calls discard when the value is negative.
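
To make the per-split pass logic concrete, here is a CPU-side Python sketch of the two clip tests combined with additive accumulation for a single pixel. The names clip_discards and accumulate_shadows are hypothetical, and treating each pass's output as one scalar term per split is an assumption made for illustration.

```python
def clip_discards(value):
    """Mirror of HLSL clip(): the pixel is discarded when value is negative.

    The GLSL emulation is: if (value < 0.0) discard;
    """
    return value < 0.0


def accumulate_shadows(view_z, split_ranges, shadow_terms):
    """Sum the per-split terms that survive the two clip tests for one pixel.

    split_ranges holds (near, far) pairs, like SplitDist.x / SplitDist.y.
    A pixel lying exactly on a shared boundary passes both neighbouring
    passes, because clip() only rejects strictly negative values.
    """
    accumulated = 0.0
    for (near, far), term in zip(split_ranges, shadow_terms):
        if clip_discards(view_z - near) or clip_discards(far - view_z):
            continue  # this pass writes nothing for the pixel
        accumulated += term  # additive blend: dest = src * 1 + dest * 1
    return accumulated
```

Away from the boundaries every pixel is written by exactly one pass, which is why plain additive blending is enough to merge the splits.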

 

I'm using deferred shading too. However, I won't compute shadows and lighting in one pass: the SM2 instruction count is limited, and the shading and shadow calculations together would hit that limit.
