
SDSM using Intel sample or simplified?




SDSM (Sample Distribution Shadow Maps) is the new standard for directional light shadow mapping.

Is it better to implement it like the Intel D3D11 sample, or to use a simplified version?

Matt Pettineo has a sample which uses a simplified version: http://mynameismjp.wordpress.com/2013/09/10/shadow-maps/

Matt Pettineo's version requires configuring a value by hand, whereas the Intel version is fully automatic.

Thanks for the help

Edited by Alundra


SDSM is basically two things:

* Find the best cascade splits from the depth buffer.

* Find the tightest frustum for the shadow camera by reconstructing the world-space position of each pixel and transforming it into light space (the inverse transform of the light matrix).
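
To make the second point concrete, here is a minimal CPU-side sketch in Python of the bounding step, assuming the world-space positions have already been reconstructed from the depth buffer. The names `transform_point` and `light_space_bounds` are made up for illustration, and `light_view` is assumed to be a row-major 4x4 world-to-light matrix. A real implementation would do this as a GPU reduction, not a Python loop.

```python
def transform_point(m, p):
    """Apply a row-major 4x4 affine matrix to the point (x, y, z, 1); w is dropped."""
    x, y, z = p
    return tuple(m[r][0] * x + m[r][1] * y + m[r][2] * z + m[r][3] for r in range(3))


def light_space_bounds(world_positions, light_view):
    """Return (mins, maxs): the axis-aligned box of the points in light space.

    Assumption: light_view maps world space to light space, i.e. it is the
    inverse of the light's world matrix.
    """
    pts = [transform_point(light_view, p) for p in world_positions]
    mins = tuple(min(p[i] for p in pts) for i in range(3))
    maxs = tuple(max(p[i] for p in pts) for i in range(3))
    return mins, maxs


# Hypothetical light matrix: identity rotation, translated by +10 on x.
light_view = [
    [1.0, 0.0, 0.0, 10.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
visible = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (-1.0, 5.0, 2.0)]
mins, maxs = light_space_bounds(visible, light_view)
```

The resulting box is what you would fit the orthographic shadow projection to, once per cascade.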


Intel suggests two methods for the first part: one that computes a full depth histogram, and another that just gets the min/max of the depth buffer.

As far as I understand, Matt Pettineo only implemented the latter, and without reading the code it's not clear whether he included the tightest-frustum part.

However, it is sufficient to get automatic cascade splits (but you need to compute the range on the other axes manually from the frustum vertices).
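
For reference, the difference between the two methods can be sketched on the CPU like this. It is only an illustration in Python, not the Intel code: the function names are made up, depths are assumed to be normalized to [0, 1], and skipping pixels still at the clear value (sky) is my assumption about what you would want.

```python
def depth_min_max(depths, far_clear=1.0):
    """Min/max reduction over depth samples, ignoring pixels at the clear value."""
    visible = [d for d in depths if d < far_clear]
    if not visible:
        return 0.0, far_clear  # nothing rendered: fall back to the full range
    return min(visible), max(visible)


def depth_histogram(depths, bins, far_clear=1.0):
    """Count samples per depth bin; on the GPU each increment is an atomic add."""
    counts = [0] * bins
    for d in depths:
        if d < far_clear:
            counts[min(int(d * bins), bins - 1)] += 1
    return counts


depths = [0.1, 0.25, 0.9, 1.0, 0.5]  # 1.0 is an untouched (sky) pixel
z_range = depth_min_max(depths)
hist = depth_histogram(depths, bins=4)
```

The min/max version only needs two values per reduction step, while the histogram needs one counter per bin, which is where the atomics come in.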
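
As a rough sketch of what "automatic cascade split" means once you have the min/max, here is a logarithmic partition in Python. The function name `log_splits` is made up, the distances are assumed to be linear view-space depths with `z_min > 0`, and the logarithmic scheme is just one common choice, not necessarily what either sample does.

```python
def log_splits(z_min, z_max, cascades):
    """Far distance of each cascade, spaced logarithmically between z_min and z_max.

    Assumption: z_min and z_max are linear view-space distances, z_min > 0.
    """
    ratio = z_max / z_min
    return [z_min * ratio ** (i / cascades) for i in range(1, cascades + 1)]


# Visible depth range found by the min/max reduction (example values).
splits = log_splits(1.0, 16.0, 4)
```

With a tight min/max the splits follow the geometry that is actually on screen instead of the camera's full near/far range.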


I tried SDSM in OpenGL using the depth histogram and the tightest-frustum part, and while it's not pixel perfect, it is a good visual improvement as long as the camera doesn't move.

However, I don't do scene management on the GPU, so the readback latency was a real blocker: either I was stuck at 20 fps no matter how complex the scene was, or I got too many shadow artifacts in motion. It would likely be much better if everything were done on the GPU, but you would have to rewrite large parts of your engine.


The very bad news is that SDSM relies heavily on atomic counters, and GeForce GPUs don't like them. A simple compute shader that used atomics in global memory ran at 1 fps on a GeForce 680; using shared memory didn't help, since the GeForce driver crashes (on Windows) when it's executed. Of course it may work better on D3D11, but the Intel paper also mentions the huge cost of the histogram algorithm on a GeForce 480. On the other hand, I managed to reduce the overhead of both compute shaders to < 0.5 ms per frame on a mid-range Radeon. Such overhead can probably be offset by better geometry culling, as the Intel paper points out.

I don't know whether the performance issue with GeForces can be fixed, but it makes SDSM difficult to implement on PC. If you target PS4 or Xbox One, you're probably fine though.
