# Speed difference in shadow mapping, nonlinear vs. linear depth?

AgentC    2352
Hi, I observed a curious thing about shadow mapping using HLSL. If I store nonlinear depth (i.e. projected z/w) in the shadow map, I get approximately a 5% speedup compared to storing linear depth.

According to my limited tests, it's the shadow caster code that makes the difference: the VS does 2 matrix multiplies for linear depth versus just 1 for nonlinear. On the other hand, the linear depth PS just writes out the value produced by the VS, while the nonlinear depth PS has to do a divide.

Should the matrix multiply (considering it's just once per vertex) make that much of a difference, or is there some other reason?

Relevant shader code:

Shadow caster, nonlinear depth:
VS:
oPos = mul(iPos, cModelViewProj);
oDepth = oPos.zw;

PS:
oColor = iDepth.x / iDepth.y;

Shadow caster, linear depth:
VS:
oPos = mul(iPos, cModelViewProj);

PS:
oColor = iDepth;
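
For comparison, here is a fuller sketch of what a two-matrix linear-depth caster might look like. The second multiply is not shown in the post, so `cModelView` and `cFarClip` are assumed names, and dividing view-space Z by the far clip distance is just one common normalization, not necessarily what AgentC used.

```hlsl
// Only cModelViewProj appears in the original post; the other two
// constants are hypothetical and introduced for illustration.
float4x4 cModelViewProj; // object space -> light clip space
float4x4 cModelView;     // object space -> light view space (assumed name)
float    cFarClip;       // light far plane distance (assumed name)

void VS(float4 iPos : POSITION,
        out float4 oPos : POSITION,
        out float oDepth : TEXCOORD0)
{
    oPos = mul(iPos, cModelViewProj);
    // Second matrix multiply: view-space Z scaled to roughly 0..1.
    // Assumes a left-handed view space where +Z points away from the light.
    oDepth = mul(iPos, cModelView).z / cFarClip;
}

void PS(float iDepth : TEXCOORD0,
        out float4 oColor : COLOR0)
{
    // Linear depth arrives already interpolated; no per-pixel divide.
    oColor = iDepth;
}
```

Written this way, the extra cost is one more transform per vertex plus one more matrix constant to upload per object, while the pixel shader is a plain pass-through.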


MJP    19754
It really depends on the GPU. I usually find that I'm bound by the input assembler or triangle setup unit during shadow map generation. I would suggest using a profiler like PerfHUD to find out exactly what your bottleneck is.

Also, if you want linear depth, you can try this method.
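
The link target is not preserved here. Going by the follow-up reply, which mentions pre-multiplying Z by W, the method is presumably the well-known trick of scaling the clip-space z in the vertex shader so that the hardware perspective divide produces a linear value. A minimal sketch under that assumption, with `cFarClip` as a hypothetical constant and a D3D-style projection (clip z in 0..w):

```hlsl
float4x4 cModelViewProj; // object space -> light clip space
float    cFarClip;       // far plane distance (assumed name)

void VS(float4 iPos : POSITION,
        out float4 oPos : POSITION)
{
    oPos = mul(iPos, cModelViewProj);
    // The rasterizer divides z by w. Pre-multiplying by w cancels that
    // divide, so the depth buffer receives clipZ / far instead of clipZ / w.
    oPos.z = oPos.z * oPos.w / cFarClip;
}
```

For a standard left-handed perspective matrix with near plane n and far plane f, clip z is (z_view - n) * f / (f - n), so the stored value becomes (z_view - n) / (f - n), which is linear in view-space Z. Only one matrix is needed, but the value is exact only at the vertices: the rasterizer interpolates the post-divide z linearly in screen space, which is not correct for a quantity that is linear in view space.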

AgentC    2352
Thanks for the suggestion! By implementing linear Z in the way described I got a speedup over nonlinear depth :)

Ah, of course, now I understand where part of the speed difference probably comes from, in addition to the actual shader code: whether one has to update 1 matrix uniform or 2 for each shadow-casting object.

EDIT: when rendering point light shadow maps, using the linear Z method resulted in some odd artifacts: objects off to the sides got rendered through walls that were closer :) I guess that's a side effect of pre-multiplying Z by W while having geometry that's not very well tessellated. In the end I wrote different vertex shaders for perspective and orthographic shadow projection, so that the actual projection can remain unmodified but still only 1 matrix is needed for calculating both the vertex position and linear depth.
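
The post does not show those shaders, so the following is only one plausible way to get linear depth from a single, unmodified projection matrix. It assumes a standard D3D-style perspective matrix (where clip-space w equals view-space Z) and an orthographic matrix (where w is 1 and z is already linear); `cInvFarClip` is a hypothetical constant.

```hlsl
float4x4 cModelViewProj; // object space -> light clip space, unmodified
float    cInvFarClip;    // 1 / far plane distance (assumed name)

// Perspective projection (spot and point lights).
void VSPerspective(float4 iPos : POSITION,
                   out float4 oPos : POSITION,
                   out float oDepth : TEXCOORD0)
{
    oPos = mul(iPos, cModelViewProj);
    // With a standard perspective matrix, w carries view-space Z.
    oDepth = oPos.w * cInvFarClip;
}

// Orthographic projection (directional lights).
void VSOrtho(float4 iPos : POSITION,
             out float4 oPos : POSITION,
             out float oDepth : TEXCOORD0)
{
    oPos = mul(iPos, cModelViewProj);
    // w is 1 here, so z is already linear between the near and far planes.
    oDepth = oPos.z;
}

void PS(float iDepth : TEXCOORD0,
        out float4 oColor : COLOR0)
{
    oColor = iDepth;
}
```

Because `oDepth` is an ordinary interpolant, it is interpolated perspective-correctly across large triangles, and the position's z is left alone so the depth test behaves normally. The two variants normalize depth differently (z / far versus (z - near) / (far - near)), so the shadow receiver has to compute the matching quantity for each light type.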

[Edited by - AgentC on March 14, 2010 3:00:32 PM]