SampleCmpLevelZero

Started by
5 comments, last by steel_3d 10 years, 7 months ago
I was using GatherCmpRed to try and implement PCF for shadow maps but was running into problems. Looking at an NVIDIA presentation (http://developer.download.nvidia.com/presentations/2008/GDC/GDC08_SoftShadowMapping.pdf), I found that

sum += tDepthMap.SampleCmpLevelZero(ShadowSampler, uv + offset, z);

is supposed to do 2x2 PCF in one fetch. However, is this a "driver hack" like they had in DX9? Looking at the documentation for SampleCmpLevelZero, it does not say it does 2x2 PCF.
-----Quat
Nah, it's not a hack anymore. You just need to enable LINEAR filtering in your sampler state to get PCF.
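For reference, a comparison sampler with linear filtering might be declared like this in an effect file. This is only a sketch: the name ShadowSampler matches the snippet in the first post, while the address mode, border color, and comparison function are assumptions that depend on how the shadow map is set up.

```hlsl
// Comparison sampler for hardware PCF (effect-file syntax).
SamplerComparisonState ShadowSampler
{
    // LINEAR min/mag filtering is what makes SampleCmp blend the
    // four compare results instead of returning a single one.
    Filter = COMPARISON_MIN_MAG_LINEAR_MIP_POINT;

    // Assumption: lookups outside the shadow map read the border color.
    AddressU = BORDER;
    AddressV = BORDER;
    BorderColor = float4(1.0f, 1.0f, 1.0f, 1.0f);

    // Assumption: the test passes when the reference depth passed to
    // SampleCmp is <= the depth stored in the map.
    ComparisonFunc = LESS_EQUAL;
};
```

If the sampler is created from application code instead, the equivalent filter setting is the comparison variant of linear min/mag filtering in the sampler description.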

Is SampleCmpLevelZero preferred over GatherCmpRed?
-----Quat
Well, they do two different things: one gives you a single pre-filtered result, and the other gives you the 4 individual comparison results from the bilinear footprint (not the depth values themselves). So if you just want basic PCF then SampleCmp is the way to go (since it's cheaper), but if you want to implement a more advanced filtering kernel then you may want to use GatherCmp so that you can do the filtering yourself in the shader.
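Side by side, the difference in what comes back looks roughly like this. It is a minimal sketch: tDepthMap and ShadowSampler are the names from the first post, the two wrapper functions are hypothetical, and GatherCmpRed needs shader model 5.

```hlsl
Texture2D<float>       tDepthMap;      // shadow map, as named in the first post
SamplerComparisonState ShadowSampler;  // assumed to use a LINEAR comparison filter

// One value: the four compare results, already blended by the hardware.
float FilteredCompare(float2 uv, float z)
{
    return tDepthMap.SampleCmpLevelZero(ShadowSampler, uv, z);
}

// Four values: the individual compare results (each 0 or 1), unblended.
float4 RawCompares(float2 uv, float z)
{
    return tDepthMap.GatherCmpRed(ShadowSampler, uv, z);
}
```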
Yeah you are right. Thanks.

I'm still concerned why I wasn't able to emulate basic PCF using GatherCmpRed. For the vector returned:

float4 s = GatherCmpRed(...);

Which sample points do the components map to (top-left, top-right, bottom-left, bottom-right)?
s.x = ?
s.y = ?
s.z = ?
s.w = ?

And would I just use COMPARISON_MIN_MAG_LINEAR_MIP_POINT as the comparison filter?
-----Quat

I think that's the right order, although I'm not totally sure. You could do a simple test where you render the screen coordinate of each pixel to a render target, then use PIX to debug a shader that calls GatherRed and GatherGreen on that target to see which XY coordinates end up in each component.
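A sketch of that test shader, assuming a hypothetical CoordTexture that an earlier pass filled with each texel's own coordinate (x in red, y in green) and a hypothetical point sampler:

```hlsl
// Hypothetical debug resources, not from the original posts.
Texture2D<float2> CoordTexture;   // red = texel x, green = texel y
SamplerState      PointSampler;   // assumed MIN_MAG_MIP_POINT

// For each component i, (xs[i], ys[i]) is the coordinate of the texel
// that Gather placed in that component. Step through this in the
// debugger and read off which corner maps to x, y, z, and w.
void GatherFootprint(float2 uv, out float4 xs, out float4 ys)
{
    xs = CoordTexture.GatherRed(PointSampler, uv);
    ys = CoordTexture.GatherGreen(PointSampler, uv);
}
```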

For Gather I've always just used MIN_MAG_MIP_POINT, never tried LINEAR. I have no idea if LINEAR changes the behavior of Gather in any way.

To clarify: DO NOT try to use GatherCmpRed to speed up PCF!

GatherCmpRed simply gives you the 4 raw depth compares that the hardware uses internally to do PCF. You have to do the bilinear filtering yourself, and there's other overhead that makes it not worth it. Only use it if you're doing a special kernel that doesn't need PCF.
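To make "do the bilinear filtering yourself" concrete, here is a minimal single-tap sketch. It assumes the usual gather component layout (w = top-left, z = top-right, x = bottom-left, y = bottom-right) and a hypothetical function name; the hardware computes its blend weights in fixed point, so the result may not match SampleCmpLevelZero bit for bit.

```hlsl
Texture2D<float>       ShadowTexture;
SamplerComparisonState ShadowSampler;

// Rebuilds one 2x2 PCF tap from the four raw compare results.
float ManualPcf2x2(float2 uv, float z)
{
    float width, height;
    ShadowTexture.GetDimensions(width, height);

    // Four compare results for the 2x2 footprint around uv.
    float4 c = ShadowTexture.GatherCmpRed(ShadowSampler, uv, z);

    // Fractional position of the sample point inside that footprint.
    float2 f = frac(uv * float2(width, height) - 0.5f);

    // Blend horizontally along each row, then vertically between rows.
    float top    = lerp(c.w, c.z, f.x);
    float bottom = lerp(c.x, c.y, f.x);
    return lerp(top, bottom, f.y);
}
```

The extra GetDimensions, frac, and lerp work per tap is part of the overhead mentioned above.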

For the sake of completeness, below is some code that does 3x3 PCF samples, and 3x3 gather samples.

The gather version is a lot slower because it incurs a lot more adds (4-component vs 1-component), and the compiler uses a lot more registers (4 instead of 1 per sample) in an effort to hide latency. You might be able to get around this if you're doing a lot more math with each result than just accumulating them. Compile it yourself and check the microcode with something like NVShaderPerf. It's worse than you'd expect.


	float fShadow = 0;

#ifdef DO_HARDWARE_PCF
	fShadow += ShadowTexture.SampleCmpLevelZero(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2(-1, -1));
	fShadow += ShadowTexture.SampleCmpLevelZero(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2(-1,  0));
	fShadow += ShadowTexture.SampleCmpLevelZero(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2(-1,  1));
	fShadow += ShadowTexture.SampleCmpLevelZero(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 0, -1));
	fShadow += ShadowTexture.SampleCmpLevelZero(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 0,  0));
	fShadow += ShadowTexture.SampleCmpLevelZero(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 0,  1));
	fShadow += ShadowTexture.SampleCmpLevelZero(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 1, -1));
	fShadow += ShadowTexture.SampleCmpLevelZero(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 1,  0));
	fShadow += ShadowTexture.SampleCmpLevelZero(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 1,  1));
#else
	float4 vShadow = 0;

	vShadow += ShadowTexture.GatherCmpRed(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2(-1, -1));
	vShadow += ShadowTexture.GatherCmpRed(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2(-1,  0));
	vShadow += ShadowTexture.GatherCmpRed(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2(-1,  1));
	vShadow += ShadowTexture.GatherCmpRed(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 0, -1));
	vShadow += ShadowTexture.GatherCmpRed(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 0,  0));
	vShadow += ShadowTexture.GatherCmpRed(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 0,  1));
	vShadow += ShadowTexture.GatherCmpRed(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 1, -1));
	vShadow += ShadowTexture.GatherCmpRed(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 1,  0));
	vShadow += ShadowTexture.GatherCmpRed(ShadowSampler, vShadowCoord.xyw, vShadowCoord.z, int2( 1,  1));

	// The samples come in this order:
	// W Z
	// X Y
	// Meaning W is in the -1,-1 uv direction and Y is in the +1,+1 direction.
	float4 vLerp;
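	// Bilinear weights from the fractional texel position. Assumes a 1024x1024
	// shadow map and normalized vShadowCoord.xy, so scale up by the map size.
	vLerp.wz	= frac(vShadowCoord.xy * 1024 + 0.5);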
	vLerp.xy	= (float2)1 - vLerp.wz;
	vLerp		= vLerp.xwwx * vLerp.zzyy;

	fShadow = dot(vShadow, vLerp) / 9;
#endif

	return fShadow;
