DX9 + CSM + HW PCF

Started by
6 comments, last by SIIYA 11 years, 11 months ago
Hi everyone,

I have a few short questions about PCF. I have read many articles and forum topics about this but couldn't find clear answers.
So here are the questions:


1. Can hardware PCF be done in DirectX 9 by using a D24X8 depth stencil buffer as a texture, and if so, will it work on both ATI and Nvidia cards (which support DST)?

2. Do you need to use the tex2Dproj() function for it, or will any texture fetch function do the job?

3. If you need to use tex2Dproj(), how do you calculate correct shadow coordinates for the cascades?


I have working CSM in DirectX, but I am not satisfied with averaging 3x3 depth compare tests to blur the edges. So I tried using a D24X8 depth buffer for my shadow map and 4 sample fetches with tex2Dproj(). I keep getting a strange projection (the shadow map moves with the camera and the like), but it seems that HW PCF is really there, because the shadows look much softer. I can post some code and pictures if someone wants.

Any help will be appreciated.
Sorry for my English, and thanks!
Yes, you can do hardware PCF on both AMD and Nvidia hardware. The details for ATI/AMD are here, and the details for Nvidia are here.
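Just to make that concrete, a minimal capability check I'd sketch for the D24X8 path (the DF24/Fetch-4 detection for ATI is quoted from the article further down in this thread); the adapter ordinal and display format below are assumptions:

#include <d3d9.h>

// Sketch: ask the driver whether D24X8 can be created as a texture with
// DEPTHSTENCIL usage. Support for this is commonly used as the signal that
// hardware shadow-map filtering (PCF on depth fetches) is available.
BOOL SupportsHardwareShadowMaps()
{
    IDirect3D9* pD3D = Direct3DCreate9( D3D_SDK_VERSION );
    if( !pD3D )
        return FALSE;

    HRESULT hr = pD3D->CheckDeviceFormat( D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL,
                                          D3DFMT_X8R8G8B8,      // assumed display format
                                          D3DUSAGE_DEPTHSTENCIL,
                                          D3DRTYPE_TEXTURE,
                                          D3DFMT_D24X8 );
    pD3D->Release();
    return SUCCEEDED( hr );
}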
Thanks for replying, MJP. I was using this ATI article as a guide, so it seems I am doing something else wrong. Here are a few relevant code lines:

Create depth stencil texture:



D3D9Device->CreateTexture( CSMSize * NUMBER_CASCADES, CSMSize, 1, D3DUSAGE_DEPTHSTENCIL,
                           D3DFMT_D24X8, D3DPOOL_DEFAULT, &CSMRT, NULL );



Bind the depth stencil texture as an active depth buffer:



LPDIRECT3DSURFACE9 Surface;
ShadowManager.GetCSMRT()->GetSurfaceLevel( 0, &Surface );
D3D9Device->SetDepthStencilSurface( Surface );
SAFE_RELEASE( Surface );
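Not from the original post, but for completeness: a depth-only shadow pass in D3D9 usually also disables colour writes and clears only the depth buffer (the renderable area is limited by the colour target that stays bound, so it needs to be at least as large as the shadow map). A sketch, assuming the same D3D9Device as above:

// Depth-only shadow pass: skip colour writes, clear and fill depth only.
D3D9Device->SetRenderState( D3DRS_COLORWRITEENABLE, 0 );
D3D9Device->Clear( 0, NULL, D3DCLEAR_ZBUFFER, 0, 1.0f, 0 );

// ... render the shadow casters for each cascade here ...

// Restore colour writes for the normal passes.
D3D9Device->SetRenderState( D3DRS_COLORWRITEENABLE,
    D3DCOLORWRITEENABLE_RED | D3DCOLORWRITEENABLE_GREEN |
    D3DCOLORWRITEENABLE_BLUE | D3DCOLORWRITEENABLE_ALPHA );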


Bind a depth buffer texture as a texture for PCF filtering:

m_pEffect->SetTexture( hShadowMap, ShadowManager()->GetCSMRT() );
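One detail that is easy to miss on the NVIDIA-style path: the shadow sampler is typically given linear filtering, because the hardware filters the four depth-compare results (the free 2x2 PCF), while the ATI Fetch-4 path wants point filtering as in the article quoted later in the thread. A sketch of the equivalent device-level sampler states, assuming the shadow map sits on sampler stage 0 (in an .fx file this would go in the sampler_state block of ShadowSampler):

// Linear min/mag filtering turns the hardware depth compares into a
// bilinearly weighted PCF result instead of a hard 0/1 value.
D3D9Device->SetSamplerState( 0, D3DSAMP_MINFILTER, D3DTEXF_LINEAR );
D3D9Device->SetSamplerState( 0, D3DSAMP_MAGFILTER, D3DTEXF_LINEAR );
D3D9Device->SetSamplerState( 0, D3DSAMP_MIPFILTER, D3DTEXF_NONE );
D3D9Device->SetSamplerState( 0, D3DSAMP_ADDRESSU,  D3DTADDRESS_CLAMP );
D3D9Device->SetSamplerState( 0, D3DSAMP_ADDRESSV,  D3DTADDRESS_CLAMP );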


Shader code:

// Position is reconstructed position from G-buffer in view space
if( Position.z < CascadeFrustumsEyeSpaceDepths.x )
iCurrentCascadeIndex = 0;
else if( Position.z < CascadeFrustumsEyeSpaceDepths.y )
iCurrentCascadeIndex = 1;
else if( Position.z < CascadeFrustumsEyeSpaceDepths.z )
iCurrentCascadeIndex = 2;

// Transform pixel from view space to light projection space
// ShadowViewProj matrix = CameraViewInverse x LightView x LightProj
float4 Pos = mul( float4(Position,1), ShadowViewProj[iCurrentCascadeIndex] );


float4x4 matTexAdj = { 0.5f,  0.0f, 0.0f, 0.0f,
                       0.0f, -0.5f, 0.0f, 0.0f,
                       0.0f,  0.0f, 1.0f, 0.0f,
                       0.5f,  0.5f, 0.0f, 1.0f };

// Transform to texture space
Pos = mul(Pos, matTexAdj);

float2 PixelKernel[9] =
{
    { -1, -1 }, { -1, 0 }, { -1, 1 },
    {  0, -1 }, {  0, 0 }, {  0, 1 },
    {  1, -1 }, {  1, 0 }, {  1, 1 }
};

float PercentLit = 0.0f;

for( int i = 0; i < 9; ++i )
{
    // Offsets are applied before the perspective divide, so scale them by w.
    PercentLit += tex2Dproj( ShadowSampler,
        float4( Pos.xy + PixelKernel[i] * float2( TexelSizeX, TexelSize ) * Pos.w, Pos.z, Pos.w ) ).r;
}

PercentLit /= 9.0f;




This works when I use only one cascade, but if there are more cascades the projection is wrong, or it treats the whole buffer as a single cascade. I used to scale the sampling coordinates after moving to texture space like this:

ShadowTexC.x *= ShadowPartitionSize; // 1.0 / NUMBER_OF_CASCADES
ShadowTexC.x += (ShadowPartitionSize * (float)iCurrentCascadeIndex );
But now with tex2Dproj() I don't know where to apply this.

Anyway, I know I am close to the solution but haven't been able to find it for 6 days now. Thanks for your help.
Have you considered/tried creating a square texture?
It might be easier to manage and to share it with all spot/point lights casting shadows. This way you can create and reuse just one depth-stencil render target.

And I think you might want to move the adjustment matrix out of the pixel shader and precalculate it on the CPU.
So:

ShadowViewProj matrix = CameraViewInverse x LightView x LightProj x matTexAdj;
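A minimal sketch of that precomputation with D3DX, assuming per-cascade lightView[i]/lightProj[i] matrices and the horizontal atlas layout used above (all names are illustrative, not code from this thread). Folding the cascade scale/offset into the same matrix also answers the tex2Dproj() question: inside the matrix product the offset gets multiplied by w, so it survives the divide that tex2Dproj() performs.

#include <d3dx9.h>

// Build one combined view-space -> shadow-atlas matrix per cascade, so the
// pixel shader only needs mul() followed by tex2Dproj().
void BuildShadowMatrices( const D3DXMATRIX& cameraView,
                          const D3DXMATRIX* lightView,    // one per cascade
                          const D3DXMATRIX* lightProj,    // one per cascade
                          int numCascades,
                          D3DXMATRIX* shadowViewProj )    // output, one per cascade
{
    const float partition = 1.0f / numCascades;

    D3DXMATRIX cameraViewInv;
    D3DXMatrixInverse( &cameraViewInv, NULL, &cameraView );

    for( int i = 0; i < numCascades; ++i )
    {
        // Clip space -> texture space, with this cascade's horizontal
        // scale/offset folded in (the atlas is numCascades times wider
        // than it is tall, so only x is scaled).
        D3DXMATRIX texAdj(
            0.5f * partition,                  0.0f, 0.0f, 0.0f,
            0.0f,                             -0.5f, 0.0f, 0.0f,
            0.0f,                              0.0f, 1.0f, 0.0f,
            0.5f * partition + partition * i,  0.5f, 0.0f, 1.0f );

        D3DXMATRIX viewToLight, lightClip;
        D3DXMatrixMultiply( &viewToLight, &cameraViewInv, &lightView[i] );
        D3DXMatrixMultiply( &lightClip,   &viewToLight,   &lightProj[i] );
        D3DXMatrixMultiply( &shadowViewProj[i], &lightClip, &texAdj );
    }
}

The resulting array would be uploaded with ID3DXEffect::SetMatrixArray() and used like ShadowViewProj in the shader above, with matTexAdj and the per-cascade x adjustment removed from the pixel shader.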



For PCF on ATI I think you need to turn on point filtering for the depth-stencil texture and lerp the RGBA components. Something like:

float4 shadow = tex2Dproj(samp, coords);
float2 f = frac(coords.xy * SHMapSize);

float s1 = lerp(shadow.x, shadow.y, f.x);
float s2 = lerp(shadow.z, shadow.w, f.x);
return lerp(s1, s2, f.y);


I don't have an ATI card so I haven't tried it.


// multiply blend
alphablend true
srcblend zero
destblend srccolor

for num cascades
    draw depth to DF24 (ATI) or D24X8 (NV)
    draw to fullscreen quad (or boxed area) and accumulate with the blend above

for num sh. casting point lights
    for each "cube" face
        draw depth to DF24 (ATI) or D24X8 (NV)
        draw to sphere stencil-culled light volume

for num sh. casting spotlights
    draw depth to DF24 (ATI) or D24X8 (NV)
    draw to cone/pyramid stencil-culled light volume

// even use SSAO here with same blending
draw to fullscreen quad

...
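A sketch of that multiply blend in D3D9 render states (the device pointer name is illustrative):

// dest = src * dest: each pass multiplies its shadow/AO term into the target.
D3D9Device->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
D3D9Device->SetRenderState( D3DRS_SRCBLEND,  D3DBLEND_ZERO );
D3D9Device->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_SRCCOLOR );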



EDIT: From the ATI article MJP posted:

// Determine Fetch-4 supported: all ATI Radeon cards supporting Fetch-4 also
// support the DF24 depth texture FourCC format so use this for detection.
#define ATI_FOURCC_DF24 ((D3DFORMAT)(MAKEFOURCC('D','F','2','4')))
HRESULT hr;
hr = pd3d->CheckDeviceFormat(AdapterOrdinal, DeviceType, AdapterFormat,
                             D3DUSAGE_DEPTHSTENCIL, D3DRTYPE_TEXTURE,
                             ATI_FOURCC_DF24);
BOOL bFetch4Supported = (hr == D3D_OK);

To enable Fetch-4 on a texture sampler (sampler 0 in this example):

#define FETCH4_ENABLE  ((DWORD)MAKEFOURCC('G', 'E', 'T', '4'))
#define FETCH4_DISABLE ((DWORD)MAKEFOURCC('G', 'E', 'T', '1'))
// Enable Fetch-4 on sampler 0 by overloading the MIPMAPLODBIAS render state
pd3dDevice->SetSamplerState(0, D3DSAMP_MIPMAPLODBIAS, FETCH4_ENABLE);
// Set point sampling filtering (required for Fetch-4 to work)
pd3dDevice->SetSamplerState(0, D3DSAMP_MAGFILTER, D3DTEXF_POINT);
pd3dDevice->SetSamplerState(0, D3DSAMP_MINFILTER, D3DTEXF_POINT);
Hi, belfegor,

I will optimize the code once I get it working. I am using a square texture for point and spot lights; this one is only for the directional light. And maybe I wasn't clear: I got HW PCF working. It's just that I don't get the proper projection when using tex2Dproj( samp, coords ) with 3 cascades, so I think my coords are not calculated right.

Here is the working implementation without HW PCF, with an R32F shadow map:

if( Position.z < CascadeFrustumsEyeSpaceDepths.x )
iCurrentCascadeIndex = 0;
else if( Position.z < CascadeFrustumsEyeSpaceDepths.y )
iCurrentCascadeIndex = 1;
else if( Position.z < CascadeFrustumsEyeSpaceDepths.z )
iCurrentCascadeIndex = 2;

// Transform pixel from camera space to light projection space
// ShadowViewProj matrix = CameraViewInverse x LightView x LightProj
float4 Pos = mul( float4(Position,1), ShadowViewProj[iCurrentCascadeIndex] );

// Transform from light projection space to light texture space.
float2 ShadowTexC = 0.5 * Pos.xy / Pos.w + 0.5;
ShadowTexC.y = 1.0f - ShadowTexC.y;

ShadowTexC.x *= ShadowPartitionSize;
ShadowTexC.x += (ShadowPartitionSize * (float)iCurrentCascadeIndex );


float2 PixelKernel[9] =
{
    { -1, -1 }, { -1, 0 }, { -1, 1 },
    {  0, -1 }, {  0, 0 }, {  0, 1 },
    {  1, -1 }, {  1, 0 }, {  1, 1 }
};

float depthcompare = Pos.z / Pos.w - PCFOffset;

float PercentLit = 0.0f;


for( int i = 0; i < 9; ++i )
{
    float shadow = tex2D( ShadowSampler, ShadowTexC + PixelKernel[i] * float2( TexelSizeX, TexelSize ) ).r;
    PercentLit += shadow < depthcompare ? 0.0f : 1.0f;
}

PercentLit /= 9.0f;



ShadowTexC.x *= ShadowPartitionSize;
ShadowTexC.x += (ShadowPartitionSize * (float)iCurrentCascadeIndex );

These lines of code put the sampling coordinates into the correct cascade, and this adjustment is missing from my coords in the tex2Dproj() implementation.


EDIT:
The thing is that the above code only works when the shadow map is R32F. As soon as I switch it to a D24X8 depth stencil texture I get a wrong projection no matter which shader code I use. Maybe it has something to do with the depth compares? With the R32F texture I output the shadow depth as color (pos.z / pos.w), and with the D24X8 depth stencil texture I just use return mul(Pos, WorldViewProjection) in the vertex shader.
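One more difference with the hardware compare: there is no shader-side PCFOffset subtraction any more, so the bias is usually applied either by nudging Pos.z before tex2Dproj() or during the shadow depth pass with render states. A sketch of the latter; the bias values are placeholders to tune:

// Bias the shadow depth pass itself instead of the shader-side compare.
float depthBias      = 0.0005f;   // placeholder, tune per scene
float slopeScaleBias = 2.0f;      // placeholder, tune per scene
D3D9Device->SetRenderState( D3DRS_DEPTHBIAS,           *(DWORD*)&depthBias );
D3D9Device->SetRenderState( D3DRS_SLOPESCALEDEPTHBIAS, *(DWORD*)&slopeScaleBias );

// ... render shadow casters ...

// Reset after the shadow pass.
D3D9Device->SetRenderState( D3DRS_DEPTHBIAS,           0 );
D3D9Device->SetRenderState( D3DRS_SLOPESCALEDEPTHBIAS, 0 );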

EDIT: From the ATI article MJP posted:

// Determine Fetch-4 supported: all ATI Radeon cards supporting Fetch-4 also
// support the DF24 depth texture FourCC format so use this for detection.
#define ATI_FOURCC_DF24 ((D3DFORMAT)(MAKEFOURCC('D','F','2','4')))
HRESULT hr;
hr = pd3d->CheckDeviceFormat(AdapterOrdinal, DeviceType, AdapterFormat,
                             D3DUSAGE_DEPTHSTENCIL, D3DRTYPE_TEXTURE,
                             ATI_FOURCC_DF24);
BOOL bFetch4Supported = (hr == D3D_OK);

To enable Fetch-4 on a texture sampler (sampler 0 in this example):

#define FETCH4_ENABLE  ((DWORD)MAKEFOURCC('G', 'E', 'T', '4'))
#define FETCH4_DISABLE ((DWORD)MAKEFOURCC('G', 'E', 'T', '1'))
// Enable Fetch-4 on sampler 0 by overloading the MIPMAPLODBIAS render state
pd3dDevice->SetSamplerState(0, D3DSAMP_MIPMAPLODBIAS, FETCH4_ENABLE);
// Set point sampling filtering (required for Fetch-4 to work)
pd3dDevice->SetSamplerState(0, D3DSAMP_MAGFILTER, D3DTEXF_POINT);
pd3dDevice->SetSamplerState(0, D3DSAMP_MINFILTER, D3DTEXF_POINT);



It also says: "Implementations wishing to use Fetch-4 for Percentage-Closer Filtering should prefer the use of DX9 Depth Stencil Textures instead." This is what I'm trying to use. It's on page 5 of the article.
I can't say exactly what is wrong, except for this:

ShadowTexC.x *= ShadowPartitionSize;
ShadowTexC.x += (ShadowPartitionSize * (float)iCurrentCascadeIndex );

Why do you multiply (first line)? The second line should be enough to get the correct offset.

As for Fetch-4, try it on a single spotlight to see if it works as I suggested, and then implement it with CSM.
Still can't get it to work... belfegor, I have sent you a PM.
