Why can't I use tex2D in a loop with HLSL?

Started by xujiezhige March 08, 2012 08:18 AM

3 comments, last by xujiezhige 12 years, 1 month ago

Author

March 08, 2012 08:18 AM

Greetings.

When I try to use tex2D inside a loop in a pixel shader, the compiler reports:

1.warning X3553: can't use gradient instructions in loops with break, forcing loop to unroll.
2.error X3511: Unable to unroll loop, loop does not appear to terminate in a timely manner (1024 iterations).

I can't understand it.

DennisterBeest

123

March 08, 2012 08:52 AM

Could you post some code please?

DennisterBeest

123

March 08, 2012 08:57 AM

After googling around, I found two answers:

X3553:

OK, this is a bit difficult to explain. First, the reason the second one works is that it is automatically unrolled. You loop a constant number of iterations, so the compiler can silently unroll the loop without side effects.

Now, the reason the first doesn't work as expected is that derivatives (as used by a gradient instruction such as tex2D) are undefined within a conditional statement (such as the one implicitly used by the loop).

You can solve this either by not using derivatives at all (i.e. supplying a constant LOD through tex2Dlod rather than tex2D) or by manually computing the derivatives outside the loop and supplying them explicitly. If you only want to blur a screen-aligned rectangle, the former is the best method.

Source: http://www.gamedev.net/topic/526047-bizzare-shader-error/
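The second fix quoted above (computing the derivatives outside the loop and passing them in explicitly) could look roughly like the sketch below. It assumes a ps_3_0 target; the sampler g_samHeight, the constant NUM_STEPS, and the function name ps_march are made up for illustration and are not from the thread.

```hlsl
// Hypothetical sampler and constant, for illustration only.
sampler2D g_samHeight : register(s0);
static const int NUM_STEPS = 16;

float4 ps_march( float2 uv     : TEXCOORD0,
                 float2 stepUV : TEXCOORD1 ) : COLOR
{
    // Take the derivatives once, outside any flow control, where all
    // four pixels of the quad are guaranteed to execute the instruction.
    float2 dx = ddx( uv );
    float2 dy = ddy( uv );

    const float stepSize = 1.0f / NUM_STEPS;
    float bound  = 1.0f;
    float height = 0.0f;

    [loop]
    for( int i = 0; i < NUM_STEPS; i++ )
    {
        uv    -= stepUV * stepSize;
        bound -= stepSize;

        // tex2Dgrad receives its gradients as arguments, so it does not
        // need to compute them inside the loop and may sit next to a break.
        height = tex2Dgrad( g_samHeight, uv, dx, dy ).a;

        if( bound < height )
            break;
    }

    // Pack the results just so the sketch has something to return.
    return float4( uv, height, bound );
}
```

If a fixed mip level is good enough, the tex2Dgrad call can be swapped for tex2Dlod( g_samHeight, float4( uv, 0, 0 ) ), which is the first fix mentioned in the quote.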

X3511:

The compiler needs to prove that the loop will terminate within 1024 iterations to target the ps_2_0 profile. Since your index value comes from a global, no range information can be inferred beyond the 32-bit int type. You can force this by doing something like:

int x = gNum % 1023;
for( int i = 0; i < x; i++ ) ...

Source: http://www.ureader.com/msg/146533.aspx
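Another common way to give the compiler a provable bound is to loop over a compile-time constant and exit early once the runtime count is reached. This is only a sketch, assuming a ps_3_0 target; apart from gNum, every name here (g_samSource, g_stepUV, MAX_ITER, ps_accumulate) is hypothetical.

```hlsl
// Hypothetical names for illustration; only gNum comes from the quoted snippet.
sampler2D g_samSource : register(s0);
int    gNum;                        // requested iteration count (global)
float2 g_stepUV;                    // UV offset applied per iteration
static const int MAX_ITER = 64;     // compile-time upper bound

float4 ps_accumulate( float2 uv : TEXCOORD0 ) : COLOR
{
    float4 sum   = 0;
    int    taken = 0;

    // The trip count is a literal constant, so the compiler never has to
    // guess how large gNum might be.
    for( int i = 0; i < MAX_ITER; i++ )
    {
        if( i >= gNum )
            break;

        // tex2Dlod with an explicit LOD is not a gradient instruction,
        // so the early exit should not trigger warning X3553.
        sum += tex2Dlod( g_samSource, float4( uv + i * g_stepUV, 0, 0 ) );
        taken++;
    }

    // Average the samples; max() guards against division by zero.
    return sum / max( taken, 1 );
}
```

The trade-off is that MAX_ITER caps the work per pixel, so it has to be chosen large enough for the biggest count you expect.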

Hope this helps.

xujiezhige

Author

March 08, 2012 09:00 AM



float4 ps_POM( float2 iTex : TEXCOORD0,
               float3 iLightTS : TEXCOORD1,
               float3 iViewTS : TEXCOORD2,
               float2 iDisplaceTS : TEXCOORD3,
               float3 iViewWS : TEXCOORD4,
               float3 iNormalWS : TEXCOORD5 ) : COLOR
{
    iLightTS = normalize( iLightTS );
    iViewTS = normalize( iViewTS );
    iViewWS = normalize( iViewWS );
    iNormalWS = normalize( iNormalWS );

    float2 BigTexCoord = iTex * TEXTURE_DIM;
    float2 ddxBig, ddyBig, ddxSmall, ddySmall;
    ddxBig = ddx( BigTexCoord );
    ddyBig = ddy( BigTexCoord );
    ddxSmall = ddx( iTex );
    ddySmall = ddy( iTex );

    float2 Delta = ddxBig * ddxBig + ddyBig * ddyBig;
    float fMaxTexDelta = max( Delta.x, Delta.y );
    float fMipLevel = max( 0, 0.5f * log2( fMaxTexDelta ) );
    float fOcclusionShadow = 1;

    // if the mesh is close enough to view, implement POM
    if( fMipLevel <= MIPLEVEL_THRESHOLD )
    {
        float fNumSamples = lerp( MAX_SAMPLES, MIN_SAMPLES, dot( iNormalWS, iViewWS ) );
        float fStepSize = 1.0f / fNumSamples;
        float2 fStepDisplace = fStepSize * iDisplaceTS;
        float fCurrentSample = 0;
        float fCurrentBound = 1.0f;
        float fCurrentHeight = 0;
        float fPrevHeight = tex2D( g_samNMH, iTex ).a;
        float fPrevBound = 1.0f;

        while( fCurrentSample < fNumSamples )
        {
            iTex -= fStepDisplace;
            fCurrentBound -= fStepSize;
            fCurrentHeight = tex2D( g_samNMH, iTex ).a;   // <-- the compiler reports the error on this line

            if( fCurrentBound < fCurrentHeight )
            {
                fCurrentSample = fNumSamples + 1;
            }
            else
            {
                fPrevHeight = fCurrentHeight;
                fPrevBound = fCurrentBound;
                fCurrentSample++;
            }
        }

        float fHeightDelta1 = fPrevBound - fPrevHeight;
        float fHeightDelta2 = fCurrentHeight - fCurrentBound;
        float fIntersectionStepFraction = fHeightDelta2 / ( fHeightDelta1 + fHeightDelta2 );
        iTex += fIntersectionStepFraction * fStepDisplace;

        if( BE_SHADOW )
        {
            float2 LightTS = iLightTS.xy * HEIGHT_SCALING;
            float sh0 = tex2Dgrad( g_samNMH, iTex, ddxSmall, ddySmall ).a;
            float shA = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.88f, ddxSmall, ddySmall ).a - sh0 - 0.88 ) * 1 * SOFTSHADOW_FACTOR;
            float sh9 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.77f, ddxSmall, ddySmall ).a - sh0 - 0.77 ) * 2 * SOFTSHADOW_FACTOR;
            float sh8 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.66f, ddxSmall, ddySmall ).a - sh0 - 0.66 ) * 4 * SOFTSHADOW_FACTOR;
            float sh7 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.55f, ddxSmall, ddySmall ).a - sh0 - 0.55 ) * 6 * SOFTSHADOW_FACTOR;
            float sh6 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.44f, ddxSmall, ddySmall ).a - sh0 - 0.44 ) * 8 * SOFTSHADOW_FACTOR;
            float sh5 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.33f, ddxSmall, ddySmall ).a - sh0 - 0.33 ) * 10 * SOFTSHADOW_FACTOR;
            float sh4 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.22f, ddxSmall, ddySmall ).a - sh0 - 0.22 ) * 12 * SOFTSHADOW_FACTOR;

            fOcclusionShadow = 1 - max( max( max( max( max( max( shA, sh9 ), sh8 ), sh7 ), sh6 ), sh5 ), sh4 );
            fOcclusionShadow = fOcclusionShadow * 0.3 + 0.4;
        }

        //fOcclusionShadow = 1;
    }

    return GetIlluminance( iTex, iLightTS, iViewTS, fOcclusionShadow );
}

The error is reported on the tex2D call inside the while loop (the line highlighted in red in the original post).

xujiezhige

Author

March 08, 2012 09:17 AM

Thank you, Ghosrath.

I get it.

GPUs shade a 'quad' of 4 pixels at a time and the gradients used for texture fetches come from the finite differences calculated from adjacent pairs of pixels. This is how the GPU is able to generate partial derivatives even for arbitrary expressions in the pixel shader. The tex2Dgrad function can be useful if you can calculate more accurate analytical derivatives for the values you are passing in as texture coordinates.
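As a rough illustration of the last sentence, here is one well-known case where analytical derivatives beat the automatic ones: tiling a texture with frac(). This is a sketch, not code from the thread; the sampler g_samTile, the constant TILING, and the function name ps_tiled are assumptions made for the example.

```hlsl
// Hypothetical sampler and tiling factor, for illustration only.
sampler2D g_samTile : register(s0);
static const float2 TILING = float2( 4.0f, 4.0f );

float4 ps_tiled( float2 uv : TEXCOORD0 ) : COLOR
{
    // frac() wraps the coordinate, so at a tile border two neighbouring
    // pixels in a quad can differ by almost 1.0. The finite difference
    // then looks huge and the hardware picks a very small mip, which
    // shows up as a seam.
    float2 tiled = frac( uv * TILING );

    // The underlying function uv * TILING is smooth, and its derivative
    // is just the derivative of uv scaled by TILING.
    float2 dx = ddx( uv ) * TILING;
    float2 dy = ddy( uv ) * TILING;

    // Sample the wrapped coordinate with the smooth gradients.
    return tex2Dgrad( g_samTile, tiled, dx, dy );
}
```

The same idea applies to the loop case: the gradients are taken from a value that is well defined for the whole quad and then handed to tex2Dgrad explicitly.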
This topic is closed to new replies.