
Why can't I use tex2D in a loop with HLSL?



When I try to use tex2D inside a loop in a pixel shader, the compiler reports the following:

1. warning X3553: can't use gradient instructions in loops with break, forcing loop to unroll.
2. error X3511: Unable to unroll loop, loop does not appear to terminate in a timely manner (1024 iterations).

I can't understand it.

After googling around, I found two answers:



OK, this is a bit difficult to explain. First, the second version works because it is unrolled automatically: it loops a constant number of iterations, so the compiler can silently unroll it without side effects.



The first version does not work as expected because the derivatives used by a gradient instruction such as tex2D are undefined inside dynamic flow control: the neighbouring pixels they are computed from may not take the same path. The exit test of a loop is an implicit conditional of exactly this kind.



You can solve this either by not using derivatives at all (i.e. supplying a constant LOD through tex2Dlod rather than calling tex2D) or by computing the derivatives manually outside the loop and supplying them explicitly through tex2Dgrad. If you only want to blur a screen-aligned rectangle, the former is the better method.
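
The two workarounds described above might look roughly like the sketch below. It assumes a ps_3_0 target, and the names g_samHeight, g_fNumSteps, uvStep and ps_MarchExample are made up for illustration; they do not come from the thread.

```hlsl
// Hypothetical names for illustration only: g_samHeight, g_fNumSteps,
// uvStep, ps_MarchExample. Assumes a ps_3_0 target.
sampler2D g_samHeight;
float g_fNumSteps;      // assumed to be set by the application

float4 ps_MarchExample( float2 uv     : TEXCOORD0,
                        float2 uvStep : TEXCOORD1 ) : COLOR
{
    // Take the derivatives once, outside any flow control, where all
    // pixels of the quad still execute the same instructions.
    float2 dx = ddx( uv );
    float2 dy = ddy( uv );

    // Clamp the runtime count so the loop has a known upper bound.
    int numSteps = (int)clamp( g_fNumSteps, 1, 64 );
    float stepSize = 1.0f / numSteps;
    float bound = 1.0f;
    float height = 0;

    for( int i = 0; i < numSteps; i++ )
    {
        uv -= uvStep;
        bound -= stepSize;

        // Option 1: pass the precomputed gradients explicitly.
        height = tex2Dgrad( g_samHeight, uv, dx, dy ).a;

        // Option 2: skip gradients and read a fixed mip level (level 0 here).
        // height = tex2Dlod( g_samHeight, float4( uv, 0, 0 ) ).a;

        if( bound < height )
            break;      // early exit no longer involves a gradient instruction
    }

    return float4( uv, height, 1 );
}
```

Neither tex2Dgrad nor tex2Dlod asks the hardware to compute derivatives at the point of the fetch, so the compiler has no reason to force the loop to unroll because of them.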



The compiler needs to prove that the loop will terminate within 1024
iterations to target the ps_2_0 profile. Since your index value comes from
a global, no range information can be inferred beyond the 32-bit int type.
You can force this by doing something like:

int x = gNum % 1023;
for( int i = 0; i < x; i++ ) ...
Source: http://www.ureader.com/msg/146533.aspx

Hope this helps :)
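
Another way to give the compiler a provable bound, sketched here as an assumption rather than something proposed in the thread, is to loop a constant number of times and mask out the iterations beyond the runtime count. MAX_ITERATIONS, g_samSource, g_vTexelStep and ps_BoundedSum are hypothetical names; gNum is the global from the quoted answer.

```hlsl
// Hypothetical names for illustration only: MAX_ITERATIONS, g_samSource,
// g_vTexelStep, ps_BoundedSum. The cap must be small enough to fit the
// instruction limits of the target profile.
#define MAX_ITERATIONS 8

sampler2D g_samSource;
float2 g_vTexelStep;    // assumed UV offset between consecutive samples
int gNum;               // runtime sample count, as in the quoted answer

float4 ps_BoundedSum( float2 uv : TEXCOORD0 ) : COLOR
{
    float4 sum = 0;
    float count = 0;

    // The trip count is a compile-time constant, so the loop can be
    // unrolled and tex2D never sits inside dynamic flow control.
    for( int i = 0; i < MAX_ITERATIONS; i++ )
    {
        // Weight is 1 while i is below the runtime count, 0 afterwards.
        float w = ( i < gNum ) ? 1.0f : 0.0f;
        sum += w * tex2D( g_samSource, uv + i * g_vTexelStep );
        count += w;
    }

    // Average of the samples that were actually counted.
    return sum / max( count, 1.0f );
}
```

The trade-off is that every pixel pays for MAX_ITERATIONS texture fetches regardless of gNum, so this only makes sense when the cap is small.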


float4 ps_POM( float2 iTex : TEXCOORD0,
               float3 iLightTS : TEXCOORD1,
               float3 iViewTS : TEXCOORD2,
               float2 iDisplaceTS : TEXCOORD3,
               float3 iViewWS : TEXCOORD4,
               float3 iNormalWS : TEXCOORD5 ) : COLOR
{
    iLightTS = normalize( iLightTS );
    iViewTS = normalize( iViewTS );
    iViewWS = normalize( iViewWS );
    iNormalWS = normalize( iNormalWS );

    // Texture-coordinate derivatives, taken before the loop.
    float2 BigTexCoord = iTex * TEXTURE_DIM;
    float2 ddxBig, ddyBig, ddxSmall, ddySmall;
    ddxBig = ddx( BigTexCoord );
    ddyBig = ddy( BigTexCoord );
    ddxSmall = ddx( iTex );
    ddySmall = ddy( iTex );
    float2 Delta = ddxBig * ddxBig + ddyBig * ddyBig;
    float fMaxTexDelta = max( Delta.x, Delta.y );
    float fMipLevel = max( 0, 0.5f * log2( fMaxTexDelta ) );
    float fOcclusionShadow = 1;

    // If the mesh is close enough to the viewer, apply POM.
    float fNumSamples = lerp( MAX_SAMPLES, MIN_SAMPLES, dot( iNormalWS, iViewWS ) );
    float fStepSize = 1.0f / fNumSamples;
    float2 fStepDisplace = fStepSize * iDisplaceTS;
    float fCurrentSample = 0;
    float fCurrentBound = 1.0f;
    float fCurrentHeight = 0;
    float fPrevHeight = tex2D( g_samNMH, iTex ).a;
    float fPrevBound = 1.0f;

    // Step along the ray until it drops below the height field.
    while( fCurrentSample < fNumSamples )
    {
        iTex -= fStepDisplace;
        fCurrentBound -= fStepSize;
        fCurrentHeight = tex2D( g_samNMH, iTex ).a; // the compiler reports the error on this line
        if( fCurrentBound < fCurrentHeight )
        {
            fCurrentSample = fNumSamples + 1; // intersection found: leave the loop
        }
        else
        {
            fCurrentSample++; // advance to the next sample so the loop can terminate
            fPrevHeight = fCurrentHeight;
            fPrevBound = fCurrentBound;
        }
    }

    // Interpolate between the last two samples to estimate the intersection.
    float fHeightDelta1 = fPrevBound - fPrevHeight;
    float fHeightDelta2 = fCurrentHeight - fCurrentBound;
    float fIntersectionStepFraction = fHeightDelta2 / ( fHeightDelta1 + fHeightDelta2 );
    iTex += fIntersectionStepFraction * fStepDisplace;

    // Self-shadowing samples toward the light.
    // The trailing factor on each of the sh* lines below is cut off in the post.
    float2 LightTS = iLightTS.xy * HEIGHT_SCALING;
    float sh0 = tex2Dgrad( g_samNMH, iTex, ddxSmall, ddySmall ).a;
    float shA = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.88f, ddxSmall, ddySmall ).a - sh0 - 0.88 ) *
    float sh9 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.77f, ddxSmall, ddySmall ).a - sh0 - 0.77 ) *
    float sh8 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.66f, ddxSmall, ddySmall ).a - sh0 - 0.66 ) *
    float sh7 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.55f, ddxSmall, ddySmall ).a - sh0 - 0.55 ) *
    float sh6 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.44f, ddxSmall, ddySmall ).a - sh0 - 0.44 ) *
    float sh5 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.33f, ddxSmall, ddySmall ).a - sh0 - 0.33 ) *
    float sh4 = ( tex2Dgrad( g_samNMH, iTex + LightTS * 0.22f, ddxSmall, ddySmall ).a - sh0 - 0.22 ) *

    fOcclusionShadow = 1 - max( max( max( max( max( max( shA, sh9 ), sh8 ), sh7 ), sh6 ), sh5 ), sh4 );

    fOcclusionShadow = fOcclusionShadow * 0.3 + 0.4;

    //fOcclusionShadow = 1;
    return GetIlluminance( iTex, iLightTS, iViewTS, fOcclusionShadow );
}

The tex2D call inside the while loop is the line that triggers the error.

Thank you, Ghosrath.

I get it.

GPUs shade a 'quad' of 4 pixels at a time and the gradients used for texture fetches come from the finite differences calculated from adjacent pairs of pixels. This is how the GPU is able to generate partial derivatives even for arbitrary expressions in the pixel shader. The tex2Dgrad function can be useful if you can calculate more accurate analytical derivatives for the values you are passing in as texture coordinates.
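
As a rough illustration of how those finite differences relate to mip selection, the sketch below estimates a mip level from ddx/ddy before a loop and then reuses it with tex2Dlod inside the loop. It uses a common approximation (the larger squared footprint along x or y), which is not exactly the formula in the shader posted above; ComputeMipLevel, g_samTex, g_vTexDim, g_vStep and ps_LodExample are hypothetical names, and a ps_3_0 target is assumed.

```hlsl
// Hypothetical names for illustration only: ComputeMipLevel, g_samTex,
// g_vTexDim, g_vStep, ps_LodExample. Assumes a ps_3_0 target.
sampler2D g_samTex;
float2 g_vTexDim;   // assumed texture size in texels
float2 g_vStep;     // assumed UV offset per loop iteration

// Must be called outside dynamic flow control, because ddx/ddy rely on
// the neighbouring pixels of the quad running the same instructions.
float ComputeMipLevel( float2 uv, float2 texDim )
{
    float2 texel = uv * texDim;     // coordinates in texel units
    float2 dx = ddx( texel );       // difference to the horizontal neighbour
    float2 dy = ddy( texel );       // difference to the vertical neighbour
    float maxDeltaSq = max( dot( dx, dx ), dot( dy, dy ) );
    return max( 0, 0.5f * log2( maxDeltaSq ) );
}

float4 ps_LodExample( float2 uv : TEXCOORD0 ) : COLOR
{
    float lod = ComputeMipLevel( uv, g_vTexDim );

    float4 sum = 0;
    for( int i = 0; i < 4; i++ )
    {
        // An explicit mip level means no gradients are needed here.
        sum += tex2Dlod( g_samTex, float4( uv + i * g_vStep, 0, lod ) );
    }
    return sum * 0.25f;
}
```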
