DirectX texture fetch optimization

3 comments, last by matthew007101 13 years, 6 months ago
I have a pixel shader with a loop that runs a fixed number of times for every pixel. Inside this loop I perform two consecutive texture sampling operations.

These two consecutive texture sampling operations cause the performance to drop to about 10 fps for 1024 iterations of the for loop.

If I comment out either of these two texture fetches, the frame rate goes to 650 fps!

Is there a way to improve performance?

I pasted a code snippet below from my pixel shader to give you a clearer idea:

    [loop]
    for (uint i = 0; i < N_THETA * N_PHI; ++i)
    {
        const float3 vEnvSample = g_SampleBuffer.Load(i).rgb;

        float3 vEnvSampleLocal, vEnvSampleWorld;
        TransformUniformSample(vEnvSample, TBNInv_G2L, TBN_L2G, vEnvSampleLocal, vEnvSampleWorld, vNormalWorld);

        float3 envmap = g_EnvMapTexCube.SampleLevel(PointSamplerClamp, vEnvSampleWorld, 0).rgb; // Texture sampling operation 1

        float theta_half, phi_half, theta_diff, phi_diff;
        convert_to_half_diff_coords(vCameraLocal, vEnvSampleLocal, theta_half, phi_half, theta_diff, phi_diff);

        int3 fIndex;
        getMERL_textureIndex(theta_half, theta_diff, phi_diff, fIndex);

        float3 brdf = g_MERL_BRDFTex3D.Load(int4(fIndex.xyz, 0)).rgb; // Texture sampling operation 2

        output.rgb += brdf * envmap * dot(vEnvSampleWorld, vNormalWorld);
    }
My first suggestion would be to compile all three versions of the shader with fxc.exe, and look at the disassembly for each. You'll probably find it's hard to take out one texture read without affecting the rest of the code that gets generated for that loop.

Also note that reading one texture based on another one can be significantly slower than two independent texture reads.
I don't think you're ever going to get good performance with 2048 texture samples in a pixel shader.
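As a rough back-of-envelope check on that figure, the sketch below multiplies out the fetch count. The 1024 iterations and the two fetches per iteration come from the question; the 1280x720 render-target size is purely an assumption for illustration, since the thread never states the resolution, and the function names are hypothetical.

```python
# Back-of-envelope count of texture fetches for the loop in the question.
ITERATIONS = 1024          # loop trip count quoted in the question
FETCHES_PER_ITERATION = 2  # cube-map SampleLevel + 3D-texture Load
# The g_SampleBuffer.Load in the loop is a buffer read and is not counted here.

# ASSUMPTION: the thread does not give the render-target size;
# 1280x720 is a hypothetical value used only to get an order of magnitude.
WIDTH, HEIGHT = 1280, 720


def fetches_per_pixel(iterations=ITERATIONS, per_iter=FETCHES_PER_ITERATION):
    """Texture fetches issued by one pixel-shader invocation."""
    return iterations * per_iter


def fetches_per_frame(width=WIDTH, height=HEIGHT, shaded_fraction=1.0):
    """Fetches per frame if `shaded_fraction` of the pixels run the loop."""
    return int(width * height * shaded_fraction) * fetches_per_pixel()


if __name__ == "__main__":
    print("fetches per pixel:", fetches_per_pixel())
    print("fetches per frame:", fetches_per_frame())
```

Under that assumed resolution, every frame issues on the order of billions of fetches, which is why the loop count matters far more than the size of the textures being read.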
Quote:Original post by MJP
I don't think you're ever going to get good performance with 2048 texture samples in a pixel shader.
QFT

Commenting out one read might leave the other texture fully in the cache which would explain the vast difference between your 1x and 2x sample versions.

tbh I'm amazed you're seeing 650 fps with just the one of them. Nice job! :)

------------------------------Great Little War Game
I render my geometry, a simple geo-sphere, only once, and the rest is deferred shading, so it's fast!

More interesting is the fact that when I set my texture sizes to a very low value, such as 16x16, and set the bit depth to the lowest possible, the frame rate is still around 10 fps.

I even replaced the 3D and cube textures with two simple 2D textures, but nothing changed. :(
