These two consecutive texture sampling operations cause the frame rate to drop to about 10 fps when the for loop runs 1024 iterations.
If I comment out either of these two texture fetches, the frame rate goes up to 650 fps!
Is there a way to improve performance?
I pasted a code snippet below from my pixel shader to give you a clearer idea:
[loop]
for (uint i = 0; i < N_THETA * N_PHI; ++i)
{
    const float3 vEnvSample = g_SampleBuffer.Load(i).rgb;
    float3 vEnvSampleLocal, vEnvSampleWorld;
    TransformUniformSample(vEnvSample, TBNInv_G2L, TBN_L2G, vEnvSampleLocal, vEnvSampleWorld, vNormalWorld);
    float3 envmap = g_EnvMapTexCube.SampleLevel(PointSamplerClamp, vEnvSampleWorld, 0).rgb; // Texture sampling operation 1
    float theta_half, phi_half, theta_diff, phi_diff;
    convert_to_half_diff_coords(vCameraLocal, vEnvSampleLocal, theta_half, phi_half, theta_diff, phi_diff);
    int3 fIndex;
    getMERL_textureIndex(theta_half, theta_diff, phi_diff, fIndex);
    float3 brdf = g_MERL_BRDFTex3D.Load(int4(fIndex.xyz, 0)).rgb; // Texture sampling operation 2
    output.rgb += brdf * envmap * dot(vEnvSampleWorld, vNormalWorld);
}