Hello, I have written a shader that picks a texture from an atlas and applies either repeat or mirrored repeat, nothing else. The texture index is passed in as an attribute. A const array of 4 × (number of textures) floats holds the start xy and size xy of each texture. Technically it works fine, and on an Nvidia 750 it is also fast.
On an Nvidia 820 (the weaker card), however, the slowdown is so brutal that I got a TDR! :D So I tried a reduced version of the shader with just a few textures. An array of 128 floats ran at, let's say, normal speed. I then added the next 4 parts of the array as separate arrays (I have 128 textures, so it was 512 entries originally). Speed was still OK. But as soon as I actually used them and switched between them (first array for textures 0-31, second for 32-63, and so on), it was SLOW again! So my guess is that the compiler had simply optimized out the unused arrays before.
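A minimal sketch of that split-array variant, to make the description concrete. It is not the exact code I ran: the names `tex_data_0`, `tex_data_1`, `PER_CHUNK` and `atlas_rect` are made up for illustration, the values are placeholders, and each chunk holds 2 textures instead of 32 to keep it short. It assumes GLSL 1.20 or later for the array constructors.

```glsl
// Placeholder data: 4 floats per texture (start x, start y, size x, size y).
// Values are illustrative only, not the real atlas layout.
const float tex_data_0[8] = float[8](
    1., 1., 256., 256.,
    257., 1., 256., 256.
);
const float tex_data_1[8] = float[8](
    1., 257., 256., 256.,
    257., 257., 256., 256.
);

// Hypothetical helper: returns (start.x, start.y, size.x, size.y)
// for a texture index by first choosing the chunk, then indexing into it.
vec4 atlas_rect(int texnum)
{
    const int PER_CHUNK = 2;                      // 32 in the real shader
    int chunk = texnum / PER_CHUNK;               // which array to read
    int base  = (texnum - chunk * PER_CHUNK) * 4; // offset inside that array

    if (chunk == 0)
    {
        return vec4(tex_data_0[base],     tex_data_0[base + 1],
                    tex_data_0[base + 2], tex_data_0[base + 3]);
    }
    return vec4(tex_data_1[base],     tex_data_1[base + 1],
                tex_data_1[base + 2], tex_data_1[base + 3]);
}
```

As long as only one of the arrays is ever read, the compiler is free to drop the others, which would match the speed I saw before I started switching between them.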
My second guess is that the Nvidia 750 allocates const variables in shaders only once, while the Nvidia 820 does it on every run. But this is just a guess. Also, I know that ifs in shaders are slow, but surely not that slow!
The code of the original shader, which works on the 750 but is slow on the 820:
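attribute float param;        // texture index within the atlas
attribute vec4 barva;         // "barva" = color
varying vec4 myUniform;       // atlas rectangle: xy = start, zw = size
uniform sampler2D texture;    // the atlas texture
varying vec2 uv_coords;
varying vec3 vertex_light_position;
// 4 floats per texture: start x, start y, size x, size y
const float tex_data[508] = float[508](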
1.,1.,256.,256.,
...
0.,0.,1.,1.
);
void main()
{
    float nasobek = 2.0;        // "nasobek" = multiplier
    float druhy_nasobek = 1.0;  // "druhy nasobek" = second multiplier
    float prevrat = -1.0;       // "prevrat" = flip (unused)
    float odsunuti = 2.0;       // "odsunuti" = offset added to the start
    float odsunuti2 = 4.0;      // second offset, subtracted from the size

    // Explicit cast: GLSL has no implicit float-to-int conversion.
    int texnum = int(round(param));
    int num1 = (texnum*4);
    int num2 = (texnum*4)+1;
    int num3 = (texnum*4)+2;
    int num4 = (texnum*4)+3;

    // Look up this texture's rectangle and scale it by 1/4096.
    myUniform.x = (tex_data[num1]+odsunuti)/4096.;
    myUniform.y = (tex_data[num2]+odsunuti)/4096.;
    myUniform.z = (tex_data[num3]-odsunuti2)/4096.;
    myUniform.w = (tex_data[num4]-odsunuti2)/4096.;

    uv_coords.xy = fract(uv_coords.xy);

    // Textures below index 67 use mirrored repeat.
    if(texnum<67)
    {
        if(mod(uv_coords.x,1.0)>0.5)
        {
            uv_coords.x = druhy_nasobek*(1.0-uv_coords.x);
        }
        if(mod(uv_coords.y,1.0)>0.5)
        {
            uv_coords.y = druhy_nasobek*(1.0-uv_coords.y);
        }
    }

    // Map the repeated coordinates into the texture's rectangle in the atlas.
    uv_coords.xy *= (myUniform.zw)*nasobek;
    uv_coords.xy = mod(uv_coords.xy, myUniform.zw);
    uv_coords.xy += myUniform.xy;

    float diffuse_value = 20.0*(gl_DepthRange.far-gl_FragCoord.z);
    vec4 color = texture2D(texture, uv_coords);
    gl_FragColor = color*barva;//* diffuse_value;
}
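I don't know whether the branches matter here, but to rule them out, the two inner `if` blocks in the shader above could be written without branching. This is only a sketch: `mirror_half` is a hypothetical helper name, and it assumes the input has already gone through `fract`, so `mod(uv, 1.0)` equals `uv` and `druhy_nasobek` stays at 1.0.

```glsl
// Branch-free equivalent of the two inner mirror checks.
// Expects uv in [0, 1), i.e. after fract().
vec2 mirror_half(vec2 uv)
{
    // step(0.5, uv) is 1.0 where uv >= 0.5 and 0.0 elsewhere, per component.
    vec2 flip = step(vec2(0.5), uv);
    // Where flip is 1.0 take (1.0 - uv), otherwise keep uv unchanged.
    return mix(uv, 1.0 - uv, flip);
}
```

At exactly 0.5 both versions give 0.5, so the result should match the `> 0.5` comparison. The outer `texnum<67` check would still wrap the call.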
Specs of both GPUs:
https://www.notebookcheck.net/NVIDIA-GeForce-820M.108477.0.html
https://www.notebookcheck.net/NVIDIA-GeForce-GT-750M.90245.0.html
Which spec does this have something to do with? I guess there must be a better way to load those constants. Please help.