Why is it sOooOOoo slow?

Started by
6 comments, last by mrbig 18 years, 3 months ago
I was trying to write a very simple box blur shader, but it turned out to be awfully slow (3 FPS max), and I really can't imagine why... I just can't think of a simpler or faster way to create a blur effect. Here's the code:

uniform sampler2D texture;
uniform float width, height;

void main()
{
    // Size of one texel in texture coordinates. These depend on uniforms,
    // so they cannot be declared as global const values.
    float step_w = 1.0 / width;
    float step_h = 1.0 / height;

    vec4 sum = vec4(0.0, 0.0, 0.0, 0.0);
    float radius = 4.0, x, y;
    float num = 0.0, cx, cy;

    // Offsets run from -radius to radius - 1, i.e. 8 x 8 = 64 samples.
    for (y = -radius; y < radius; y++)
        for (x = -radius; x < radius; x++)
        {
            cx = x * step_w + gl_TexCoord[0].s;
            cy = y * step_h + gl_TexCoord[0].t;

            // Only average samples that fall inside the texture.
            if (cx >= 0.0 && cy >= 0.0 && cx <= 1.0 && cy <= 1.0)
            {
                sum += texture2D(texture, vec2(cx, cy));
                num += 1.0;
            }
        }

    gl_FragColor = sum / num;
}

Here's what should be an even faster implementation, but this one doesn't seem to work at all!

uniform sampler2D texture;
uniform float step_w, step_h;   // expected to hold 1.0 / width and 1.0 / height

void main()
{
    vec4 sum = vec4(0.0, 0.0, 0.0, 0.0);
    float radius = 4.0, x, y, minx, miny, maxx, maxy;
    float num = 0.0, cx, cy;

    // Extent of the blur window, already scaled to texture coordinates.
    minx = -radius * step_w;
    miny = -radius * step_h;

    maxx = -minx;
    maxy = -miny;

    // Walk the window one texel at a time.
    for (y = miny; y < maxy; y += step_h)
        for (x = minx; x < maxx; x += step_w)
        {
            cx = x + gl_TexCoord[0].s;
            cy = y + gl_TexCoord[0].t;

            // Only average samples that fall inside the texture.
            if (cx >= 0.0 && cy >= 0.0 && cx <= 1.0 && cy <= 1.0)
            {
                sum += texture2D(texture, vec2(cx, cy));
                num += 1.0;
            }
        }

    gl_FragColor = sum / num;
}

Why is the first shader so slow, and why doesn't the second work at all? O_O
Are the textures you are using power-of-two sized?
Yes.
It's a small 128x256 texture.
The video card I'm using is a GeForce FX 5700VE.
I'm not surprised it's slow: you are doing 64 texture reads per pixel.

Don't test against the borders of the image; use a clamp mode instead
(it won't quite look the same but should be good enough and much faster)
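
A minimal sketch of what that suggestion could look like, not a tested drop-in. It assumes the application has set the texture's wrap mode to a clamp mode (for example GL_CLAMP_TO_EDGE) for both S and T, and that step_w and step_h are uniforms holding 1.0 / width and 1.0 / height. The integer loop counters and the constant divisor are choices made for this sketch, not something taken from the original shader.

```glsl
uniform sampler2D texture;
uniform float step_w, step_h;   // assumed: 1.0 / width and 1.0 / height, set by the application

void main()
{
    vec4 sum = vec4(0.0);

    // Same 8 x 8 footprint as the original shader.
    for (int j = -4; j < 4; j++)
        for (int i = -4; i < 4; i++)
        {
            // No range check: with a clamp wrap mode, coordinates outside
            // [0, 1] return the nearest edge texel instead.
            vec2 offset = vec2(float(i) * step_w, float(j) * step_h);
            sum += texture2D(texture, gl_TexCoord[0].st + offset);
        }

    // Every sample now contributes, so the count is always 64.
    gl_FragColor = sum / 64.0;
}
```

Near the borders the edge texels are counted several times instead of being skipped, which is why the result won't quite match the version with the test.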

If you are doing this fullscreen you can split your blur into multiple passes and save texture reads. One way is to use render-to-texture and blur horizontally first to a second buffer, then blur the second buffer vertically.
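
A rough sketch of one pass of such a two-pass blur, under stated assumptions: the uniform name direction is hypothetical, and the application is assumed to render once with direction = (1.0 / width, 0.0) into a second buffer, then once more with direction = (0.0, 1.0 / height) reading from that buffer. The render-to-texture setup itself is not shown.

```glsl
uniform sampler2D texture;
uniform vec2 direction;   // hypothetical: one texel step along the axis being blurred

void main()
{
    vec4 sum = vec4(0.0);

    // 8 samples along a single axis instead of an 8 x 8 window.
    for (int i = -4; i < 4; i++)
        sum += texture2D(texture, gl_TexCoord[0].st + float(i) * direction);

    gl_FragColor = sum / 8.0;
}
```

With this layout each pixel costs 8 + 8 = 16 texture reads across the two passes rather than 64. Because a box blur is separable, the two passes should produce the same average as the single 8 x 8 pass, apart from the precision of the intermediate buffer and behavior at the borders.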

"Don't test against the borders of the image; use a clamp mode instead"
I removed that 'if' statement from there and just left it the way it is without it.
Now it's faster and looks exactly the same. Go figure...
"you can split your blur into multiple passes"
Yes, I've already thought about that, as well as a couple more speedups.
I'll see if I can get it to work properly now...
I've heard that 'if' statements are not supported in hardware (or something similar) in shaders... could that be why it's slow?
"Game Maker For Life, probably never professional thou." =)
That depends on the hardware in use. On any SM3.0-capable hardware it should be fine; on PS2.0 hardware, however, it's either thrown back into software, or tricks are done to get the right values (or it simply won't compile).
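
For illustration, here is one way the border test can be written without an 'if', using the built-in step() function as a 0/1 mask. This is only a sketch: step_w and step_h are assumed to be uniforms holding 1.0 / width and 1.0 / height, the integer loop counters are a choice made here, and whether this is any faster than a real branch depends on the hardware and the compiler.

```glsl
uniform sampler2D texture;
uniform float step_w, step_h;   // assumed: 1.0 / width and 1.0 / height, set by the application

void main()
{
    vec4 sum = vec4(0.0);
    float num = 0.0;

    for (int j = -4; j < 4; j++)
        for (int i = -4; i < 4; i++)
        {
            vec2 c = gl_TexCoord[0].st + vec2(float(i) * step_w, float(j) * step_h);

            // step(edge, v) is 1.0 when v >= edge and 0.0 otherwise, so each
            // component of in_range is 1.0 only when that coordinate is in [0, 1].
            vec2 in_range = step(vec2(0.0), c) * step(c, vec2(1.0));
            float inside = in_range.x * in_range.y;

            // Out-of-range samples are still read, but weighted by zero.
            sum += inside * texture2D(texture, c);
            num += inside;
        }

    gl_FragColor = sum / num;
}
```

Note that this still performs all 64 texture reads; it only removes the conditional.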
Forget about the 'if'.
I removed it and it works just as well.

