GLSL bad performance?

7 comments, last by Lord Faron 17 years, 9 months ago
In my engine I draw cel-shaded models using a GLSL shader. I'm still a beginner in GLSL, so I don't know whether my code could be better optimized. My engine has a fallback path for when the GL_ARB_vertex_shader extension is not available, which calculates the shading and texture coordinates for each vertex on the CPU. Now the problem: I created two small demos with the engine, one using the vertex shader and the other using the software path. Our fans tested the demos on various PCs and recorded the fps of each. The fallback path is almost always faster, by about 2 to 10 times! Only in a very few cases is GLSL faster, and even then only by 3-4 fps! Below are both the GLSL vertex shader code and the Delphi code. Both produce exactly the same output, but the Delphi code averages 250 fps, and the GLSL only 20-30!
uniform vec3 light;

void main(){
    vec3 shade;
    vec3 normal = normalize(gl_NormalMatrix * gl_Normal);
    shade.s = dot(normal,light);
    if (shade.s<0.0) shade.s=0.0;
    gl_TexCoord[0].s = shade.s;
    gl_TexCoord[1] = gl_TextureMatrix[0] * gl_MultiTexCoord0;
    gl_Position = ftransform();
    gl_FrontColor = gl_Color;
    vec4 ecPosition = gl_ModelViewMatrix * gl_Vertex;
    gl_FogFragCoord = abs(ecPosition.z);
}



          If VertexShaderAvaliable Then
          Begin
            glUseProgram(_Program);
            With Renderer.Light Do
              glUniform3f(_ShLight,X,Y,Z);
            Exit;
          End;

          While VertexCount>0 Do
          Begin
            _Normal:=Source.Normal;
            _Vector:=VectorRotate(_Normal,ViewMatrix);
            _Shade:=VectorDot(_Vector, Renderer.Light);	

            If _Shade<0 Then _Shade:=0;

            TexCoord^:=Source.TexCoord;
            Dest.Position:=Source.Position;
            Dest.TexCoord.U:=1.0-_Shade;
            Dest.Normal:=Source.Normal;

            Inc(Source);
            Inc(Dest);
            Inc(TexCoord);
            Dec(VertexCount);
          End;



Extra question: Is it wise to clear the depth/stencil buffers many times per frame? My engine uses various stencil effects for shadows, mirrors and more, and I normally clear the stencil buffer 3 or 4 times per frame. I also clear the depth buffer at the start of each frame and again right before drawing the 2D GUI.
Now, I'm definitely not a pro at this, but I would try replacing the line
"if (shade.s<0.0) shade.s=0.0;" in your shader with
"shade.s = max(shade.s, 0.0);"

If I remember correctly, the if instruction is pretty slow on older graphics cards, but I'm not 100% sure. It might be worth a shot...

Good luck!
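For anyone who wants to check that the two forms of the clamp give the same result, here is a minimal sketch in C rather than GLSL, so it can be run on the CPU. The function names are made up for illustration, and the hand-written select in clamp_max only stands in for GLSL's built-in max(); it says nothing about how a particular driver compiles either form.

```c
/* Clamp written with an explicit branch, as in the original shader line. */
static float clamp_branch(float shade) {
    if (shade < 0.0f) shade = 0.0f;
    return shade;
}

/* Clamp written as a select, standing in for GLSL's max(shade, 0.0). */
static float clamp_max(float shade) {
    return (shade > 0.0f) ? shade : 0.0f;
}

/* Hypothetical self-check: returns 1 if both forms agree on all samples. */
static int clamps_agree(void) {
    const float samples[] = { -1.0f, -0.25f, 0.0f, 0.5f, 1.0f };
    for (int i = 0; i < 5; ++i) {
        if (clamp_branch(samples[i]) != clamp_max(samples[i])) return 0;
    }
    return 1;
}
```

Since the outputs match, swapping one form for the other should not change the rendered result, only (possibly) the generated code.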
You can optimize the shader code a bit. Try:

uniform vec3 light;

void main(){
    vec3 shade;
    // normalization not required - gl_NormalMatrix should not contain any scaling
    vec3 normal = gl_NormalMatrix * gl_Normal;
    // avoid dynamic flow control
    shade.s = max(dot(normal,light),0.0);
    gl_TexCoord[0].s = shade.s;
    gl_TexCoord[1] = gl_TextureMatrix[0] * gl_MultiTexCoord0;
    gl_Position = ftransform();
    gl_FrontColor = gl_Color;
    vec4 ecPosition = gl_ModelViewMatrix * gl_Vertex;
    gl_FogFragCoord = abs(ecPosition.z);
}


That should give some increase in speed, but if you have a slow graphics card (like a Radeon X300 or anything from the GeForce FX family) it will always be slow.
if (time() == $) { $ = 0; }
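Dropping normalize() is only safe when the normal matrix really contains no scaling, as the comment in the shader assumes. Here is a rough C sketch of why. The two matrices are hypothetical stand-ins for gl_NormalMatrix (a pure rotation, and the same rotation scaled by 0.5), and squared lengths are compared to avoid needing a square root.

```c
/* Row-major 3x3 matrix times vector; stands in for gl_NormalMatrix * gl_Normal. */
static void mat3_mul_vec3(const float m[9], const float v[3], float out[3]) {
    for (int r = 0; r < 3; ++r) {
        out[r] = m[3 * r] * v[0] + m[3 * r + 1] * v[1] + m[3 * r + 2] * v[2];
    }
}

/* Squared length of the unit normal (1, 0, 0) after transformation by m. */
static float transformed_length_squared(const float m[9]) {
    const float normal[3] = { 1.0f, 0.0f, 0.0f };
    float out[3];
    mat3_mul_vec3(m, normal, out);
    return out[0] * out[0] + out[1] * out[1] + out[2] * out[2];
}

/* Hypothetical normal matrix: pure rotation about Z (cos = 0.6, sin = 0.8). */
static const float ROTATION_ONLY[9] = {
    0.6f, -0.8f, 0.0f,
    0.8f,  0.6f, 0.0f,
    0.0f,  0.0f, 1.0f
};

/* Hypothetical normal matrix: the same rotation with a uniform 0.5 scale. */
static const float ROTATION_HALF_SCALE[9] = {
    0.3f, -0.4f, 0.0f,
    0.4f,  0.3f, 0.0f,
    0.0f,  0.0f, 0.5f
};
```

With the rotation-only matrix the transformed normal keeps a squared length of about 1, so the dot product with the light is unchanged. With the scaled matrix the squared length drops to about 0.25, so the shade value would come out at half strength unless the normal is normalized again.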
Thank you both. I see, dynamic branching generates inefficient code on most video cards. I can't test it myself, since my video card fried and I'm forced to use software mode for now, but I have already recompiled the test demos and released them on our game forum.

If anyone wants to test it, here's the link; please tell me how many fps you get in both versions.
Demo download
The server asks for a password. I can't download it.
Strange, none of my users had problems with that. Maybe my host blocks foreign IP addresses or something.
I just uploaded it to Megaupload/Rapidshare.
Megaupload
Rapidshare
On my Radeon 9600 your demo runs at 7 to 560 FPS with GLSL, and at about 310 to 650 FPS without GLSL.
Hmmm, so the software version still runs faster? I don't understand how that can be, since I'm copying the vertices into a buffer and transforming every vertex on the CPU each frame, while with GLSL all the work is done on the GPU.

The only thing I can think of is that I'm only using a GLSL vertex shader and not a fragment shader, letting the fixed-function pipeline handle all the per-pixel work. Can this have any impact on performance?
Could be... I use GLSL all the time. Here is a screenshot from my engine: 3 lights (red, green, blue) + about 5000 vertices + dynamic soft shadows + bump mapping at 17-20 FPS.

[Screenshot hosted at www.ImageShack.us]

