LargeJ

Frame rate drops to 1 when adding spotlights

LargeJ    126
Hello all,
I am working on a lighting shader in GLSL that computes spot, directional and point lights in a single pass. The shader is not very efficient, but it still runs at around 500 FPS, so I can see the results. However, when I added spotlights the frame rate dropped to 0-1 FPS: I see a white screen, then a correctly rendered scene, then a white screen again, and so on. It feels as if it's stuck in an infinite loop or something.

In my scene I have a plane and a teapot. If I render 3 lights (a spot, a directional and a point light) on the plane, everything runs smoothly. When I compute the same lighting on the teapot, I get this problem. It also occurs when I render only the teapot (and not the plane). However, when I render 3 lights on the plane and 2 lights on the teapot (skipping the spot light there), the FPS is reasonable. There are also no problems when I compute only the spot light on the teapot.

I decide which lighting to compute with an 'if' statement in the fragment shader, like this:

[code]struct Light
{
    //position of light (if w == 0.0, then it's a direction for the directional light)
    vec4 Position;
    vec3 Color;
    vec3 Attenuation;

    //spotlight part.
    //if spot cutoff equals -1 (equals cos(180)), then it's a point light, otherwise it's a spotlight
    float SpotCutoff;
    float SpotExponent;
    vec3 SpotDirection;
};

void main ()
{
    vec3 surfaceNormal = normalize(Normal);

    vec3 ColorOut = vec3(0.0f, 0.0f, 0.0f);

    for( int i=0; i<LightCount; ++i ){
        //Determine kind of light
        if( LightSources[i].Position.w == 0.0f )
            ColorOut += DirectionalLight(i, surfaceNormal, LightVectors[i], normalize(HalfVectors[i]));
        else{
            float lightDistance = length(LightVectors[i]);
            vec3 lightDirection = LightVectors[i] / lightDistance;

            if( LightSources[i].SpotCutoff == -1.0f )
                ColorOut += PointLight( i, surfaceNormal, lightDirection, normalize(HalfVectors[i]), lightDistance );
            else
                ColorOut += SpotLight(i, surfaceNormal, lightDirection, normalize(HalfVectors[i]), normalize( SpotDirections[i] ), lightDistance);
        }
    }

    gl_FragColor = vec4(ColorOut + 0.05 * AmbientColor + EmissionColor, 0.0f);
}[/code]

While searching this forum I found a possible explanation: that the video card does not support branching at the fragment level. However, branching works for the point and directional lights, so I assume my video card does support it. I'm using GLSL version 1.3 on an Nvidia GeForce 9800M GS.

Does anyone know what the cause of this problem might be?

mhagain    13430
This kind of performance drop is normally a quite reliable indication that something in the per-fragment pipeline is dropping back to software emulation; you might be overflowing the maximum instruction count of your hardware and causing it.

LargeJ    126
Thanks, I guess you are right. I changed the #define MAX_LIGHTS to 3 and it runs smoothly again.
So if I want to add more lights (say 10), is there no other way than using multiple render passes, assuming that I don't precompute the lighting values?

mhagain    13430
There seems to be room for simplification in your shader. For example, you're normalizing HalfVectors for all light types, so why not just normalize once on the CPU and send the normalized version as a uniform? That would perform better in the general case too. Likewise the surface normal, and it seems as though lightDistance and lightDirection could also become uniforms. That might help you get the instruction count down.

L. Spiro    25638
You have a lot of if/else. This is a very bad idea.


Organize your lights so that the first X are directional lights, the next Y are spot lights, and the last Z are point lights.
Then add one for loop for each type of light, using an index counter that is shared between the loops.
This way each loop knows what type of light it is getting and there is no need for if/else “branching”.
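
A minimal sketch of the sorted layout described above, not anyone's actual shader. It assumes the application uploads the lights already sorted as directional, then spot, then point, and supplies three hypothetical count uniforms (`DirectionalCount`, `SpotCount`, `PointCount`). The shared index is expressed as precomputed start offsets. The `Diffuse`, `Attenuate` and `SpotFactor` helpers are simplified placeholders, the constant/linear/quadratic reading of `Attenuation` is an assumption, and light vectors are assumed to point from the surface to the light.

```glsl
#version 130
#define MAX_LIGHTS 3

struct Light
{
    vec4 Position;
    vec3 Color;
    vec3 Attenuation;   // assumed: x = constant, y = linear, z = quadratic
    float SpotCutoff;   // cosine of the cone half-angle
    float SpotExponent;
    vec3 SpotDirection;
};

uniform Light LightSources[MAX_LIGHTS];

// Hypothetical uniforms: how many lights of each type, in sorted order.
// Their sum is assumed not to exceed MAX_LIGHTS.
uniform int DirectionalCount;
uniform int SpotCount;
uniform int PointCount;

in vec3 Normal;
in vec3 LightVectors[MAX_LIGHTS];    // assumed: surface -> light
in vec3 SpotDirections[MAX_LIGHTS];  // assumed: direction the spot points in

// Placeholder Lambert term standing in for the real lighting functions.
vec3 Diffuse(int i, vec3 n, vec3 l)
{
    return LightSources[i].Color * max(dot(n, l), 0.0);
}

// Placeholder distance attenuation (assumes a non-zero denominator).
float Attenuate(int i, float d)
{
    vec3 a = LightSources[i].Attenuation;
    return 1.0 / (a.x + a.y * d + a.z * d * d);
}

// Cone falloff without an if: step() zeroes fragments outside the cone.
float SpotFactor(int i, vec3 l, vec3 spotDir)
{
    float cosAngle = dot(-l, spotDir);
    float inside = step(LightSources[i].SpotCutoff, cosAngle);
    return inside * pow(max(cosAngle, 0.0001), LightSources[i].SpotExponent);
}

void main()
{
    vec3 n = normalize(Normal);
    vec3 color = vec3(0.0);

    int spotStart  = DirectionalCount;
    int pointStart = spotStart + SpotCount;
    int lightEnd   = pointStart + PointCount;

    // Directional lights: no distance, no cone.
    for (int i = 0; i < spotStart; ++i)
        color += Diffuse(i, n, normalize(LightVectors[i]));

    // Spot lights: attenuation plus cone falloff.
    for (int i = spotStart; i < pointStart; ++i) {
        float d = length(LightVectors[i]);
        vec3 l = LightVectors[i] / d;
        color += Diffuse(i, n, l) * Attenuate(i, d)
               * SpotFactor(i, l, normalize(SpotDirections[i]));
    }

    // Point lights: attenuation only.
    for (int i = pointStart; i < lightEnd; ++i) {
        float d = length(LightVectors[i]);
        color += Diffuse(i, n, LightVectors[i] / d) * Attenuate(i, d);
    }

    gl_FragColor = vec4(color, 1.0);
}
```

Whether this beats the branching version depends on the hardware and driver, so it is worth measuring both.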


Regarding some of mhagain's suggestions, I would actually not recommend computing the half vectors once on the CPU and sending them.
You would need one per directional light, so you would have to store them in an array. Accessing the array is actually slower than recalculating it, so it turns out that, unless you hardcode one directional light and use only one half vector to avoid array access, it is actually faster to recalculate it.
Secondly, by sending one half vector to every directional light (note that for any other light it [b]must[/b] be calculated every time), you are assuming the viewer to be infinitely far away from the pixels being lit, which results in much less impressive lighting.
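
For illustration, recomputing the half vector per fragment could look roughly like the sketch below. The names `EyeVector`, `LightColors` and `Shininess` are hypothetical and do not come from the shader posted above. Vectors are assumed to point away from the surface, and `Shininess` is assumed to be greater than zero.

```glsl
#version 130
#define MAX_LIGHTS 3

in vec3 Normal;
in vec3 EyeVector;                 // hypothetical: surface -> eye
in vec3 LightVectors[MAX_LIGHTS];  // assumed: surface -> light

uniform int LightCount;
uniform vec3 LightColors[MAX_LIGHTS];  // hypothetical per-light color
uniform float Shininess;               // hypothetical, assumed > 0

void main()
{
    vec3 n = normalize(Normal);
    vec3 v = normalize(EyeVector);
    vec3 color = vec3(0.0);

    for (int i = 0; i < LightCount; ++i) {
        vec3 l = normalize(LightVectors[i]);
        // Half vector rebuilt from the per-fragment view and light directions,
        // so the viewer is not treated as infinitely far away.
        vec3 h = normalize(l + v);
        color += LightColors[i] * pow(max(dot(n, h), 0.0), Shininess);
    }

    gl_FragColor = vec4(color, 1.0);
}
```

This trades one interpolated array read per light for an add and a normalize, which may or may not be a win on a given GPU.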


L. Spiro

LargeJ    126
Thanks for the advice. Sorting the lights sounds like an obvious way to make it faster. I should have thought of that myself :)

[quote name='YogurtEmperor' timestamp='1313466891' post='4849685']
Accessing the array is actually slower than recalculating it.
[/quote]

Right now I perform some calculations in the vertex shader and store the results in arrays, such as half vectors, eye vectors and so on. So what you are saying is that, because array access is slower, I would be better off doing all "simple vector" computations in the fragment shader?

mhagain    13430
I'd counterargue that "it depends". ;)

Array access [i]might[/i] be slower than just doing the calculation directly (depending on the complexity of the calculation and how well your GPU does it), but it will also get your fragment shader instruction count down and move some calculations from per-fragment to per-vertex. That's a saving which - on balance - might work out either faster or slower overall (if it stops you from dropping to software emulation it's always going to be faster!), and that's something you're going to need to benchmark.

That's the weird thing about certain kinds of optimization - sometimes you need to take an acceptable performance hit in one place in exchange for a performance gain in another. If the gain outweighs the hit then you've a net gain, so all is well. Like I said, benchmark.

L. Spiro    25638
[quote name='LargeJ' timestamp='1313519614' post='4849964']Right now I perform some calculations in the vertex shader and store the results in arrays, such as half vectors, eye vectors and so on. So what you are saying is that, because array access is slower, I would be better off doing all "simple vector" computations in the fragment shader?
[/quote]
You can’t really avoid array access entirely without severely increasing the number of shader permutations.
But you should consider what you can avoid putting in arrays, while testing for yourself whether each change actually improves the speed.


[quote name='mhagain' timestamp='1313528104' post='4850041']
I'd counterargue that "it depends". ;)[/quote]
And that is true. I tested this on my main machine, which has entirely new hardware, but I only tested on that machine.
Every change should be benchmarked on your target platform, as results may vary.

Here are some other general performance tips:
http://lspiroengine.com/?p=96


L. Spiro
