gearifysoftware

Simple point light fragment shader very slow


I have a deferred rendering system, and when testing at full 1920x1080 I am noticing that one particular fragment shader causes a significant slowdown (they all seem oddly slow, but this one is the most significant). My shader is based on the OGL-Dev tutorials for deferred rendering.

The purpose of this shader is to calculate the light contributed within a light's area of effect. There are 3 large point lights in my scene that cover everything in view, so essentially this shader executes 3 times for every pixel on the screen (3 x 1920 x 1080, roughly 6.2 million fragment invocations per frame).

I measured the difference in frame time between running this shader and replacing its main function with a trivial "FragColor = vec4(1,0,0,1); // red". The total difference in time (for my engine to render an entire frame) is 4-5 milliseconds. If I'm shooting for 60 fps, that's already 25+% of my full 16.7 ms frame budget, which seems kind of crazy (I removed shadow calculation and blur before taking these measurements).
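For scale, here is a rough back-of-the-envelope count of the shaded fragments, assuming (as described above) that all three light volumes cover the full screen:

```python
# Rough fragment-count estimate for 3 full-screen point lights at 1080p.
width, height, lights = 1920, 1080, 3

fragments_per_frame = width * height * lights
print(fragments_per_frame)  # 6220800 shader invocations per frame

# Each invocation samples several full-screen G-buffer textures,
# so memory bandwidth, not ALU work, is often the dominant cost.
frame_budget_ms = 1000.0 / 60.0  # ~16.7 ms at 60 fps
measured_ms = 4.5                # midpoint of the reported 4-5 ms
print(round(100 * measured_ms / frame_budget_ms))  # 27 (% of frame budget)
```

This is only an upper-bound sketch; overdraw culling, early-Z, and tile-based tricks can all reduce the real invocation count.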

Here is my shader:

#version 420

layout (location = 0) out vec4 FragColor;

struct BaseLight
{
    vec3  Color;
    float AmbientIntensity;
    float DiffuseIntensity;
};


struct Attenuation
{
    float Constant;
    float Linear;
    float Exp;
};

struct PointLight
{
    BaseLight Base;
    vec3 Position;
    Attenuation Atten;
};

uniform sampler2D gPositionMap;
uniform sampler2D gColorMap;
uniform sampler2D gNormalMap;
uniform sampler2D gShadowMap;
uniform PointLight gPointLight; 
uniform vec3 gEyeWorldPos;
uniform float gMatSpecularIntensity;
uniform float gSpecularPower;
uniform int gLightType;
uniform vec2 gScreenSize;


uniform mat4 gWVP;
uniform mat4 gVP;
uniform mat4 gView;


vec4 CalcLightInternal(BaseLight Light,
                       vec3 LightDirection,
                       vec3 WorldPos,
                       vec3 Normal,
                       float shadowFactor)
{
    vec4 AmbientColor   = vec4(Light.Color, 1.0f) * Light.AmbientIntensity;
    float DiffuseFactor = dot(Normal, -LightDirection);

    vec4 DiffuseColor  = vec4(0, 0, 0, 0);
    vec4 SpecularColor = vec4(0, 0, 0, 0);

    if (DiffuseFactor > 0) {
        DiffuseColor = vec4(Light.Color, 1.0f) * DiffuseFactor;

        vec3 VertexToEye  = normalize(gEyeWorldPos - WorldPos);
        vec3 LightReflect = normalize(reflect(LightDirection, Normal));
        float SpecularFactor = dot(VertexToEye, LightReflect);
        SpecularFactor = pow(SpecularFactor, gSpecularPower);
        if (SpecularFactor > 0) {
            SpecularColor = vec4(Light.Color, 1.0f) * gMatSpecularIntensity * SpecularFactor;
        }
    }

    return AmbientColor + shadowFactor * (DiffuseColor + SpecularColor);
}

vec4 CalcPointLight(vec3 WorldPos, vec3 Normal)
{
 
    vec3 LightDirection = WorldPos - gPointLight.Position;
    float Distance = length(LightDirection);
    LightDirection = normalize(LightDirection);

    vec4 Color = CalcLightInternal(gPointLight.Base, LightDirection, WorldPos, Normal,1.0f);

    float Attenuation = gPointLight.Atten.Constant +
                        gPointLight.Atten.Linear * Distance +
                        gPointLight.Atten.Exp * Distance * Distance;

    // Clamp upward: with min(1.0, ...) the division below could only
    // brighten the result, so the light would never fall off with distance.
    Attenuation = max(1.0, Attenuation);

    return Color / Attenuation;
}

vec2 CalcTexCoord()
{
    return gl_FragCoord.xy / gScreenSize;
}

void main()
{
    vec2 TexCoord = CalcTexCoord();
    vec3 WorldPos = texture(gPositionMap, TexCoord).xyz;
    vec3 Color    = texture(gColorMap, TexCoord).xyz;
    vec3 Normal   = normalize(texture(gNormalMap, TexCoord).xyz);

    FragColor = CalcPointLight(WorldPos, Normal);
}

I am on a fairly fast HP machine with plenty of memory and an NVIDIA Quadro K1100M graphics card. Also, I already checked VSync to make sure it is not forcing my render times to be multiples of 16 ms.

This shader does not contain an inordinate amount of looping or branching, and it executes about 3 times per pixel. Should it really be adding 4-5 milliseconds to my render times?

Any ideas for what could be causing this would be much appreciated.

 


Laptops generally have an integrated Intel GPU alongside that NVIDIA "M" GPU. Could your game be running on the wrong graphics adapter?


Aside from what Hodgman said, your branches are doing more harm than good:

if( DiffuseFactor > 0 )
{
   if( SpecularFactor > 0 )
   {
   }
}

Just do DiffuseFactor = max( 0, DiffuseFactor ); and same for the SpecularFactor.
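Applied to the shader above, the branchless version of the lighting core might look something like this (a sketch, not tested against the original engine; it assumes gSpecularPower > 0, since pow(0.0, 0.0) is undefined in GLSL):

```glsl
// Branchless variant: clamp with max() instead of branching.
float DiffuseFactor = max(dot(Normal, -LightDirection), 0.0);
vec4 DiffuseColor   = vec4(Light.Color, 1.0) * DiffuseFactor;

vec3 VertexToEye   = normalize(gEyeWorldPos - WorldPos);
vec3 LightReflect  = normalize(reflect(LightDirection, Normal));

// Clamp before pow() so a negative base is never raised to a power.
float SpecularFactor = max(dot(VertexToEye, LightReflect), 0.0);
SpecularFactor       = pow(SpecularFactor, gSpecularPower);
vec4 SpecularColor   = vec4(Light.Color, 1.0) * gMatSpecularIntensity * SpecularFactor;
```

Note this computes the specular term even when the diffuse term is zero, which the original branch skipped; on a GPU that usually costs less than the divergent branch it replaces.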

 

Also note those 4 ms do not necessarily scale linearly with the number of objects. The number of covered pixels matters a lot, and early-Z testing can also amortize much of the cost.


I tried replacing the conditionals with the "max()" function. Marginal improvement, maybe 1 ms less frame time. That's something, but I don't think it's the full story.

I double checked, but no, no Intel graphics hardware on this machine. 

I did notice this  https://www.techpowerup.com/gpudb/2430/quadro-k1100m which states:

"We recommend the NVIDIA Quadro K1100M for gaming with highest details at resolutions up to, and including, 1024x768."

I am wondering if this card just can't handle this resolution... it still seems unacceptably slow.


Thanks for the analysis. That's a relief that I can (at least partially) blame my hardware!

I guess this calls into question whether or not I want to optimize for this machine or aim for a stronger graphics card. I am shooting for a game that doesn't have to be on a souped up gaming rig, but also doesn't need to run on a dinosaur... something mid range is what I'd like.

I could potentially switch over to my other laptop, which has this card: https://www.techpowerup.com/gpudb/1490/geforce-gt-525m

Could I expect much more out of this one? I see its production status is "End of Life".

Maybe it's just time for a hardware upgrade...


You can also just aim for 30 Hz at 720p or similarly low resolutions on those older cards, and leave the 60 Hz 1080p goal to newer hardware.
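For a sense of the savings, the pixel-count ratio between the two targets works out to:

```python
# Fragments shaded per full-screen light pass at each resolution.
p1080 = 1920 * 1080
p720  = 1280 * 720
print(p1080 / p720)  # 2.25: 720p shades 2.25x fewer fragments per pass
```

That factor applies to every full-screen pass, so a bandwidth-bound deferred lighting stage should get cheaper by roughly the same ratio.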

I like that! I get way better performance at 720p, and I think 30 Hz should be very achievable.

I consider my issue solved. Very much appreciate the help!

