theagentd

OpenGL Tile based deferred shading light list?



Hello.

 

I just finished an implementation of tile-based deferred shading using only OpenGL 3.2, but I ran into some problems along the way that I had to hack around.

 

First I generate the tile frustums and store them in a GL_RGBA32F 2D texture array with 6 layers. The culling is done in the geometry shader and the lighting in the fragment shader. The culling looks like this:

uniform vec3 lightWorldPositions[MAX_LIGHTS];
uniform float lightIntensities[MAX_LIGHTS];
uniform int numLights;

flat out float visible[MAX_LIGHTS];

...


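    // Test each light's bounding sphere against the six tile frustum planes.
    for(int i = 0; i < MAX_LIGHTS && i < numLights; i++){
        
        vec4 lightPos = vec4(lightWorldPositions[i], 1.0);
        float radius = lightIntensities[i]; // the intensity doubles as the light's radius
        
        bool v = true;
        for(int p = 0; p < 6; p++){
            // The sphere lies entirely behind this plane, so it cannot touch the tile.
            if(dot(planes[p], lightPos) < -radius){
                v = false;
                break;
            }
        }
        render = render || v;
        visible[i] = float(v);
    }
    
    // Emit nothing if no light touches this tile.
    if(!render){
        return;
    }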
    
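    // Corners of the tile in normalized device coordinates.
    vec2 pos0 = tileCoord * tileSize / screenSize * 2 - 1;
    vec2 pos1 = (tileCoord + 1) * tileSize / screenSize * 2 - 1;
    
    // Note: GLSL leaves all outputs undefined after EmitVertex(), so strictly
    // speaking visible[] should be rewritten before every vertex below.
    gl_Position = vec4(pos0, 0, 1);
    EmitVertex();
    
    gl_Position = vec4(pos0.x, pos1.y, 0, 1);
    EmitVertex();
    
    gl_Position = vec4(pos1.x, pos0.y, 0, 1);
    EmitVertex();
    
    gl_Position = vec4(pos1, 0, 1);
    EmitVertex();
    
    EndPrimitive();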
 

Here I encountered a number of problems. My first idea was to build a list of light IDs and output it as an int[], but the GLSL compiler rejected that with "lvalue in assignment too complex" (see http://www.gamedev.net/topic/518271-are-shaders-really-that-limited/). I ended up simply marking each light as visible or not visible, and since bool arrays aren't supported as shader outputs I had to make it a float array. Finally, if at least one light is visible in the tile (render = true), I output a quad for the tile. The fragment shader then simply loops through the lights and shades with all that are visible:

#pragma optionNV(unroll all)
	for(int i = 0; i < MAX_LIGHTS && i < numLights; i++){
		// Skip lights that the geometry shader culled for this tile.
		if(visible[i] == 0.0){
			continue;
		}
	
		int index = i;

		// Vector from the shaded point to the light, in eye space.
		vec3 dPos = lightEyePositions[index] - eyeSpace.xyz;
		
		vec3 L = normalize(dPos);
		float diffuse = max(dot(N, L), 0.0);
		
		if(diffuse > 0.0){
			
			vec3 H = normalize(L + V);
			
			float specular = pow(max(0.0, dot(N, H)), glossiness);
			
			//Fresnel and intensity
			specular = (specular + (1.0-specular)*pow(1.0-dot(V, H), 5.0)) * specularIntensity;
		
			float distSqrd = dot(dPos, dPos);
			float distance = sqrt(distSqrd);
			// Linear fade to zero at the light's radius, divided by the distance.
			float falloff = max(1.0 - distance / lightIntensities[index], 0.0) / distance;
			
			light += lightColors[index] * (diffuseColor + specular) * 
				(lightIntensities[index] * diffuse * falloff);
		}
	} 

Here I encountered a second problem. The visible[] array cannot hold more than 32 elements, so I'm limited to 32 lights per pass.
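One possible way around the float array, sketched here under several assumptions, is to pack the visibility flags into the bits of a flat uvec4, which GLSL 1.50 allows as a geometry shader output. The names lightPosRadius, tileFrustums, vTileCoord and visibleMask, the 128-light cap, and the texture layout (one texel per tile, one layer per plane) are hypothetical and not taken from the original code. The sketch is untested on the hardware discussed in this thread.

```glsl
#version 150

// Assumed cap: 4 words x 32 bits.
#define MAX_LIGHTS 128

layout(points) in;
layout(triangle_strip, max_vertices = 4) out;

// Hypothetical packing: xyz = world position, w = radius.
uniform vec4 lightPosRadius[MAX_LIGHTS];
uniform int numLights;

// Assumed layout: texel (tileX, tileY) of layer p holds plane p of that tile.
uniform sampler2DArray tileFrustums;
uniform vec2 tileSize;
uniform vec2 screenSize;

in vec2 vTileCoord[];        // hypothetical: tile index from the vertex shader

flat out uvec4 visibleMask;  // bit i set => light i touches this tile

bool sphereTouchesTile(vec4 planes[6], vec4 light) {
    for (int p = 0; p < 6; p++) {
        // Sphere entirely behind one plane: cannot touch the tile.
        if (dot(planes[p], vec4(light.xyz, 1.0)) < -light.w) {
            return false;
        }
    }
    return true;
}

void emitCorner(vec2 ndc, uvec4 mask) {
    // Outputs are undefined after EmitVertex(), so set them for every vertex.
    visibleMask = mask;
    gl_Position = vec4(ndc, 0.0, 1.0);
    EmitVertex();
}

void main() {
    vec2 tileCoord = vTileCoord[0];

    vec4 planes[6];
    for (int p = 0; p < 6; p++) {
        planes[p] = texelFetch(tileFrustums, ivec3(ivec2(tileCoord), p), 0);
    }

    // Build the mask one 32-bit word at a time.
    uvec4 mask = uvec4(0u);
    for (int w = 0; w < 4; w++) {
        uint bits = 0u;
        for (int b = 0; b < 32; b++) {
            int i = w * 32 + b;
            if (i >= numLights) {
                break;
            }
            if (sphereTouchesTile(planes, lightPosRadius[i])) {
                bits |= 1u << uint(b);
            }
        }
        mask[w] = bits;
    }

    // Emit nothing if no light touches this tile.
    if (!any(notEqual(mask, uvec4(0u)))) {
        return;
    }

    vec2 pos0 = tileCoord * tileSize / screenSize * 2.0 - 1.0;
    vec2 pos1 = (tileCoord + 1.0) * tileSize / screenSize * 2.0 - 1.0;

    emitCorner(pos0, mask);
    emitCorner(vec2(pos0.x, pos1.y), mask);
    emitCorner(vec2(pos1.x, pos0.y), mask);
    emitCorner(pos1, mask);
    EndPrimitive();
}
```

On the fragment side the matching input would be declared as flat in uvec4 visibleMask, and light i would be tested with (visibleMask[i >> 5] & (1u << uint(i & 31))) != 0u. This is still a visibility mask rather than a compact index list, but it takes four output components instead of one array element per light.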

 

Performance with 512 lights:

Tile based shading with 16x16 tiles: 85 FPS.
Tile based shading with 32x32 tiles: 92 FPS.
Traditional deferred shading with depth bounds: 143 FPS.
Traditional deferred shading with stencil marking: 134 FPS.

 

I'm betting that the problems I had to work around are slowing things down considerably. The two main issues seem to be that I can't build a proper light index list and that I can't render enough lights per pass.

 

 

TL;DR

Question: How do I build a light index list if stream processors do not support "dynamically addressed scattered writes"? Would switching to OpenCL even help if this is a hardware limitation?


On AMD use OpenCL; on NVIDIA use GLSL compute shaders. You'll probably see that FPS value go higher ;) If anything, you'll learn how to do compute.

There are a number of options there though. You can go fully deferred, or just do the culling in compute shaders and then do Forward+, etc.
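As a rough illustration of the compute shader route, the sketch below builds a per-tile light index list with one work group per tile. It needs OpenGL 4.3 (GLSL 4.30), which goes beyond the OpenGL 3.2 target of the original post. The buffer layouts, binding points, the names lightPosRadius, tilePlanes and tileLists, and the MAX_LIGHTS_PER_TILE cap are all assumptions made for the example. An atomic counter in shared memory provides the "append" that the geometry shader version could not express.

```glsl
#version 430

// Hypothetical limits; tune for the scene.
#define TILE_SIZE 16
#define MAX_LIGHTS_PER_TILE 256

layout(local_size_x = TILE_SIZE, local_size_y = TILE_SIZE) in;

// Assumed layout: xyz = light position, w = radius, same space as the planes.
layout(std430, binding = 0) readonly buffer Lights {
    vec4 lightPosRadius[];
};

// Assumed layout: six precomputed frustum planes per tile.
layout(std430, binding = 1) readonly buffer TilePlanes {
    vec4 tilePlanes[];
};

// Per tile: one count followed by MAX_LIGHTS_PER_TILE light indices.
layout(std430, binding = 2) writeonly buffer TileLightLists {
    uint tileLists[];
};

uniform uint numLights;

shared uint sharedCount;
shared uint sharedIndices[MAX_LIGHTS_PER_TILE];

void main() {
    uint tile = gl_WorkGroupID.y * gl_NumWorkGroups.x + gl_WorkGroupID.x;
    uint threadCount = gl_WorkGroupSize.x * gl_WorkGroupSize.y;

    if (gl_LocalInvocationIndex == 0u) {
        sharedCount = 0u;
    }
    memoryBarrierShared();
    barrier();

    // Each thread tests a strided subset of the lights against this tile.
    for (uint i = gl_LocalInvocationIndex; i < numLights; i += threadCount) {
        vec4 light = lightPosRadius[i];
        bool visible = true;
        for (int p = 0; p < 6; p++) {
            vec4 plane = tilePlanes[tile * 6u + uint(p)];
            if (dot(plane, vec4(light.xyz, 1.0)) < -light.w) {
                visible = false;
                break;
            }
        }
        if (visible) {
            // Reserve a slot; lights beyond the cap are dropped.
            uint slot = atomicAdd(sharedCount, 1u);
            if (slot < uint(MAX_LIGHTS_PER_TILE)) {
                sharedIndices[slot] = i;
            }
        }
    }
    memoryBarrierShared();
    barrier();

    // Copy the finished list from shared memory to the output buffer.
    uint count = min(sharedCount, uint(MAX_LIGHTS_PER_TILE));
    uint base = tile * uint(MAX_LIGHTS_PER_TILE + 1);
    if (gl_LocalInvocationIndex == 0u) {
        tileLists[base] = count;
    }
    for (uint i = gl_LocalInvocationIndex; i < count; i += threadCount) {
        tileLists[base + 1u + i] = sharedIndices[i];
    }
}
```

It would be dispatched with glDispatchCompute(tilesX, tilesY, 1), one work group per tile. A later shading pass, deferred or Forward+, can then read the count and indices for its tile. The order of indices within a list is not deterministic, which does not matter when the light contributions are simply summed.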
