Tile based deferred shading light list?


1 reply to this topic

#1 theagentd   Members   -  Reputation: 602


Posted 24 January 2014 - 08:59 AM

Hello.

 

I just finished an implementation of tile-based deferred shading using only OpenGL 3.2, but I ran into some problems along the way that I had to hack around.

 

First I generate the tile frustums and store them in a GL_RGBA32F 2D texture array with 6 layers. Culling is then done in the geometry shader and lighting in the fragment shader. The culling looks like this:

uniform vec3 lightWorldPositions[MAX_LIGHTS];
uniform float lightIntensities[MAX_LIGHTS];
uniform int numLights;

flat out float visible[MAX_LIGHTS];

...


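    // Test each light's bounding sphere against the six tile frustum planes.
    for(int i = 0; i < MAX_LIGHTS && i < numLights; i++){
        
        vec4 lightPos = vec4(lightWorldPositions[i], 1.0);
        float radius = lightIntensities[i]; // the intensity doubles as the light's radius
        
        bool v = true;
        for(int p = 0; p < 6; p++){
            if(dot(planes[p], lightPos) < -radius){
                v = false; // sphere lies entirely behind this plane
                break;     // no need to test the remaining planes
            }
        }
        render = render || v;
        visible[i] = float(v);
    }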
    
    if(!render){
        return;
    }
    
    vec2 pos0 = tileCoord * tileSize / screenSize * 2 - 1;
    vec2 pos1 = (tileCoord + 1) * tileSize / screenSize * 2 - 1;
    
    gl_Position = vec4(pos0, 0, 1);
    EmitVertex();
    
    gl_Position = vec4(pos0.x, pos1.y, 0, 1);
    EmitVertex();
    
    gl_Position = vec4(pos1.x, pos0.y, 0, 1);
    EmitVertex();
    
    gl_Position = vec4(pos1, 0, 1);
    EmitVertex();
    
    EndPrimitive();
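
For reference, the plane test in the culling loop above can be mirrored on the CPU. This is only a minimal sketch in C, not the poster's code: the `Plane` and `Vec3` types and both function names are made up for illustration, and it assumes each plane is stored as (a, b, c, d) with a unit-length normal pointing into the frustum, which is what `dot(planes[p], lightPos) < -radius` implies.

```c
#include <stdbool.h>

/* Hypothetical types: a plane (a, b, c, d) with a unit normal pointing inward. */
typedef struct { float a, b, c, d; } Plane;
typedef struct { float x, y, z; } Vec3;

/* Signed distance from p to the plane; equals dot(plane, vec4(p, 1.0)) in GLSL. */
static float plane_distance(Plane pl, Vec3 p) {
    return pl.a * p.x + pl.b * p.y + pl.c * p.z + pl.d;
}

/* Conservative test: returns false only when the sphere is entirely
   behind at least one of the six planes. */
static bool sphere_in_frustum(const Plane planes[6], Vec3 center, float radius) {
    for (int i = 0; i < 6; i++) {
        if (plane_distance(planes[i], center) < -radius) {
            return false; /* fully outside this plane, so the light is culled */
        }
    }
    return true;
}
```

The test is conservative: a sphere near a frustum corner can pass all six planes without actually touching the tile, so some lights are shaded in tiles they do not reach.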
 

Here I encountered a number of problems. My first idea was to generate a list of light IDs and output it as an int[], but that turned out to be impossible in GLSL: writing to a dynamically computed array index fails with "lvalue in assignment too complex" (see http://www.gamedev.net/topic/518271-are-shaders-really-that-limited/). I ended up simply marking each light as visible or not visible, but since bool arrays aren't supported as shader outputs I had to make it a float array. Finally, if at least one light is visible in the tile (render = true), I output a quad for the tile. The fragment shader then simply loops through the lights and shades with all that are visible:

#pragma optionNV(unroll all)
	for(int i = 0; i < MAX_LIGHTS && i < numLights; i++){
		if(visible[i] == 0.0){
			continue;
		}
	
		int index = i;

		vec3 dPos = lightEyePositions[index] - eyeSpace.xyz;
		
		vec3 L = normalize(dPos);
		float diffuse = max(dot(N, L), 0.0);
		
		if(diffuse > 0.0){
			
			vec3 H = normalize(L + V);
			
			float specular = pow(max(0.0, dot(N, H)), glossiness);
			
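			//Schlick-style Fresnel blend, scaled by the specular intensity
			specular = (specular + (1.0-specular)*pow(1.0-dot(V, H), 5.0)) * specularIntensity;
		
			//Linear falloff that reaches zero at the light's radius, divided by the distance
			float distSqrd = dot(dPos, dPos);
			float dist = sqrt(distSqrd); //named dist so it doesn't shadow the built-in distance()
			float falloff = max(1.0 - dist / lightIntensities[index], 0.0) / dist;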
			
			light += lightColors[index] * (diffuseColor + specular) * 
				(lightIntensities[index] * diffuse * falloff);
		}
	} 
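
Written out, the per-light term accumulated by the loop above looks as follows. The symbols are shorthand introduced here, not names from the post: $d$ is the length of `dPos`, $r$ is `lightIntensities[index]`, $c$ is `lightColors[index]`, $k_d$ is `diffuseColor`, $g$ is `glossiness` and $k_s$ is `specularIntensity`. The last step assumes $r > 0$.

```latex
\begin{aligned}
p &= \max(0,\ N \cdot H)^{g} \\
s &= \bigl(p + (1 - p)\,(1 - V \cdot H)^{5}\bigr)\, k_s \\
f(d) &= \frac{\max(1 - d/r,\ 0)}{d} \\
\Delta\,\mathrm{light} &= c\,(k_d + s)\; r\,\max(N \cdot L,\ 0)\, f(d)
  = c\,(k_d + s)\,\max(N \cdot L,\ 0)\,\frac{\max(r - d,\ 0)}{d}
\end{aligned}
```

The contribution is exactly zero for $d \ge r$, which is why the same value can serve as the culling radius in the geometry shader, and it grows without bound as $d \to 0$.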

Here I encountered a second problem. The visible[] array cannot hold more than 32 elements, so I'm limited to 32 lights per pass.
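
One possible way to shrink the per-tile visibility data is to pack the flags into the bits of a single unsigned integer instead of spreading them over 32 floats. GLSL has had unsigned integers and bitwise operators since version 1.30, so the same operations are available in an OpenGL 3.2 shader, but whether this lifts the limit described above has not been tested here. The sketch below shows the packing logic in C; `pack_visibility` and `light_visible` are hypothetical names, not part of the poster's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_LIGHTS 32 /* one bit per light in a 32-bit mask */

/* Pack up to MAX_LIGHTS visibility flags into one 32-bit mask (bit i = light i). */
static uint32_t pack_visibility(const bool visible[], int numLights) {
    uint32_t mask = 0u;
    for (int i = 0; i < numLights && i < MAX_LIGHTS; i++) {
        if (visible[i]) {
            mask |= (uint32_t)1 << i;
        }
    }
    return mask;
}

/* Read back the flag for light i; i must be in [0, MAX_LIGHTS). */
static bool light_visible(uint32_t mask, int i) {
    return ((mask >> i) & 1u) != 0u;
}
```

In a shader the mask would be passed as a `flat` integer output, and the lighting loop would test `(mask >> i) & 1u` in place of `visible[i] == 0.0`.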

 

Performance with 512 lights:

Tile based shading with 16x16 tiles: 85 FPS.
Tile based shading with 32x32 tiles: 92 FPS.

Traditional deferred shading with depth bounds: 143 FPS.

Traditional deferred shading with stencil marking: 134 FPS.
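
Frame rates are easier to compare as frame times, since differences in milliseconds add up linearly while differences in FPS do not. Converting the four figures above (rounded to two decimals):

```latex
t = \frac{1000\ \text{ms}}{\text{FPS}}: \qquad
\frac{1000}{85} \approx 11.76\ \text{ms}, \quad
\frac{1000}{92} \approx 10.87\ \text{ms}, \quad
\frac{1000}{143} \approx 6.99\ \text{ms}, \quad
\frac{1000}{134} \approx 7.46\ \text{ms}
```

So the 16x16 tiled path costs roughly 4.8 ms more per frame than traditional deferred shading with depth bounds.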

 

I'm betting that the problems I had to work around are slowing things down considerably. The problem seems to be that I can't build a proper light index list and that I can't render enough lights per pass.

 

 

TL;DR

Question: How do I build a light index list if stream processors do not support "dynamically addressed scattered writes"? Would switching to OpenCL even help if this is a hardware limitation?
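
One way to sidestep scattered writes altogether is to give every tile a contiguous range in a single index array: first count the visible lights per tile, turn the counts into start offsets with a running sum, then fill each range sequentially. The sketch below does this on the CPU in C as an illustration of the layout only; it is not the poster's method, and the function name and the `visible[tile * numLights + light]` input layout are assumptions made for this example.

```c
#include <stdbool.h>

/* Build compact per-tile light index lists.
   visible : numTiles * numLights flags, visible[t * numLights + l]
   offsets : numTiles + 1 entries; tile t owns indices[offsets[t] .. offsets[t+1])
   indices : room for numTiles * numLights entries in the worst case
   Returns the total number of indices written. */
static int build_light_lists(const bool *visible, int numTiles, int numLights,
                             int *offsets, int *indices) {
    /* Pass 1: running sum of per-tile counts gives each tile's start offset. */
    int total = 0;
    for (int t = 0; t < numTiles; t++) {
        offsets[t] = total;
        for (int l = 0; l < numLights; l++) {
            if (visible[t * numLights + l]) {
                total++;
            }
        }
    }
    offsets[numTiles] = total;

    /* Pass 2: each tile writes its light IDs into its own contiguous range. */
    for (int t = 0; t < numTiles; t++) {
        int write = offsets[t];
        for (int l = 0; l < numLights; l++) {
            if (visible[t * numLights + l]) {
                indices[write++] = l;
            }
        }
    }
    return total;
}
```

A fragment in tile t would then loop from `offsets[t]` to `offsets[t + 1]` and look up each light through `indices`. The two arrays could be uploaded in a form the shader can index freely, for example a buffer texture, which OpenGL 3.1 and later provide. A compute-based version would follow the same count, offset, fill pattern.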


Edited by theagentd, 24 January 2014 - 09:08 AM.



#2 Yours3!f   Members   -  Reputation: 1385


Posted 24 January 2014 - 06:33 PM

On AMD use OpenCL, on NVIDIA use GLSL compute shaders. You'll probably see this FPS value go higher ;) If anything, you'll learn how to do compute.

There are a number of options there, though. You can go fully deferred, or just do the culling in the compute shaders and do Forward+, etc.





