Slow point light shadows using cubemaps.

Started by
7 comments, last by LoneDwarf 15 years, 8 months ago
I am having trouble getting my frame rate past 25fps. If I remove the depth clear I get 60fps. I am using an FBO with a 16-bit depth buffer and the cubemap is 1024x1024 RGBA8. If I reduce this to 512x512 I get 50fps, and 70fps without clears. The clear is just glClear with the depth bit; I have also tried clearing stencil along with it. I am using an 8500GT with vsync off.

u32	shadowMap = 0;

for( u32 light = 0; light < numLights; light += cm_MaxLights, ++shadowMap )
{
	// These cubemaps are already created.
	ShadowMap*   pMap = shadowMapLevel( shadowMap );

	// FBO is also already created.
	FrameBuffer* pFBO = acquireFrameBuffer( pMap->vSize );

	glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, pFBO->m_FrameBuffer );

	// Set projection
	glMatrixMode( GL_PROJECTION);
	glLoadMatrixf( mProj );

	glViewport( 0, 0, pMap->vSize.x, pMap->vSize.y );
	glColorMask( 1, 1, 1, 1 );
	glClear( GL_COLOR_BUFFER_BIT );

	for( u32 face = 0; face < 6; ++face )
	{
		glFramebufferTexture2DEXT( GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_CUBE_MAP_POSITIVE_X + face, pMap->cubeMap, 0 );

		switch( subLight )
		{
			// Each light uses one color channel (like Humus example)
			case 0:	glColorMask( 1, 0, 0, 0 );	break;
			case 1:	glColorMask( 0, 1, 0, 0 );	break;
			case 2:	glColorMask( 0, 0, 1, 0 );	break;
			case 3:	glColorMask( 0, 0, 0, 1 );	break;
		}

		glClear( GL_DEPTH_BUFFER_BIT );
		glMatrixMode( GL_MODELVIEW);
		glLoadMatrixf( MakeCubeMapTransform( pLight->getPos(), face ) );

		// Render affected geom.
	}
}

glColorMask( 1, 1, 1, 1 );
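
The loop above advances `light` by `cm_MaxLights` per cubemap and writes each light into its own color channel. A minimal sketch of that index arithmetic, assuming `cm_MaxLights` is 4 (one light per RGBA channel) and that `subLight` is the light's index within its group. `ChannelSlot`, `slotForLight`, and `kMaxLightsPerMap` are hypothetical names, not part of the posted code:

```cpp
#include <array>
#include <cstdint>

// Assumption: four lights share one RGBA8 cubemap, one per color channel.
constexpr std::uint32_t kMaxLightsPerMap = 4;

struct ChannelSlot
{
	std::uint32_t       shadowMap;  // which cubemap the light renders into
	std::uint32_t       subLight;   // channel index inside that cubemap (0..3)
	std::array<bool, 4> mask;       // values to pass to glColorMask( r, g, b, a )
};

// Map a flat light index to its cubemap and color channel.
inline ChannelSlot slotForLight( std::uint32_t light )
{
	ChannelSlot slot{};
	slot.shadowMap = light / kMaxLightsPerMap;
	slot.subLight  = light % kMaxLightsPerMap;
	slot.mask      = { false, false, false, false };
	slot.mask[slot.subLight] = true;
	return slot;
}
```

With this mapping, light 5 would land in cubemap 1, green channel, which corresponds to `case 1` in the switch above.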

Firstly, if you are using a cubemap frame buffer, shouldn't this line

glFramebufferTexture2DEXT( GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_CUBE_MAP_POSITIVE_X + face, pMap->cubeMap, 0 );


have the color attachment incremented for each face?

GL_COLOR_ATTACHMENT0_EXT+face

as each cubemap face should be attached to one of 6 render targets.



Quote:Original post by _Lopez
Firstly, if you are using a cubemap frame buffer, shouldn't this line

*** Source Snippet Removed ***

have the color attachment incremented for each face?

GL_COLOR_ATTACHMENT0_EXT+face

as each cubemap face should be attached to one of 6 render targets.


Thanks for the reply. The code does work, just not as fast as I want. I bind each cubemap face to the FBO as I need it. Maybe you are suggesting I bind all the faces to GL_COLOR_ATTACHMENT0_EXT through GL_COLOR_ATTACHMENT5_EXT when I create the cubemap and then switch draw targets using glDrawBuffer( GL_COLOR_ATTACHMENT0_EXT + face )? It's worth a try, thanks.

On a side note, I am thinking that maybe I could make this faster by just doing it less. I did a test and it seems that I can reuse the cubemaps and have a huge win. This is likely what people do.
I cache the 4 most visible lights with shadows, and use linear distance computed in the vertex shader. I use 6 x 1024x1024 16bit float textures for each cubemap face, and just bind the cubemap faces using:
glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT + face)
before rendering the occluder geometry.
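
Storing linear distance means each texel holds the light-to-occluder distance rather than a projected depth value. A rough CPU-side sketch of the comparison a shader would make, assuming distances are normalized by a light radius and a small bias is applied. `lightRadius`, `bias`, and both function names are illustrative assumptions, not taken from the post:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Distance from the light to a world-space point, scaled by the light radius
// (an assumed normalization so the value fits comfortably in a 16-bit float).
inline float linearShadowDistance( const Vec3& lightPos, const Vec3& worldPos, float lightRadius )
{
	const float dx = worldPos.x - lightPos.x;
	const float dy = worldPos.y - lightPos.y;
	const float dz = worldPos.z - lightPos.z;
	return std::sqrt( dx * dx + dy * dy + dz * dz ) / lightRadius;
}

// A receiver is shadowed when something nearer to the light was stored
// along the same direction in the cubemap.
inline bool inShadow( float storedDistance, float receiverDistance, float bias = 0.005f )
{
	return storedDistance + bias < receiverDistance;
}
```

Because the stored value is a plain distance, the lookup only needs the light-to-fragment vector and no per-face projection matrix.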

I think you will get a bit of a speedup doing it this way. With 4 point lights, my framerate was about 60-100fps for forward rendering, but when I switched to deferred rendering I got a substantial speed boost (8600GT).

Also, I don't know why you need to transform the modelview matrix for each face.

Just set up a camera at the light position, and use 6 precomputed view and up vectors.
Quote:Original post by _Lopez
I cache the 4 most visible lights with shadows, and use linear distance computed in the vertex shader. I use 6 x 1024x1024 16bit float textures for each cubemap face, and just bind the cubemap faces using:
glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT + face)
before rendering the occluder geometry.


Did you really mean 6 textures per face? If so, could you explain? Are you clearing the depth? Maybe the way you are switching targets using glDrawBuffer isn't causing a stall for glClear.
Yeah, sorry, I meant 6 per cubemap.

Here's how I create the cubemap...

// FBO
glGenFramebuffersEXT(1, &shadow_fbo);
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, shadow_fbo);

// Gen Depth Buffer
glGenRenderbuffersEXT(1, &shadow_DepthBuffer);
glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, shadow_DepthBuffer);
glRenderbufferStorageEXT(GL_RENDERBUFFER_EXT, GL_DEPTH_COMPONENT, Width, Height);

for(int j=0;j<6;++j){
	shadow_cubeFaces[j] = _textureman->CreateFloat(Width, Height, 1);
}

// Attach Depth Buffer
glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_DEPTH_ATTACHMENT_EXT, GL_RENDERBUFFER_EXT, shadow_DepthBuffer);

for(int j=0;j<6;++j){
	glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT+j, GL_TEXTURE_2D, shadow_cubeFaces[j], 0);
}

// Gen Cubemap Texture
glEnable(GL_TEXTURE_CUBE_MAP_ARB);
glGenTextures(1, &shadowCubeMap);
glBindTexture(GL_TEXTURE_CUBE_MAP_ARB, shadowCubeMap);
glTexParameteri(GL_TEXTURE_CUBE_MAP_ARB, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_CUBE_MAP_ARB, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_CUBE_MAP_ARB, GL_TEXTURE_WRAP_R, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_CUBE_MAP_ARB, GL_TEXTURE_MIN_FILTER, GL_LINEAR);

for(int j=0;j<6;++j){
	glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X_ARB + j, 0, GL_RGBA16F_ARB, Width, Height, 0, GL_RGBA, GL_FLOAT, NULL);
}


And here's how I render the cubemap...

CVector3	view[6];
CVector3	up[6];

view[0]	=	CVector3(0,0,1);
view[1]	=	CVector3(0,0,-1);
view[2]	=	CVector3(0,-1,0);
view[3]	=	CVector3(0,1,0);
view[4]	=	CVector3(1,0,0);
view[5]	=	CVector3(-1,0,0);

up[0]	=	CVector3(0,1,0);
up[1]	=	CVector3(0,1,0);
up[2]	=	CVector3(1,0,0);
up[3]	=	CVector3(-1,0,0);
up[4]	=	CVector3(0,1,0);
up[5]	=	CVector3(0,1,0);

glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, shadow_fbo);

for(int i=0;i<6;++i){
	glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT+i, GL_TEXTURE_CUBE_MAP_POSITIVE_X_ARB+i, shadowCubeMap, 0);
	glViewport(0,0,Width, Height);
	glDrawBuffer(GL_DEPTH_ATTACHMENT_EXT);
	glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT+i);

	glClearColor(0.0f,0.0f,0.0f,0.0f);
	glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT);
	glLoadIdentity();

	int numlights	=	_lightman->getNumOfVisibleLights();
	if(numlights>4)
		numlights=4;

	CVector4	colormask[4];
	colormask[0]	=	CVector4(1,0,0,0);
	colormask[1]	=	CVector4(0,1,0,0);
	colormask[2]	=	CVector4(0,0,1,0);
	colormask[3]	=	CVector4(0,0,0,1);

	for(int n=0;n<numlights;++n){
		glColorMask(colormask[n].x, colormask[n].y, colormask[n].z, colormask[n].w);

		CLight*		pLight	=	_lightman->getLight(n);
		CVector3	pos	=	pLight->getPosition();

		cam->PositionCamera( pos, pos+view[i], up[i] );
		cam->LookClip(1,clip_near,far_near);
		cam->UpdateFrustumFaster();
		cam->CalculateFrustumCorners( cam->Position(), clip_near, far_near );

		// RENDER GEOMETRY.......

		glColorMask(1, 1, 1, 1);
	}
}



Hope that helps.


Thanks. That's what I thought. I tried binding all the textures to the FBO once and then using glDrawBuffer per face, and I gained nothing. It's likely your card is just that much faster. Also, I don't think you need the glFramebufferTexture2DEXT in your drawing loop since glDrawBuffer is switching for you. Thanks anyway. I think the best thing is to update lights based on their priority and whether something has moved in their radius.
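
A minimal sketch of that update policy, assuming lights are already stored in priority order and that moving objects are approximated by bounding spheres. `ShadowLight`, `notifyObjectMoved`, `collectLightsToUpdate`, and the per-frame `budget` are hypothetical, not from any code posted in this thread:

```cpp
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

struct ShadowLight
{
	Vec3  pos;
	float radius;
	bool  dirty;   // true when the cubemap must be re-rendered
};

// Sphere-vs-sphere overlap between a moving object and the light's range.
inline bool sphereTouchesLight( const ShadowLight& light, const Vec3& center, float radius )
{
	const float dx = center.x - light.pos.x;
	const float dy = center.y - light.pos.y;
	const float dz = center.z - light.pos.z;
	const float reach = light.radius + radius;
	return dx * dx + dy * dy + dz * dz <= reach * reach;
}

// Call whenever an object moves: flags every light whose radius it touches.
inline void notifyObjectMoved( std::vector<ShadowLight>& lights, const Vec3& center, float radius )
{
	for( ShadowLight& light : lights )
		if( sphereTouchesLight( light, center, radius ) )
			light.dirty = true;
}

// Pick at most `budget` dirty lights (in stored priority order), clear
// their flags, and return their indices for re-rendering this frame.
inline std::vector<std::size_t> collectLightsToUpdate( std::vector<ShadowLight>& lights, std::size_t budget )
{
	std::vector<std::size_t> picked;
	for( std::size_t i = 0; i < lights.size() && picked.size() < budget; ++i )
	{
		if( lights[i].dirty )
		{
			lights[i].dirty = false;
			picked.push_back( i );
		}
	}
	return picked;
}
```

Lights that are skipped because the budget ran out stay dirty, so they are picked up on a later frame.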
Just tried it without glFramebufferTexture2DEXT, and it doesn't work. I'm guessing it's telling GL that the current draw buffer belongs to that cubemap face?

How do you determine your light priorities?

With mine, I'm currently doing a cube frustum check against the camera, and I intend to render the 6 cube frustums after the depth-only pass with occlusion queries enabled, then do a qsort on the lights based on their pixel visibility. Any better ideas?
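
The sorting step could look roughly like this, assuming the occlusion query results have already been read back into a per-light pixel count. It uses `std::partial_sort` rather than `qsort`, and `LightVisibility` and `pickMostVisibleLights` are hypothetical names:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct LightVisibility
{
	int           lightIndex;
	std::uint32_t visiblePixels;  // assumed to come from an occlusion query result
};

// Return the indices of up to maxLights lights, most visible first.
// Lights with no visible pixels are dropped entirely.
inline std::vector<int> pickMostVisibleLights( std::vector<LightVisibility> results, std::size_t maxLights )
{
	results.erase(
		std::remove_if( results.begin(), results.end(),
			[]( const LightVisibility& r ) { return r.visiblePixels == 0; } ),
		results.end() );

	const std::size_t count = std::min( maxLights, results.size() );
	std::partial_sort(
		results.begin(),
		results.begin() + static_cast<std::ptrdiff_t>( count ),
		results.end(),
		[]( const LightVisibility& a, const LightVisibility& b ) { return a.visiblePixels > b.visiblePixels; } );

	std::vector<int> picked;
	for( std::size_t i = 0; i < count; ++i )
		picked.push_back( results[i].lightIndex );
	return picked;
}
```

Only the top few entries need to be ordered, which is why a partial sort is enough here.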
I use a portal system. When a light is moved or placed, it figures out which rooms it can affect using portal culling and the radius of the light, so each room has a list of the lights that affect it. I then use portal culling to generate a linked list of reduced frustum/room pairs. Then for each room I gather all the lights that affect it and add them to my set of relevant lights. From there I use distance priorities. I also cull shadow casters using a frustum per face of the cubemap. If I end up needing a better set, I will likely add a check to test whether a light's frustum intersects the view frustum using the portals.
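
One way the per-face caster culling could be sketched, assuming casters are bounding spheres and each face frustum is a 90 degree pyramid with its apex at the light. Near and far planes are ignored, so the test is conservative; `sphereTouchesCubeFace` and the face order +X, -X, +Y, -Y, +Z, -Z are assumptions, not the poster's code:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Conservative sphere-vs-cube-face frustum test. The four side planes of a
// face pass through the light and lie at 45 degrees to the face's major axis.
inline bool sphereTouchesCubeFace( const Vec3& lightPos, const Vec3& center, float radius, int face )
{
	const float d[3] = { center.x - lightPos.x, center.y - lightPos.y, center.z - lightPos.z };

	const int   axis  = face / 2;                          // 0 = X, 1 = Y, 2 = Z
	const float sign  = ( face % 2 == 0 ) ? 1.0f : -1.0f;  // positive or negative face
	const float major = sign * d[axis];

	// Signed plane distance is ( major -/+ d[i] ) / sqrt(2); the sphere is
	// fully outside a plane when that distance is below -radius.
	const float limit = -radius * std::sqrt( 2.0f );

	for( int i = 0; i < 3; ++i )
	{
		if( i == axis )
			continue;
		if( major - d[i] < limit || major + d[i] < limit )
			return false;
	}
	return true;
}
```

A caster near a cube edge passes the test for both adjacent faces, so it is rendered into each of them.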
