Tesshu

Slow point light shadows using cubemaps.


Tesshu    713
I am having trouble getting my frame rate past 25 fps. If I remove the depth clear I get 60 fps. I am using an FBO with a 16-bit depth buffer, and the cubemap is 1024x1024 RGBA8. If I reduce this to 512x512 I get 50 fps, and 70 fps without the clears. The clear is just glClear with the depth bit; I have also tried clearing stencil along with it. This is on an 8500GT with vsync off.
u32	shadowMap = 0;

for( u32 light = 0; light < numLights; light += cm_MaxLights, ++shadowMap )
{
	// These cubemaps are already created.
	ShadowMap*   pMap = shadowMapLevel( shadowMap );

	// FBO is also already created.
	FrameBuffer* pFBO = acquireFrameBuffer( pMap->vSize );

	glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, pFBO->m_FrameBuffer );

	// Set projection
	glMatrixMode( GL_PROJECTION );
	glLoadMatrixf( mProj );

	glViewport( 0, 0, pMap->vSize.x, pMap->vSize.y );
	glColorMask( 1, 1, 1, 1 );
	glClear( GL_COLOR_BUFFER_BIT );

	for( u32 face = 0; face < 6; ++face )
	{
		glFramebufferTexture2DEXT( GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_CUBE_MAP_POSITIVE_X + face, pMap->cubeMap, 0 );

		switch( subLight )
		{
			// Each light uses one color channel (like Humus example)
			case 0:	glColorMask( 1, 0, 0, 0 );	break;
			case 1:	glColorMask( 0, 1, 0, 0 );	break;
			case 2:	glColorMask( 0, 0, 1, 0 );	break;
			case 3:	glColorMask( 0, 0, 0, 1 );	break;
		}

		glClear( GL_DEPTH_BUFFER_BIT );
		glMatrixMode( GL_MODELVIEW);
		glLoadMatrixf( MakeCubeMapTransform( pLight->getPos(), face ) );

		// Render affected geometry.
	}
}

glColorMask( 1, 1, 1, 1 );
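
The channel-per-light packing used in the loop above can be written down as a small helper. This is only a sketch: `ShadowSlot`, `slotForLight` and `kLightsPerMap` are hypothetical names, and the value 4 for `cm_MaxLights` is an assumption based on the four `glColorMask` cases.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed value of cm_MaxLights: one light per RGBA channel.
constexpr std::uint32_t kLightsPerMap = 4;

// Hypothetical description of where one light's shadow data lives.
struct ShadowSlot {
    std::uint32_t shadowMap;        // index of the cubemap shared by up to four lights
    std::uint32_t channel;          // 0 = R, 1 = G, 2 = B, 3 = A
    std::array<bool, 4> colorMask;  // the four arguments to pass to glColorMask
};

// Map a global light index to its cubemap and color channel.
inline ShadowSlot slotForLight(std::uint32_t light) {
    ShadowSlot slot{};  // value-initialised: mask starts as all false
    slot.shadowMap = light / kLightsPerMap;
    slot.channel = light % kLightsPerMap;
    slot.colorMask[slot.channel] = true;
    return slot;
}
```

With this layout, light 5 would land in cubemap 1 and write only to the green channel.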

_Lopez    142
Firstly, if you are using a cubemap frame buffer, shouldn't this line


glFramebufferTexture2DEXT( GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_CUBE_MAP_POSITIVE_X + face, pMap->cubeMap, 0 );




have the color attachment incremented for each face?

GL_COLOR_ATTACHMENT0_EXT+face

as each cubemap face should be attached to one of 6 render targets.



Tesshu    713
Quote:
Original post by _Lopez
Firstly, if you are using a cubemap frame buffer, shouldn't this line

*** Source Snippet Removed ***

have the color attachment incremented for each face?

GL_COLOR_ATTACHMENT0_EXT+face

as each cubemap face should be attached to one of 6 render targets.


Thanks for the reply. The code does work, just not as fast as I want. I attach each cubemap face to the FBO as I need it. Maybe you are suggesting that I attach all the faces to GL_COLOR_ATTACHMENT0_EXT through GL_COLOR_ATTACHMENT5_EXT when I create the cubemap and then switch draw targets using glDrawBuffer( GL_COLOR_ATTACHMENT0_EXT + face )? It's worth a try, thanks.

On a side note, I am thinking that maybe I could make this faster by just doing it less. I did a test and it seems that I can reuse the cubemaps and have a huge win. This is likely what people do.

_Lopez    142
I cache the 4 most visible lights with shadows, and use linear distance computed in the vertex shader. I use 6 x 1024x1024 16bit float textures for each cubemap face, and just bind the cubemap faces using:
glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT + face)
before rendering the occluder geometry.
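
For readers unfamiliar with the linear-distance approach, here is a CPU-side sketch of the arithmetic involved. It is not the shader from this post: the function names, the normalisation by light radius, and the bias parameter are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Distance from the light to a world-space point, scaled into [0, 1] by the
// light's radius. This is the value a shadow pass would write to the cubemap.
inline float linearShadowDepth(Vec3 lightPos, Vec3 worldPos, float lightRadius) {
    const float dx = worldPos.x - lightPos.x;
    const float dy = worldPos.y - lightPos.y;
    const float dz = worldPos.z - lightPos.z;
    const float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
    return std::min(dist / lightRadius, 1.0f);
}

// A fragment is shadowed when it lies farther from the light than the
// occluder distance stored in the cubemap, allowing for a small bias.
inline bool inShadow(float storedDepth, float fragmentDepth, float bias) {
    return fragmentDepth - bias > storedDepth;
}
```

Because the stored value is a plain distance rather than post-projection depth, the same comparison works for all six faces.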

I think you will get a bit of a speedup doing it this way. With 4 point lights, my frame rate was about 60-100 fps with forward rendering, but when I switched to deferred rendering I got a substantial speed boost (8600GT).

Also, I don't know why you need to transform the modelview matrix for each face.

Just set up a camera at the light position, and use 6 precomputed view and up vectors.
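
As a sketch of what such a precomputed table can look like, the block below lists the forward and up vectors conventionally used with OpenGL cube maps, indexed so that entry i matches GL_TEXTURE_CUBE_MAP_POSITIVE_X + i. The `CubeFaceBasis` type and `cubeFaceBases` function are hypothetical names; an engine with a different handedness or lookup convention may need a different table.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

// Hypothetical per-face camera basis: look direction and up vector.
struct CubeFaceBasis {
    Vec3 forward;
    Vec3 up;
};

// Entry i corresponds to GL_TEXTURE_CUBE_MAP_POSITIVE_X + i.
inline const std::array<CubeFaceBasis, 6>& cubeFaceBases() {
    static const std::array<CubeFaceBasis, 6> kBases = {{
        {{ 1.0f,  0.0f,  0.0f}, {0.0f, -1.0f,  0.0f}},  // +X
        {{-1.0f,  0.0f,  0.0f}, {0.0f, -1.0f,  0.0f}},  // -X
        {{ 0.0f,  1.0f,  0.0f}, {0.0f,  0.0f,  1.0f}},  // +Y
        {{ 0.0f, -1.0f,  0.0f}, {0.0f,  0.0f, -1.0f}},  // -Y
        {{ 0.0f,  0.0f,  1.0f}, {0.0f, -1.0f,  0.0f}},  // +Z
        {{ 0.0f,  0.0f, -1.0f}, {0.0f, -1.0f,  0.0f}},  // -Z
    }};
    return kBases;
}

inline float dot(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
```

The view for face i is then a look-at from the light position toward position + forward with the given up vector, combined with a 90 degree, aspect 1 projection.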

Tesshu    713
Quote:
Original post by _Lopez
I cache the 4 most visible lights with shadows, and use linear distance computed in the vertex shader. I use 6 x 1024x1024 16bit float textures for each cubemap face, and just bind the cubemap faces using:
glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT + face)
before rendering the occluder geometry.


Did you really mean 6 textures per face? If so, could you explain? Are you clearing the depth? Maybe the way you are switching targets using glDrawBuffer isn't causing a stall for glClear.

_Lopez    142
Yeah, sorry, I meant 6 per cubemap.

Here's how I create the cubemap:


// FBO
glGenFramebuffersEXT(1, &shadow_fbo);
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, shadow_fbo);

// Gen depth buffer
glGenRenderbuffersEXT(1, &shadow_DepthBuffer);
glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, shadow_DepthBuffer);
glRenderbufferStorageEXT(GL_RENDERBUFFER_EXT, GL_DEPTH_COMPONENT, Width, Height);
for(int j=0;j<6;++j){
	shadow_cubeFaces[j] = _textureman->CreateFloat(Width, Height, 1);
}

// Attach depth buffer
glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_DEPTH_ATTACHMENT_EXT, GL_RENDERBUFFER_EXT, shadow_DepthBuffer);
for(int j=0;j<6;++j){
	glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT+j, GL_TEXTURE_2D, shadow_cubeFaces[j], 0);
}

// Gen cubemap texture
glEnable(GL_TEXTURE_CUBE_MAP_ARB);

glGenTextures(1, &shadowCubeMap);
glBindTexture(GL_TEXTURE_CUBE_MAP_ARB, shadowCubeMap);
glTexParameteri(GL_TEXTURE_CUBE_MAP_ARB, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_CUBE_MAP_ARB, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_CUBE_MAP_ARB, GL_TEXTURE_WRAP_R, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_CUBE_MAP_ARB, GL_TEXTURE_MIN_FILTER, GL_LINEAR);

// Allocate storage for each of the six faces
for(int j=0;j<6;++j){
	glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X_ARB + j, 0, GL_RGBA16F_ARB, Width, Height, 0, GL_RGBA, GL_FLOAT, NULL);
}




And here's how I render the cubemap:


CVector3 view[6];
CVector3 up[6];

view[0] = CVector3(0,0,1);
view[1] = CVector3(0,0,-1);
view[2] = CVector3(0,-1,0);
view[3] = CVector3(0,1,0);
view[4] = CVector3(1,0,0);
view[5] = CVector3(-1,0,0);

up[0] = CVector3(0,1,0);
up[1] = CVector3(0,1,0);
up[2] = CVector3(1,0,0);
up[3] = CVector3(-1,0,0);
up[4] = CVector3(0,1,0);
up[5] = CVector3(0,1,0);


glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, shadow_fbo);

for(int i=0;i<6;++i){

	glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT+i, GL_TEXTURE_CUBE_MAP_POSITIVE_X_ARB+i, shadowCubeMap, 0);
	glViewport(0,0,Width, Height);

	// glDrawBuffer selects color buffers only; depth is written to the
	// depth renderbuffer attached to the FBO.
	glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT+i);

	glClearColor(0.0f,0.0f,0.0f,0.0f);
	glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT);
	glLoadIdentity();

	int numlights = _lightman->getNumOfVisibleLights();

	if(numlights>4)
		numlights=4;

	// One color channel per light
	CVector4 colormask[4];
	colormask[0] = CVector4(1,0,0,0);
	colormask[1] = CVector4(0,1,0,0);
	colormask[2] = CVector4(0,0,1,0);
	colormask[3] = CVector4(0,0,0,1);

	for(int n=0;n<numlights;++n){
		glColorMask(colormask[n].x, colormask[n].y, colormask[n].z, colormask[n].w);

		CLight* pLight = _lightman->getLight(n);
		CVector3 pos = pLight->getPosition();
		cam->PositionCamera( pos, pos+view[i], up[i] );
		cam->LookClip(1,clip_near,far_near);
		cam->UpdateFrustumFaster();
		cam->CalculateFrustumCorners( cam->Position(), clip_near, far_near );

		// Render geometry here.

		glColorMask(1, 1, 1, 1);
	}
}






Hope that helps.


Tesshu    713
Thanks. That's what I thought. I tried attaching all the textures to the FBO once and then using glDrawBuffer per face, and I got nothing. It's likely your card is just that much faster. Also, I don't think you need the glFramebufferTexture2DEXT call in your drawing loop, since glDrawBuffer is switching for you. Thanks anyway; I think the best thing is to update lights based on their priority and whether something has moved within their radius.
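
One way to sketch that update policy: mark a light's cubemap dirty when a moving object overlaps the light's radius, then re-render only a limited number of dirty lights per frame, most important first. Every name here (`ShadowLight`, `notifyMoved`, `pickLightsToUpdate`), the bounding-sphere test, and the per-frame budget are assumptions for illustration, not code from this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

struct ShadowLight {
    Vec3  pos;
    float radius;
    float priority;  // higher means more important (hypothetical metric)
    bool  dirty;     // true when the cached cubemap is out of date
};

// Flag every light whose radius overlaps a moved object's bounding sphere.
inline void notifyMoved(std::vector<ShadowLight>& lights, Vec3 center, float objRadius) {
    for (ShadowLight& l : lights) {
        const float dx = l.pos.x - center.x;
        const float dy = l.pos.y - center.y;
        const float dz = l.pos.z - center.z;
        const float reach = l.radius + objRadius;
        if (dx * dx + dy * dy + dz * dz <= reach * reach) {
            l.dirty = true;
        }
    }
}

// Return the indices of up to `budget` dirty lights, highest priority first,
// and clear their dirty flags. Lights left over stay dirty for a later frame.
inline std::vector<std::size_t> pickLightsToUpdate(std::vector<ShadowLight>& lights,
                                                   std::size_t budget) {
    std::vector<std::size_t> picked;
    for (std::size_t i = 0; i < lights.size(); ++i) {
        if (lights[i].dirty) picked.push_back(i);
    }
    std::sort(picked.begin(), picked.end(), [&lights](std::size_t a, std::size_t b) {
        return lights[a].priority > lights[b].priority;
    });
    if (picked.size() > budget) picked.resize(budget);
    for (std::size_t i : picked) lights[i].dirty = false;
    return picked;
}
```

Lights that are not picked keep using the cubemap rendered in an earlier frame, which is where the saving comes from.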

_Lopez    142
Just tried it without glFramebufferTexture2DEXT, and it doesn't work. I'm guessing it's telling GL that the current draw buffer belongs to that cubemap face?

How do you determine your light priorities?

With mine, I'm currently doing a cube frustum check against the camera, and I intend to render the 6 cube frustums after the depth-only pass with occlusion queries enabled, then do a qsort on the lights based on their pixel visibility. Any better ideas?
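
The sorting step could look roughly like the sketch below. It assumes the occlusion query results have already been read back into a per-light sample count; `LightVisibility` and `mostVisibleLights` are hypothetical names, and std::partial_sort stands in for the qsort mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct LightVisibility {
    int      lightId;
    unsigned visibleSamples;  // e.g. samples passed, as read back from an occlusion query
};

// Return the ids of the `maxLights` most visible lights, most visible first.
// Lights with no visible samples are dropped.
inline std::vector<int> mostVisibleLights(std::vector<LightVisibility> lights,
                                          std::size_t maxLights) {
    const std::size_t keep = std::min(maxLights, lights.size());
    std::partial_sort(lights.begin(),
                      lights.begin() + static_cast<std::ptrdiff_t>(keep),
                      lights.end(),
                      [](const LightVisibility& a, const LightVisibility& b) {
                          return a.visibleSamples > b.visibleSamples;
                      });
    std::vector<int> ids;
    for (std::size_t i = 0; i < keep; ++i) {
        if (lights[i].visibleSamples > 0) ids.push_back(lights[i].lightId);
    }
    return ids;
}
```

Only the first few entries need to be ordered, so a partial sort avoids sorting the whole light list.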

Tesshu    713
I use a portal system. When a light is moved or placed, it figures out which rooms it can affect using portal culling and the radius of the light. So now each room has a list of the lights that affect it. I then use portal culling to generate a linked list of reduced frustum/room pairs. Then for each room I gather all the lights that affect it and add them to my set of relevant lights. From there I use distance priorities. I also cull shadow casters using a frustum per face of the cubemap. If I end up needing a better set, I will likely add a check to test whether a light's frustum will intersect the view frustum using the portals.
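
A simple stand-in for that last check is to test the light's bounding sphere against the six planes of a (possibly portal-reduced) frustum. The sketch below assumes planes stored with inward-facing unit normals; `Plane` and `sphereTouchesFrustum` are hypothetical names, and the test is conservative, so it can report a hit for spheres that sit just outside a frustum corner.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

// Plane in the form n.p + d = 0, with unit normal n pointing into the frustum.
struct Plane { float nx, ny, nz, d; };

// Returns false only when the sphere is completely behind at least one plane.
inline bool sphereTouchesFrustum(const std::array<Plane, 6>& planes, Vec3 center, float radius) {
    for (const Plane& p : planes) {
        const float signedDist = p.nx * center.x + p.ny * center.y + p.nz * center.z + p.d;
        if (signedDist < -radius) {
            return false;  // entirely outside this plane
        }
    }
    return true;
}
```

Lights that fail the test can be skipped before any distance-based prioritisation.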
