
Slow deferred lighting volumes



Hey all, I'm working on deferred rendering. I'm trying to constrain the area over which my lighting shaders need to run using stenciled lighting volumes. Nothing unusual. Basically, it looks like my program runs faster WITHOUT the stenciling: 65fps vs 55fps. This doesn't make much sense.

Here's how I add a stencil buffer to the FBO:

glGenTextures(1, &STENCIL_ID);
glBindTexture(GL_TEXTURE_2D, STENCIL_ID);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, FBO_ID);
// packed 24-bit depth + 8-bit stencil texture
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH24_STENCIL8_EXT, width, height, 0, GL_DEPTH_STENCIL_EXT, GL_UNSIGNED_INT_24_8_EXT, NULL);
// attach the same texture as both depth and stencil
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_DEPTH_ATTACHMENT_EXT, GL_TEXTURE_2D, STENCIL_ID, 0);
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_STENCIL_ATTACHMENT_EXT, GL_TEXTURE_2D, STENCIL_ID, 0);
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);
glBindTexture(GL_TEXTURE_2D, 0);

Here's how I stencil:

frameBuffer->Bind();
glClear(GL_STENCIL_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glEnable(GL_STENCIL_TEST);

// pass 1: write 1 into the stencil buffer wherever the volume is drawn
glStencilFunc(GL_ALWAYS, 1, 1);
glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glDepthMask(0);

glPushMatrix(); //draw volume
glTranslatef(position[0], position[1], position[2]);
glScalef(range, range, range);
glCallList(sphereDL);
glPopMatrix();

// pass 2: only shade pixels where stencil == 1
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthMask(1);
glStencilFunc(GL_EQUAL, 1, 1);
glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);

--RENDER

Any ideas?

Bump! Surely somebody has used stencil buffers on an FBO before!

I've tried using renderbuffers instead. No luck - still slower:

GLuint packed_rb;
glGenRenderbuffersEXT(1, &packed_rb);

glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, FBO_ID);
glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, packed_rb);
glRenderbufferStorageEXT(GL_RENDERBUFFER_EXT, GL_DEPTH24_STENCIL8_EXT, width, height);

glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, 0x821A, GL_RENDERBUFFER_EXT, packed_rb); // 0x821A is GL_DEPTH_STENCIL_ATTACHMENT; had to use the hex value, no entry in glew
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);

I have an 8800 GTS 640MB if it matters.

Maybe stencil buffers on FBOs are just intrinsically slow???

Ok I figured it out! I should let all you people from the future know the answer.

Writing to the FBO's stencil buffer is indeed slow. I found out that it helps if I do:

glStencilFunc(GL_ALWAYS, 1, 1);
glStencilMask(1);

-- RENDER LIGHT VOLUME

glStencilMask(0);

This prevents OpenGL from changing stencil buffer values via the StencilOp on every draw; stencil writes only happen while the mask is 1.

After the stencil comparison is done I do:

glStencilFunc(GL_ALWAYS, 0, 0);
glStencilMask(1);

-- RENDER LIGHT VOLUME

glStencilMask(0);

and it effectively erases everything the last light volume wrote.

I gained 5 fps! :)
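
For anyone who wants to check the mark-then-erase idea without a GL context, here is a minimal sketch in plain C. It only models what a `GL_REPLACE` stencil write does under a write mask, assuming the stencil function is `GL_ALWAYS` and the op is still `GL_REPLACE` as in the first post. The names `stencil_replace`, `draw_volume`, `row_is_clear` and the 8-pixel row are hypothetical, made up for illustration; none of them are OpenGL calls.

```c
#include <stdint.h>
#include <stddef.h>

#define ROW_PIXELS 8

/* Model of one GL_REPLACE stencil write: only the bits set in write_mask
   are taken from ref; all other bits keep their old value. */
static uint8_t stencil_replace(uint8_t old_value, uint8_t ref, uint8_t write_mask)
{
    return (uint8_t)((old_value & (uint8_t)~write_mask) | (ref & write_mask));
}

/* Hypothetical stand-in for "RENDER LIGHT VOLUME": touches pixels
   [first, last] of a one-dimensional stencil row. */
static void draw_volume(uint8_t *row, size_t first, size_t last,
                        uint8_t ref, uint8_t write_mask)
{
    for (size_t i = first; i <= last && i < ROW_PIXELS; ++i)
        row[i] = stencil_replace(row[i], ref, write_mask);
}

/* Returns 1 if every pixel in the row is zero again, 0 otherwise. */
static int row_is_clear(const uint8_t *row)
{
    for (size_t i = 0; i < ROW_PIXELS; ++i)
        if (row[i] != 0)
            return 0;
    return 1;
}
```

Calling `draw_volume(row, 2, 5, 1, 1)` marks pixels 2 to 5, a draw with write mask 0 leaves the row untouched, and `draw_volume(row, 2, 5, 0, 1)` over the same pixels puts the row back to all zeros, which is why no `glClear` is needed between lights in this scheme.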

A better solution is to not use the stencil buffer at all. You can use a little trick with light volumes. Usually you render a full screen quad masked by the stencil buffer, so the shader is invoked for each touched pixel. Now think about this: why use a full screen quad with a stencil buffer if you only want to render "visible" pixels once? You don't have to.

Render the light volume as a 3D volume, as you would render an object. The depth buffer takes care of not running the shader on any hidden pixel (early z rejection), and back face culling (since light volumes are usually convex) ensures each pixel is touched at most once. Just take care when the camera is inside the light volume. In this case render the volume with inverted culling (cull front faces, draw back faces).
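
The inside/outside decision for a spherical light volume can be sketched in plain C as below. This is only an illustration under assumptions: `Vec3`, `CullChoice`, `camera_inside_light`, and `choose_cull` are hypothetical names, and padding the radius by the near-plane distance is a choice made for this sketch, not something stated in the post.

```c
#include <stdbool.h>

typedef struct { float x, y, z; } Vec3;

typedef enum {
    CULL_BACK_FACES,   /* camera outside: draw the volume's front faces */
    CULL_FRONT_FACES   /* camera inside: draw the volume's back faces   */
} CullChoice;

/* True when the camera lies inside the light sphere. The radius is padded
   by near_pad (assumed here to be the near-plane distance) so the near
   plane cannot cut through the front faces. */
static bool camera_inside_light(Vec3 camera, Vec3 light, float range, float near_pad)
{
    float dx = camera.x - light.x;
    float dy = camera.y - light.y;
    float dz = camera.z - light.z;
    float r = range + near_pad;
    /* compare squared distances to avoid a square root */
    return dx * dx + dy * dy + dz * dz <= r * r;
}

/* Picks which faces to cull for this light, following the rule above. */
static CullChoice choose_cull(Vec3 camera, Vec3 light, float range, float near_pad)
{
    return camera_inside_light(camera, light, range, near_pad)
               ? CULL_FRONT_FACES
               : CULL_BACK_FACES;
}
```

In a renderer the result would map to `glCullFace(GL_FRONT)` or `glCullFace(GL_BACK)` before drawing the sphere. Depending on the depth test in use, the back-face pass may also need a different depth function; the post does not specify this.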

The stencil technique can help to remove pixels from rendering more aggressively, but it is slower due to the added overdraw (rendering a light volume and a full screen quad instead of just a light volume). Hope this gives you some food for thought.

EDIT: Your technique has a fault. You need to clear the stencil buffer, otherwise it won't work the way you expect. Hence your last variation fails to work. If you render first with stencil=0, then stencil=1 and then again stencil=0, you will get wrong stencil results, although your shader will fix it. Imagine you render a cube in the middle of the screen. Those pixels are stencil=0 now. Now you render another light volume next to it. Those pixels become stencil=1 while the old ones stay stencil=0. Now you render with stencil=0 again, with the volume at a different location. Now the pixels hit by the first and third volumes are both stencil=0, and all of these pixels are run through the shader, losing speed and removing the desired benefit. This is the main reason why I abandoned stencil buffering and now use the light volume trick I outlined.
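
One way to read the concern about stale stencil values is the small plain-C model below. It is a sketch of one interpretation of the scenario, not a reproduction of either poster's code: `mark`, `shaded_pixels`, `shaded_for_second_light`, and the 10-pixel row are hypothetical, and the pixel ranges for lights A and B are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define ROW_PIXELS 10

/* Hypothetical stand-in for drawing a light volume with GL_REPLACE:
   sets pixels [first, last] of a one-dimensional stencil row to value. */
static void mark(unsigned char *row, size_t first, size_t last, unsigned char value)
{
    for (size_t i = first; i <= last && i < ROW_PIXELS; ++i)
        row[i] = value;
}

/* Counts pixels that would pass glStencilFunc(GL_EQUAL, 1, 1),
   i.e. pixels the lighting shader would run on. */
static size_t shaded_pixels(const unsigned char *row)
{
    size_t count = 0;
    for (size_t i = 0; i < ROW_PIXELS; ++i)
        if ((row[i] & 1u) == 1u)
            ++count;
    return count;
}

/* Draws light A (pixels 0..3), then light B (pixels 6..8), and returns how
   many pixels get shaded for light B. reset_between selects whether A's
   marks are zeroed before B is drawn. */
static size_t shaded_for_second_light(int reset_between)
{
    unsigned char row[ROW_PIXELS];
    memset(row, 0, sizeof row);
    mark(row, 0, 3, 1);          /* light A marks its pixels */
    if (reset_between)
        mark(row, 0, 3, 0);      /* erase A's marks (or clear the buffer) */
    mark(row, 6, 8, 1);          /* light B marks its pixels */
    return shaded_pixels(row);
}
```

With the reset, only B's own pixels pass the test; without it, A's leftover marks pass too, so the shader runs on pixels the second light never covers.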
