

Member Since 01 Jan 2009
Offline Last Active Mar 17 2015 10:43 AM

Posts I've Made

In Topic: FBO Questions

05 July 2014 - 05:34 AM



The hardware depth test has certain optimizations in place which can significantly speed up the rendering of occluded fragments. However, those optimizations require additional memory, which is why depth attachments are more than a simple texture. You can't just use them as a regular render target, write to them, and still expect that additional memory for the optimizations to stay consistent.



This. Keep in mind that writing directly to the depth buffer means no early Z rejection. The GPU can "preemptively" reject fragments before executing the fragment shader by testing the depth value interpolated from the vertex shader outputs.


If you write to the depth buffer directly, the GPU has to execute the fragment shader to know the actual depth value, thus no early Z rejection is possible.
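A minimal GLSL sketch of that point (vLinearDepth is a hypothetical varying; any write to gl_FragDepth, whatever the value, disables early Z for that draw):

```glsl
#version 330 core
// Writing gl_FragDepth forces the GPU to run this shader for every
// fragment *before* the depth test can reject it, so early Z is off.
in float vLinearDepth;   // hypothetical varying from the vertex shader
out vec4 fragColor;

void main()
{
    gl_FragDepth = vLinearDepth; // custom depth output -> no early Z
    fragColor = vec4(1.0);
}
```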


The reason I initially wanted to do this is because my shadow maps for directional lights have 4 cascades, and I tried generating them simultaneously by quadrupling the geometry in the geometry shader. If I rendered only to the depth buffer, I was able to avoid allocating an additional 4 R32F 2048x2048 color attachment slices.

After some profiling I noticed that this isn't really any faster than generating the cascades in individual passes though, provided that I use decent frustum culling for the individual cascades.
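The single-pass approach described above could be sketched roughly like this (a hedged sketch, assuming a layered 2048x2048x4 array attachment and an invented uCascadeMVP uniform; the vertex shader is assumed to pass positions through untransformed):

```glsl
#version 330 core
// Emit each incoming triangle once per cascade, routed to that
// cascade's layer of the shadow-map texture array via gl_Layer.
layout(triangles) in;
layout(triangle_strip, max_vertices = 12) out;

uniform mat4 uCascadeMVP[4]; // hypothetical per-cascade light matrices

void main()
{
    for (int cascade = 0; cascade < 4; ++cascade) {
        for (int i = 0; i < 3; ++i) {
            gl_Layer    = cascade;
            gl_Position = uCascadeMVP[cascade] * gl_in[i].gl_Position;
            EmitVertex();
        }
        EndPrimitive();
    }
}
```

As the post notes, quadrupling primitives in the geometry shader is not necessarily a win over four culled passes, since geometry shader amplification has its own cost.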




I think the best option is to allocate 3 buffers: the depth attachment you render into, the intermediate texture, and the final texture. Note that the first 2 can be reused by other lights after the filtering, as long as the other shadow maps have the same or a smaller size.

This is precisely what I am doing now. It also makes it much easier to store either a linear or an exponential depth value, which is what ESM needs.


Thanks again!


In Topic: FBO Questions

04 July 2014 - 03:22 PM



You could write the depth values yourself onto that 32 bit float texture as a regular color attachment, then sample/modify them as you please, and just attach a regular RBO to the depth attachment.


Now, I'm not sure if you could actually create a depth texture, attach it to the FBO as depth attachment, and also bind it to a texture image unit and sample from it.


Then again, if you're doing the writing in a final pass and you don't need to simultaneously sample from the depth buffer, you could always write what you need to the depth attachment directly in the fragment shader.

I've tried a few more combinations, and it seems like it is not possible to create a texture that can be used both as a depth attachment and as a color attachment.

But as you've mentioned, I can either simply output to gl_FragDepth for the final pass, or for the shadow map generation, output the depth to a color attachment, using a shared RBO across all shadow maps.
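The shadow-map generation variant could look roughly like this (a sketch, assuming ESM-style exponential storage; vViewDepth and uEsmConstant are invented names):

```glsl
#version 330 core
// Shadow-map pass: write exponential depth into the R32F color
// attachment while a shared RBO handles the actual depth testing,
// so early Z stays enabled.
in float vViewDepth;        // hypothetical linear depth from the VS
uniform float uEsmConstant; // ESM sharpness constant, e.g. ~80.0
out float outDepth;         // bound to the R32F color attachment

void main()
{
    outDepth = exp(uEsmConstant * vViewDepth);
}
```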


Thank you for all the helpful comments!



In Topic: FBO Questions

04 July 2014 - 10:31 AM


It is explicitly not disallowed, because there are useful operations in this area (e.g. updating a particle system stored in a texture).


As Ohforf says, reading different pixels than you are writing is asking for trouble.

Good to know, thanks!




Nope. Just use GL_RED, GL_R32F with GL_FLOAT as the type. Depth is just a value; read/write it like any other texture (because it is one).

But I can't use GL_R32F as a depth attachment when generating the shadow map, can I?
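For reference, the two allocations look like this (a sketch, assuming a current GL 3.x context; to my understanding only GL_DEPTH_COMPONENT* internal formats are valid for GL_DEPTH_ATTACHMENT, while GL_R32F only works as a color attachment):

```c
GLuint colorTex, depthTex;

/* R32F texture: usable as a color attachment and sampled freely. */
glGenTextures(1, &colorTex);
glBindTexture(GL_TEXTURE_2D, colorTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_R32F, 2048, 2048, 0,
             GL_RED, GL_FLOAT, NULL);

/* Depth texture: required format family for the depth attachment. */
glGenTextures(1, &depthTex);
glBindTexture(GL_TEXTURE_2D, depthTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT32F, 2048, 2048, 0,
             GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
```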

In Topic: FBO Questions

04 July 2014 - 08:11 AM



@Question 1: I presume you want to reuse the memory of the shadow map? Because otherwise there is no reason to use a DepthComponent texture as the final target.

Yes, that is correct. I would like to avoid allocating an extra texture for each shadow map if possible.




@Question 2: The results of doing that are usually "undefined", which means anything can happen (including the intended result), and the behavior can differ across vendors, driver versions, GPU loads, ... In short: don't do it.
The texture cache, through which you read, is not kept coherent with video memory, so writing a pixel does not affect the copy of that pixel in the texture cache, which is probably what you are seeing here. In most cases you can read and write the same texture if each thread reads only one pixel and it's the very pixel it writes, but in your case you are reading more than one pixel.
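One related note: if the NV_texture_barrier extension is available, it offers a defined way to split a draw sequence so that earlier writes become visible to later texture fetches from the same texture (a sketch; firstBatchCount/secondBatchCount are hypothetical):

```c
/* Writes from the first draw are made visible to texture reads
 * issued by the second draw by flushing the texture caches. */
glDrawArrays(GL_TRIANGLES, 0, firstBatchCount);
glTextureBarrierNV();
glDrawArrays(GL_TRIANGLES, 0, secondBatchCount);
```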

Makes perfect sense, thanks!

In Topic: glDrawArraysInstanced performance on Intel hd2000/3000

01 April 2014 - 02:50 PM

Thanks for the suggestions!


Unfortunately the HD 3000 does not support ARB_debug_output, and AFAIK Intel's Graphics Performance Analyzer doesn't support OpenGL on Windows yet.