Render Target Formats for HDR Deferred Shading

Hi, I am playing around with the idea of using deferred shading with HDR on newer hardware (say Shader Model 3.0, Nvidia 9000-series and up) and was wondering what would be the best way to go about it. I was also wondering how 64-bit floating-point textures work out performance- and quality-wise on newer hardware. Unfortunately, MRT restrictions make it impossible to use the HalfVector4 format alongside 32-bit formats, so I will either need to split the color information into two HalfVector2 textures, do a separate pass, or use the LogLuv method of fitting it all into 32 bits. Which would be the best option? [Edited by - r691175002 on October 20, 2008 10:20:17 PM]

Well, you'll want something that you can additively blend without problems, and LogLuv probably isn't such a great choice for that.

Why do you need MRT when you're rendering your HDR output? BTW, the restriction you're talking about, where all the RTs need to have the same bit depth... I believe that only applies to the Nvidia 6-series and 7-series.
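For reference, "additively blend" here just means ONE/ONE blending, so each light's contribution is summed into the target. A minimal sketch in raw D3D9 terms (not XNA-specific; error checking omitted):

```cpp
#include <d3d9.h>

// Sketch: additive blending for a light accumulation pass.
void SetAdditiveBlend(IDirect3DDevice9* dev)
{
    dev->SetRenderState(D3DRS_ALPHABLENDENABLE, TRUE);
    dev->SetRenderState(D3DRS_SRCBLEND,  D3DBLEND_ONE);   // source * 1
    dev->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_ONE);   // + dest * 1
    dev->SetRenderState(D3DRS_BLENDOP,   D3DBLENDOP_ADD);
}
```

Keep in mind that blending into FP16 targets requires hardware support at all; on the Nvidia side it has been there since the 6-series.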

MJP is right, unless the original poster wants to read HDR textures and store them in the G-buffer.

In that case I still wouldn't move to a 64-bit G-buffer. LogLuv could be OK, but you should also try another solution: store a multiplier in the alpha channel. The alpha channel is "almost useless" in a deferred shader; by "almost useless" I mean that sometimes you want to discard pixels according to the texture's alpha channel. I suggest writing two different shaders: one that converts HDR textures into 32-bit RGBA with a multiplier in the alpha channel, and another for non-HDR textures that supports alpha testing.
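Roughly, the encode/decode could look like this (an untested C++ sketch; MaxRange and the function names are just illustrative, not from any particular engine):

```cpp
#include <algorithm>

// Sketch: HDR color in 8-bit RGBA, with a shared multiplier in alpha.
// MaxRange is an arbitrary cap on the HDR values you expect to store.
const float MaxRange = 8.0f;

struct Rgba8 { unsigned char r, g, b, a; };

Rgba8 EncodeHdr(float r, float g, float b)
{
    // Use the largest component (clamped to the range) as the multiplier.
    float m = std::min(std::max(std::max(r, g), std::max(b, 1e-6f)), MaxRange);
    Rgba8 out;
    out.r = (unsigned char)(std::min(r / m, 1.0f) * 255.0f + 0.5f);
    out.g = (unsigned char)(std::min(g / m, 1.0f) * 255.0f + 0.5f);
    out.b = (unsigned char)(std::min(b / m, 1.0f) * 255.0f + 0.5f);
    out.a = (unsigned char)(m / MaxRange * 255.0f + 0.5f);  // the multiplier
    return out;
}

void DecodeHdr(Rgba8 in, float& r, float& g, float& b)
{
    float m = (in.a / 255.0f) * MaxRange;
    r = (in.r / 255.0f) * m;
    g = (in.g / 255.0f) * m;
    b = (in.b / 255.0f) * m;
}
```

The non-HDR variant would skip the multiplier and keep alpha for the test instead, which is why two shader variants are needed.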

All the games I know of that use a G-buffer use three 8:8:8:8 render targets to store the material data, plus one depth buffer.
This is a good choice for several reasons. On certain hardware, three RTs in an MRT are the sweet spot and four are slower, and 8:8:8:8 render targets are more than twice as fast as 16:16:16:16 because, for example, they have alpha-blending optimizations that the wider render targets do not (clears are slower too, etc.).
If you have an 8:8:8:8 G-buffer, you can only get HDR lighting by using higher precision in the lighting buffer.
There are several tricks to achieve this. There is something called Quasi-HDR, where you render everything into an 8:8:8:8 render target and store a scale value in the alpha channel, then resolve this into a 16:16:16:16 render target and post-FX takes it from there. There is also the L16uv or LogLuv model, which is pretty cool (sketched below). Because all of this only applies to your opaque objects, you will have the challenge of finding appropriate ways to apply lots of lights to your alpha-blended objects. Paying for a 16:16:16:16 render target especially hurts here; you might think about reducing quality for those objects by going with an 8:8:8:8 render target.
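To give a rough idea of the LogLuv model mentioned above, here is an untested C++ sketch. It uses the standard sRGB-to-XYZ matrix and the CIE u'v' chromaticities; the [-16, 16] log2 luminance range is an arbitrary choice of mine:

```cpp
#include <cmath>

// Sketch: LogLuv-style packing. Chromaticity (u', v') goes into two 8-bit
// channels; log2 luminance is spread across the remaining 16 bits, which is
// what lets a 32-bit target hold a wide dynamic range.
struct LogLuv { unsigned char u, v, LeHi, LeLo; };

LogLuv EncodeLogLuv(float r, float g, float b)
{
    // Linear sRGB (Rec. 709 primaries, D65 white point) to CIE XYZ.
    float X = 0.4124f * r + 0.3576f * g + 0.1805f * b;
    float Y = 0.2126f * r + 0.7152f * g + 0.0722f * b;
    float Z = 0.0193f * r + 0.1192f * g + 0.9505f * b;

    float denom = X + 15.0f * Y + 3.0f * Z + 1e-6f;
    float u = 4.0f * X / denom;  // CIE 1976 u', roughly [0, 0.62]
    float v = 9.0f * Y / denom;  // CIE 1976 v', roughly [0, 0.59]

    // Map log2(Y) from the assumed [-16, 16] range into 16 bits.
    float Le = (std::log2(Y + 1e-6f) + 16.0f) / 32.0f;
    Le = std::fmin(std::fmax(Le, 0.0f), 1.0f);
    unsigned int Le16 = (unsigned int)(Le * 65535.0f + 0.5f);

    LogLuv out;
    out.u    = (unsigned char)(u / 0.62f * 255.0f + 0.5f);
    out.v    = (unsigned char)(v / 0.59f * 255.0f + 0.5f);
    out.LeHi = (unsigned char)(Le16 >> 8);
    out.LeLo = (unsigned char)(Le16 & 0xFF);
    return out;
}
```

Decoding inverts each step. As MJP notes above, the catch is that values encoded this way cannot be additively blended.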

I'll take your advice and aim for 32-bit render targets. I will be rendering outdoor scenes with almost no transparency (or where there is transparency, it will be small enough that skipping lighting for it won't be an issue). I was thinking of handling vegetation and fence-type stuff with a method similar to the one outlined here ( http://www.kevinboulanger.net/grass.html ), which uses a mixture of alpha testing and blending so that artifacts only appear around the edges; that will hopefully reduce aliasing without looking ugly or murdering performance. Alpha will just write over the G-buffer before lighting.
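In raw D3D9 terms, the alpha-test half of that mix is just a few render states; a sketch (XNA exposes equivalent RenderState properties):

```cpp
#include <d3d9.h>

// Sketch: reject texels below an alpha threshold so that vegetation edges
// never write into the G-buffer. 'ref' is the cutoff, e.g. 128 for ~50%.
void EnableAlphaTest(IDirect3DDevice9* dev, DWORD ref)
{
    dev->SetRenderState(D3DRS_ALPHATESTENABLE, TRUE);
    dev->SetRenderState(D3DRS_ALPHAREF, ref);
    dev->SetRenderState(D3DRS_ALPHAFUNC, D3DCMP_GREATEREQUAL);
}
```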

I would like to be able to use some HDR in the G-buffer, such as an HDR skybox, so I think using alpha for intensity will be more than enough.

I am pretty new to this 3D stuff, so I also have a few more nagging questions: how do you prevent lighting areas with nothing in them (such as empty regions of a cleared buffer, or the skybox)? Do I need to do anything with the stencil buffer for the lighting passes?

Finally, for accumulating the lighting, I assume that I should be using additive blending on a floating-point surface?

After taking your advice into account, my plan is essentially:

8:8:8:8 - color + intensity
10:10:10:2 - normal XYZ, plus nothing or a material lookup (packing sketched below)
32 - depth

Unfortunately, I will probably have to throw in a fourth texture for lighting parameters, unless there is a way to use the depth buffer in XNA (PC). I also wouldn't mind storing motion X and Y vectors for motion blur.

Accumulate lights in a HalfVector4 target.

Combine the lighting and G-buffer into a new HalfVector4 and run that through post-processing.

Is this essentially how it's done?
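For what it's worth, the 10:10:10:2 normal packing in that layout is just a bias and scale per channel. A rough C++ sketch (the channel order depends on the exact surface format, so treat it as illustrative):

```cpp
#include <cstdint>

// Sketch: pack a unit normal into a 10:10:10:2 word by mapping [-1, 1]
// to [0, 1023] per component; the 2-bit field is free for a material flag.
uint32_t PackNormal1010102(float nx, float ny, float nz, uint32_t flag2)
{
    uint32_t x = (uint32_t)((nx * 0.5f + 0.5f) * 1023.0f + 0.5f);
    uint32_t y = (uint32_t)((ny * 0.5f + 0.5f) * 1023.0f + 0.5f);
    uint32_t z = (uint32_t)((nz * 0.5f + 0.5f) * 1023.0f + 0.5f);
    return x | (y << 10) | (z << 20) | ((flag2 & 0x3u) << 30);
}

void UnpackNormal1010102(uint32_t p, float& nx, float& ny, float& nz)
{
    nx = ((p      ) & 1023u) / 1023.0f * 2.0f - 1.0f;
    ny = ((p >> 10) & 1023u) / 1023.0f * 2.0f - 1.0f;
    nz = ((p >> 20) & 1023u) / 1023.0f * 2.0f - 1.0f;
}
```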

You might consider Blizzard's approach with Starcraft 2 where they used 4 x RGBA16 targets for the G-Buffer. This is handy for things like HDR light values, linear colors, high precision for normals, linear depth, etc. So you can store all this data in its raw form, without worrying about a packing/unpacking step.

If you'd like to look into it, here's a link to the pdf:
http://www.scribd.com/doc/4898192/Graphics-TechSpec-from-StarCraft-2

Quote:
Original post by n00body
You might consider Blizzard's approach with Starcraft 2 where they used 4 x RGBA16 targets for the G-Buffer. This is handy for things like HDR light values, linear colors, high precision for normals, linear depth, etc. So you can store all this data in its raw form, without worrying about a packing/unpacking step.

If you'd like to look into it, here's a link to the pdf:
http://www.scribd.com/doc/4898192/Graphics-TechSpec-from-StarCraft-2


It's convenient alright, but you'll sure pay for that convenience. You're talking double the bandwidth and storage requirements. IMO it's well worth taking the time to write some code for packing and unpacking your G-Buffer.

I agree that Blizzard's setup strikes me as quite excessive, but it's nice to know that it is possible to use as much as 256 bits per pixel in the G-buffer. I'm sure Blizzard didn't take that step without making absolutely sure it was a practical solution.

... they have fallback paths ... otherwise they'd lose most of the market out there :-) ... and StarCraft 2 has not shipped so far, as far as I remember. They might change this when they are in QA and figure out that their target market is quite small with that setup.

Yeah, I don't really want to spend my time coding fallbacks, which is why I'm trying to limit myself to recent Shader Model 3.0+ hardware. I'm even considering working only with 4.0, since I'm expecting it to take at least a year to finish anything worthwhile, but I'm sure XP will still be around, so I'm holding back.

I don't mind aiming a little higher than the current hardware, but I don't want to do anything stupid either.

Considering that they stuff normals and depth into their G-buffer, wouldn't the drop from 16-bit to 8-bit make the arrangement unusable?

On a side note, S.T.A.L.K.E.R. used a similar G-Buffer, also consisting of RGBA16 textures.

Quote:
Original post by n00body
On a side note, S.T.A.L.K.E.R. used a similar G-Buffer, also consisting of RGBA16 textures.

The Leadwerks engine uses the A2R10G10B10 format for normals, which seems to work out pretty well.
I took another look at the S.T.A.L.K.E.R. article in GPU Gems, though, and you are right; this line seems to hit the issue perfectly:

Quote:
Because of a limitation of current graphics hardware that requires the same bit depth for surfaces used in the MRT operation, the albedo-gloss surface was expanded to full A16R16G16B16F. This encoding could be seen as a waste of video memory bandwidth—and it is indeed. However, this "wasteful" configuration was actually the fastest on a GeForce 6800 Ultra, even when compared to smaller, lower-quality alternatives, because of its much simpler decoding.

On the other hand, they are only using three render targets (they don't need depth).
Just out of interest, what is the expected performance cost of splitting the creation of the G-buffer into two passes? I am having a hard time imagining extreme performance costs, since post-processing, reflection, refraction, shadow maps, and many other techniques all seem to make use of multiple passes.

From what I've read, three MRTs may be the sweet spot, but splitting into multiple passes is a no-no. Using four MRTs at once may be a bit slower, but it is only one pass for all objects. If you have a lot of objects on screen, you have to re-render all of them for a second pass, killing most of the benefit you would get from hitting the MRT sweet spot.

As for your comment on Leadwerks, they take a totally different approach: they recover view-space position from the post-projection depth. While this has the advantage of reducing the number of buffers you need, it has the disadvantage of requiring more complicated math.

Blizzard went with the additional 16-bit depth buffer to store linear depth so that they could recover the object position with very simple math. It's the same approach that Crytek took to this problem.
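To make the "very simple math" concrete, here is a rough sketch of the linear-depth reconstruction (the names are illustrative, not taken from either engine):

```cpp
// Sketch: recover view-space position from linear depth. 'frustumRay' is
// the per-pixel direction toward the far plane, interpolated from the
// frustum corners and scaled so that frustumRay.z == 1. Reconstruction is
// then a single multiply; no inverse-projection transform or divide.
struct Vec3 { float x, y, z; };

Vec3 ReconstructViewPos(Vec3 frustumRay, float linearViewDepth)
{
    Vec3 p;
    p.x = frustumRay.x * linearViewDepth;
    p.y = frustumRay.y * linearViewDepth;
    p.z = frustumRay.z * linearViewDepth;  // equals linearViewDepth
    return p;
}
```

By contrast, reconstructing from the post-projection depth means transforming (x, y, depth, 1) by an inverse projection matrix and then dividing by w.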

Crytek uses a Z Pre-Pass renderer.

<shameless plug> I think that a Light Pre-Pass is a much more clever solution to the problem of having multiple lights :-). More and more games are picking it up </shameless plug>

Quote:
Original post by wolf
Crytek uses a Z Pre-Pass renderer.


He's talking about how they store linear depth and reconstruct world-space position from it for shadow calculations and other effects. If you're going to render out a depth texture, it's probably the best way to do it, since it lets you avoid a matrix multiplication when reconstructing position.

Quote:
Original post by wolf
<shameless plug> I think that a Light Pre-Pass is a much more clever solution to the problem of having multiple lights :-). More and more games are picking it up </shameless plug>


Light Pre-Pass? Sounds like rubbish. :P

In all seriousness, I agree; it's a very neat approach. Personally, I'm a fan of any deferred technique that allows the use of MSAA without resorting to supersampling in the shader, even if it results in a few artifacts. [smile]

Naughty Dog also had a pretty neat approach in Uncharted, which they describe here.

I took a look at the Light Pre-Pass and I like it a lot, but I am a little unsure of the best way to store the lighting properties. Ideally I would like to have a diffuse RGB and a specular RGB, to which I can apply an arbitrary exponent during the main pass. As far as I can guess, the best way to do this is to store the diffuse and specular coefficients (using a power of 1) along with the color of the light, and then apply the power and multiply both coefficients by the light color in the main pass.

The problem is that this ends up with five values, and I have no clue what else can be put in or cut out (although I suppose specular color isn't really worth a second render target...). Since I would like HDR lighting, I would be using floating-point textures for the lighting accumulation.
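For what it's worth, the usual compromise in a Light Pre-Pass is to drop the separate specular color and accumulate a monochrome specular term in alpha. Note that the exponent has to be applied per light, before accumulation, so it is typically fetched from the G-buffer rather than applied in the main pass. A rough, untested C++ sketch (the luminance weights and the names are my own illustration, not from Wolf's article):

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Sketch: one light's contribution to the light buffer. RGB accumulates the
// diffuse term N.L * lightColor * attenuation; alpha accumulates a single
// monochrome specular term so everything fits in one RGBA target.
Vec4 LightBufferTerm(Vec3 N, Vec3 L, Vec3 H, Vec3 lightColor,
                     float attenuation, float specPower)
{
    float ndl  = std::max(Dot(N, L), 0.0f);
    // Mask the specular term so back-facing lights contribute nothing.
    float spec = (ndl > 0.0f) ? std::pow(std::max(Dot(N, H), 0.0f), specPower)
                              : 0.0f;
    // Rec. 709 luminance collapses the light color to one specular channel.
    float lum = 0.2126f * lightColor.x + 0.7152f * lightColor.y
              + 0.0722f * lightColor.z;
    Vec4 out;
    out.x = ndl * lightColor.x * attenuation;
    out.y = ndl * lightColor.y * attenuation;
    out.z = ndl * lightColor.z * attenuation;
    out.w = spec * lum * attenuation;
    return out;
}
```

In the main pass you then multiply RGB by the material's diffuse albedo and alpha by its specular intensity, which loses per-light specular color but keeps everything in four channels.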

Also, would rendering linear depth + normals to a single HalfVector4 render target work?

I've looked at Wolf's entry about the Light Pre-Pass, but can someone explain the benefits of this method, as well as what it's actually supposed to replace?

---------------------------------------------------------------

I use a deferred renderer with two render targets: one R8G8B8A8 for color, with material properties indexed in the alpha channel, and one A32R32G32B32F to store normals and depth.
