
Deferred shading - Materials attributes


30 replies to this topic

#1 outlawhacker   Members   -  Reputation: 122

Posted 11 March 2008 - 11:04 AM

Hello, I'm looking into deferred shading to light my scenes, and I have now reached the point where I want to use different materials for my objects. In a forward renderer this is quite straightforward, since every object is shaded individually and I can just pass the needed material parameters to the shader in use. But with deferred shading it gets a little complicated to transfer the material attributes into the light phase efficiently. I first thought of storing a material ID in a texture, together with some variables used for the material, and then doing dynamic branching in the light shader. But this felt a bit ugly, and I'm probably bound to run out of instructions sooner or later. And in my experience dynamic branching in pixel shaders is very costly. I read a previous topic on GameDev touching on this subject: http://www.gamedev.net/community/forums/topic.asp?topic_id=396527&PageSize=25&WhichPage=1 AndyTX says there:
Quote:
As I mentioned, the branching is highly coherent and so the performance hit is minimal.
He's talking about dynamic branching in the light shader, but what does it mean that the branching is "highly coherent" and why is this a minimal performance hit? Excuse me if I'm stupid here :P I also read about deferred shading in S.T.A.L.K.E.R. (in GPU Gems 2), and as I understand it they store their materials in a 3D texture and do texture lookups in the light shader. AndyTX about this:
Quote:
Furthermore another option (that I believe they used in STALKER) is to put your BRDFs into a 3D texture and use the material ID to choose the slice. This is a really easy, really high performance and quite flexible way of doing different materials without branching or multiple passes.
Once again, sorry if I'm stupid, but how exactly is this done without multiple passes? I guess that the BRDF parameters are the ones to use in my lighting calculation, and these change every time I change the lighting, the object's position/orientation or the camera position/angle. So aren't these parameters written to a texture during the geometry phase, as with depth, normal and diffuse RGB? If so, I can only fit so much info in my textures - I'm on my 4th and final texture in the MRT geometry pass (plus some free space in the other textures). When and how is this 3D texture holding material attributes filled, and how does this eliminate the need for branching...? Thanks!

#2 mjshort   Members   -  Reputation: 122

Posted 11 March 2008 - 10:30 PM

I'm pretty sure they store a material ID in one of the render targets; this ID is then used to reference the slice of the 3D texture. I think they use values computed from the normal and light vectors (like N.L) as the texture coordinates. There is an example in the NVIDIA SDK, I think.

#3 ndhb   Members   -  Reputation: 239

Posted 11 March 2008 - 11:05 PM

Quote:
Original post by outlawhacker
He's talking about dynamic branching in the light shader, but what does it mean that the branching is "highly coherent" and why is this a minimal performance hit?

It means, roughly, that the outcome of the conditional does not change very often. As far as I understand, many GPUs process a whole batch of fragments at the same time. If you branch on some attribute of the fragments, and the branch is "highly coherent", the GPU can "guess" the result (branch prediction) and compute ahead in the branch that is usually taken. That may save a lot of work. The current trend is to process more and more fragments in parallel. Thus the benefit of branch prediction on modern GPUs can be very high. That is why you shouldn't be afraid to branch on high-end hardware (especially if the branch is "highly coherent"). I've noticed a huge difference in shader performance with dynamic branching on a GF8800 versus a GF7800. Unless you target older hardware, don't bend over backwards to avoid dynamic branching.
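
One simplified way to picture "highly coherent" is a toy cost model. The sketch below assumes that fragments are shaded in fixed-size batches and that a batch pays for every branch path taken by at least one of its fragments; this is a batch-execution assumption rather than prediction in the CPU sense. The batch width and path costs are made-up numbers, not measurements of any GPU.

```python
# Toy cost model (an assumption, not a description of any specific GPU):
# fragments are shaded in fixed-size batches, and a batch pays for every
# branch path that at least one of its fragments takes.

def batch_cost(material_ids, path_cost, batch_size):
    """Total cost of shading `material_ids` in batches of `batch_size`."""
    total = 0
    for start in range(0, len(material_ids), batch_size):
        batch = material_ids[start:start + batch_size]
        # Each distinct material in the batch forces its path to run once
        # for the whole batch.
        total += sum(path_cost[m] for m in set(batch))
    return total

PATH_COST = {0: 10, 1: 30}  # hypothetical per-batch cost of each material path
BATCH = 4                   # hypothetical batch width

coherent = [0, 0, 0, 0, 1, 1, 1, 1]    # neighbouring fragments share a material
incoherent = [0, 1, 0, 1, 0, 1, 0, 1]  # material changes every fragment

coherent_cost = batch_cost(coherent, PATH_COST, BATCH)
incoherent_cost = batch_cost(incoherent, PATH_COST, BATCH)
print(coherent_cost, incoherent_cost)
```

Under these assumptions the coherent ordering costs half as much, because no batch has to run both material paths. Material IDs in a G-buffer tend to look like the coherent case, since neighbouring pixels usually belong to the same object.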



#4 outlawhacker   Members   -  Reputation: 122

Posted 12 March 2008 - 03:28 AM

ndhb, I did not know that. That sounds cool :)

mjshort - When and where is this 3D texture written to? My material attributes change all the time, so how could I store only a material ID?

I'll have a look in the NVIDIA SDK as well, thanks.

#5 wolf   Members   -  Reputation: 823

Posted 12 March 2008 - 06:42 AM

This is one of the obvious drawbacks of deferred renderers and also the main reason why people do not want to use them :-) .. branching in the light shader is probably not expensive if you have a high-end card but on cards that are actually used in the PC market or in consoles it is so expensive that you don't want to think about it.
Look for Light indexed renderer or Light pre-pass renderer to see designs that are superior to the deferred renderer :-)

#6 outlawhacker   Members   -  Reputation: 122

Posted 13 March 2008 - 01:43 AM

Quote:
Look for Light indexed renderer or Light pre-pass renderer to see designs that are superior to the deferred renderer :-)


Looks like a cool technique, but I would still like to know how to do different materials with my deferred renderer :P

mjshort - I have looked through the NVIDIA samples without finding anything good about materials. I found one about deferred shading, but nothing about using materials there... Any specific sample you were thinking of?

How is STALKER's way of storing only a material ID gonna help me get needed attributes from a certain material? Can someone please explain this technique to me like I was born yesterday :)

#7 draktheas   Members   -  Reputation: 122

Posted 13 March 2008 - 04:16 AM

Wolf, can you elaborate on your "Light indexed renderer or Light Pre-pass Renderer" a little? There are no details about it in your blog, and your blog appears to be the only Google result for those terms. Any concrete info would be greatly appreciated.

Drak

#8 AdAvis   Members   -  Reputation: 518

Posted 13 March 2008 - 06:30 AM

Quote:
Original post by draktheas
Wolf, can you elaborate on your "Light indexed renderer or Light Pre-pass Renderer" a little? There are no details about it in your blog, and your blog appears to be the only Google result for those terms. Any concrete info would be greatly appreciated.

Drak


I'm not wolf but you should check out this link. There is a paper in the downloads section. On a very high level what a light indexed deferred renderer does is store light information in the (much lighter and no longer aptly named) g-buffer in one pass. Subsequently geometry is rendered and lit by the light info in the g-buffer and by whatever funky illumination technique you're using.
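
As a very rough illustration of the two steps described above, here is a sketch on a hypothetical one-dimensional "screen". The light list, the pixel ranges and the function names are all assumptions made for the example; a real light-indexed renderer packs a small, fixed number of light indices per pixel into a render target instead of using Python lists.

```python
# Hypothetical 1D "screen" of 6 pixels; each light covers a pixel range.
LIGHTS = [
    {"intensity": 1.0, "covers": range(0, 4)},
    {"intensity": 0.5, "covers": range(2, 6)},
]
NUM_PIXELS = 6

def build_light_index_buffer(lights, num_pixels):
    """Pass 1: per pixel, record the indices of the lights that reach it."""
    buffer = [[] for _ in range(num_pixels)]
    for index, light in enumerate(lights):
        for pixel in light["covers"]:
            buffer[pixel].append(index)
    return buffer

def shade_fragment(pixel, albedo, index_buffer, lights):
    """Pass 2: ordinary forward shading, but only with the indexed lights."""
    return albedo * sum(lights[i]["intensity"] for i in index_buffer[pixel])

index_buffer = build_light_index_buffer(LIGHTS, NUM_PIXELS)
print(index_buffer[2], shade_fragment(2, 0.5, index_buffer, LIGHTS))
```

Because the second pass is a normal geometry pass, each object can still run its own material shader; only the list of relevant lights is deferred.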

#9 Bakura   Members   -  Reputation: 139

Posted 13 March 2008 - 07:16 AM

I was going to create the same topic :). Due to the lighting system I use (instant radiosity), I have to use deferred rendering. But now I have a problem I can't solve. Let's say I have two boxes, and I want to apply a simple normal mapping shader to one of them. Since I draw a quad that covers the entire screen, how can I recognize which box is the right one to apply normal mapping to? And what if I want to remove the normal mapping when the object is too far from the camera?

I don't quite understand the material ID... Could you explain it more clearly?

Wolf > When can we expect to learn more about your Light Pre-Pass Renderer? :)

#10 wolf   Members   -  Reputation: 823

Posted 13 March 2008 - 07:30 AM

Quote:
Wolf > When can we expect to learn more about your Light Pre-Pass Renderer? :)
I hope to find some time to write something up on the weekend, and I would then follow up in this forum next week ... I have been trying to do this for three months, but there is always something else with higher priority happening on the weekend. In the meantime we nearly have it running in a game here, which is a good proof of concept :-)

#11 Bakura   Members   -  Reputation: 139

Posted 13 March 2008 - 07:37 AM

I'm looking forward to getting a look at that method, it would be a great gift for my 18th birthday and my new computer I'll have (so, it'll be its first paper :D).

#12 AndyTX   Members   -  Reputation: 798

Posted 13 March 2008 - 12:27 PM

First, I'll quote the reply that I just PMed to the original poster (I didn't notice this thread before I replied!):

"The best explanation is in the STALKER chapter in GPU Gems 2, so I'd recommend checking that out for all of the details.

The quick summary is that you're not actually rendering to a 3D texture... you have a static 3D texture in which you store BRDFs - each of the "slices" in the 3D texture is one "material". When rendering the g-buffer you just write out a material ID and use that to select which slice of the 3D texture to use when shading. The texture itself represents a 2D-parameterized BRDF. IIRC in STALKER the parameters are NdotL and NdotH. Thus when they want to evaluate lighting for a given light, they just compute NdotL and NdotH and look up their BRDF 3D texture at coordinates (NdotL, NdotH, material), where "material" comes from the g-buffer.

It's a pretty simple approach that doesn't require dynamic branching, is quite efficient and is fairly flexible. Check out the chapter in GPU Gems 2 for more info."
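
The lookup described in this summary can be sketched as follows. This is a minimal illustration, not STALKER's data or code: the table resolution, the two example materials and the formula used to fill each slice are assumptions, and a GPU would normally do a filtered texture fetch where this sketch picks the nearest texel.

```python
# Sketch of a material-indexed BRDF lookup table. All names, sizes and
# table contents here are illustrative assumptions.

RES = 4  # hypothetical resolution along the NdotL and NdotH axes

def build_slice(diffuse_weight, specular_power):
    """Precompute one material slice as a RES x RES table indexed [nh][nl]."""
    table = []
    for j in range(RES):
        n_dot_h = j / (RES - 1)
        row = []
        for i in range(RES):
            n_dot_l = i / (RES - 1)
            row.append(diffuse_weight * n_dot_l + n_dot_h ** specular_power)
        table.append(row)
    return table

# The static "3D texture": one precomputed slice per material ID, built once.
BRDF_TEXTURE = [
    build_slice(1.0, 1),   # material 0: hypothetical matte-ish surface
    build_slice(0.2, 16),  # material 1: hypothetical shiny surface
]

def clamp01(x):
    return min(max(x, 0.0), 1.0)

def lookup_brdf(n_dot_l, n_dot_h, material_id):
    """Nearest-texel fetch at (NdotL, NdotH, material); no branch on material."""
    i = round(clamp01(n_dot_l) * (RES - 1))
    j = round(clamp01(n_dot_h) * (RES - 1))
    return BRDF_TEXTURE[material_id][j][i]

# Light pass for one pixel: the material ID comes from the g-buffer.
print(lookup_brdf(1.0, 1.0, 0), lookup_brdf(1.0, 1.0, 1))
```

The table is filled once at load time, so per-frame changes in lighting or camera only change the coordinates (NdotL, NdotH) used for the fetch; the g-buffer needs just the material ID.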

A few more bullets:

* Modern fragment shaders have caused people to confuse the concepts of surface (BSDF) and light shaders (BRDF). The surface shaders in a deferred renderer run in the G-buffer creation pass and can be/are different per-object similar to forward renderers. This applies to normal mapping vs non, different colors, etc. etc... that's all fine with deferred shading. Only the BRDFs are evaluated in the deferred pass.

* You don't really need that many BRDFs... really one or two is usually fine for most scenes. Seriously, most people who freak out about this particular "limitation" of deferred rendering aren't really understanding the above bullet. That's not to say this is always true, but really... go re-read it then read some good books on the topic.

* If you need many BRDFs, use dynamic branching or table lookups. They're both fast and quite efficient. You can optionally use something like the light-index method that effectively defers "less" of the process, but honestly I'm doubtful that in a properly structured shading engine that separates surface and light data you'll notice much improvement in either performance or functionality (you may even get slower and less functional for more effort...).

* If you have some object or material that absolutely does not fit into the above model (usually because you're doing something odd or wrong, but whatever), just render it using forward rendering at the end. It's ugly, but probably so is what you're trying to do ;)
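
To make the surface/light split in the bullets above concrete, here is a small sketch. The texel layout, the two surface shaders and the two BRDFs are hypothetical; the point is only that per-object variation lives in the geometry pass, while the light pass runs one shader that branches on (or looks up by) the material ID.

```python
from dataclasses import dataclass

@dataclass
class GBufferTexel:
    normal: tuple      # unit surface normal
    albedo: tuple      # diffuse colour written by the surface shader
    material_id: int   # selects the BRDF in the light pass

# Geometry pass: one surface shader per object, as in a forward renderer.
def surface_red_plastic(normal):
    return GBufferTexel(normal, (1.0, 0.0, 0.0), material_id=1)

def surface_grey_stone(normal):
    return GBufferTexel(normal, (0.5, 0.5, 0.5), material_id=0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Light pass: runs once per lit pixel and only evaluates the BRDF.
def shade(texel, light_dir, half_vec):
    n_dot_l = max(dot(texel.normal, light_dir), 0.0)
    n_dot_h = max(dot(texel.normal, half_vec), 0.0)
    if texel.material_id == 1:   # hypothetical glossy BRDF
        specular = n_dot_h ** 32
    else:                        # hypothetical matte BRDF
        specular = 0.0
    return tuple(n_dot_l * (c + specular) for c in texel.albedo)

up = (0.0, 0.0, 1.0)
print(shade(surface_grey_stone(up), up, up))
print(shade(surface_red_plastic(up), up, up))
```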

I don't mean to come off as harsh, but people seem to be spending a lot of time trying to "solve" or get around something that really is entirely a non-issue. Look at STALKER: it looks great and they didn't even have to do anything too fancy.

#13 bzroom   Members   -  Reputation: 640

Posted 13 March 2008 - 12:43 PM

Quote:
Original post by Bakura
I was going to create the same topic :). Due to the lighting system I use (instant radiosity), I have to use deferred rendering. But now I have a problem I can't solve. Let's say I have two boxes, and I want to apply a simple normal mapping shader to one of them. Since I draw a quad that covers the entire screen, how can I recognize which box is the right one to apply normal mapping to? And what if I want to remove the normal mapping when the object is too far from the camera?

I don't quite understand the material ID... Could you explain it more clearly?

Wolf > When can we expect to learn more about your Light Pre-Pass Renderer? :)


Deferred rendering as you know is a two step process. I believe that the normal mapping should be done in GBuffer creation. Since the Gbuffer render pass is nearly identical to a forward shader's, very little has changed for normal mapping.

The only thing the shading pass needs to know is the surface normal for the light. It doesn't need to know anything about a normal map. There are very few properties that I can think of that actually affect the light itself and not the surface.
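
A minimal sketch of that idea, assuming a standard tangent-space normal map and an orthonormal tangent frame (the function names are made up for the example): the geometry pass decides per object whether to perturb the normal, and the light pass only ever sees the result.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def decode_normal_map(rgb):
    """Map a normal-map texel in [0, 1] to a tangent-space vector in [-1, 1]."""
    return tuple(2.0 * c - 1.0 for c in rgb)

def gbuffer_normal(tangent, bitangent, normal, normal_map_rgb=None):
    """World-space normal that the geometry pass writes to the G-buffer.

    Pass normal_map_rgb=None for objects drawn without normal mapping,
    for example a box that is far from the camera.
    """
    if normal_map_rgb is None:
        return normalize(normal)
    tx, ty, tz = decode_normal_map(normal_map_rgb)
    world = tuple(tx * t + ty * b + tz * n
                  for t, b, n in zip(tangent, bitangent, normal))
    return normalize(world)

T, B, N = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
print(gbuffer_normal(T, B, N))                   # box without normal mapping
print(gbuffer_normal(T, B, N, (1.0, 0.5, 0.5)))  # box with a normal-map texel
```

Which box gets normal mapping is then just a question of which shader (or which argument) is used when that box is drawn into the G-buffer.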

I'm sorry if someone else mentioned this, I haven't read the entire thread yet.

#14 wolf   Members   -  Reputation: 823

Posted 13 March 2008 - 01:12 PM

Let me repeat why the deferred renderer idea did not work out in games:

1. you build your renderer following the graphics hardware design, not the other way around .. otherwise you lose performance. Most hardware was built with a Z pre-pass renderer in mind. Let's say Intel releases Larrabee: you will want to build the renderer in a way that squeezes the highest performance out of Larrabee. This will be very interesting, because it might require a renderer design that is different from the one for other GPUs, and it won't be what we now call deferred or forward ... it will be different :-)
2. most currently available graphics cards do not have enough bandwidth to run a deferred renderer ... this is also true for the PS3 :-) ... so you lose large parts of the market, and that makes it rather unattractive to build games like this
3. MSAA on DX9 hardware is not really easy ... on console platforms neither ... costs additional cycles to get it done ... more expensive than with a Z pre-pass renderer
4. There is no way to implement a halfway decent material system with a deferred renderer. A decent character setup with a skin, hair, cloth, eyes, eye-lashes shader is just not possible, so your games will look much worse.

Please do not forget that an NVIDIA 8800 GTS is probably 10 times faster than an 8200, or whatever they call their low-end model, but makes up only < 1% of the market, and probably of your target market as well. Even high-end console graphics chips are much slower than this card ... roughly comparable to an NVIDIA 7600 GT with a 128-bit bus. So from a financial and a quality standpoint it is not a good idea to do this :-)

#15 wolf   Members   -  Reputation: 823

Posted 13 March 2008 - 01:16 PM

oh and let me follow up with the arguments that are brought up usually now:

1. Killzone 2 has not shipped so far :-)
2. any other game that ever shipped with a deferred renderer has or will have serious problems to have nice looking characters :-) ... watch out for it.

btw: I helped to ship a game that uses a deferred renderer ... just want to save you the pain :-)

#16 AndyTX   Members   -  Reputation: 798

Posted 13 March 2008 - 02:32 PM

Quote:
Original post by wolf
1. you build your renderer following the graphics hardware design, not the other way around .. otherwise you lose performance. Most hardware was built with a Z pre-pass renderer in mind.

Deferred rendering really doesn't step on any toes here. There's nothing inefficient about the way that DR runs on modern graphics hardware. Indeed it is extremely SIMD-friendly and has the most predictable, cache-friendly data access patterns possible.

Quote:
Original post by wolf
2. most currently available graphics cards do not have enough bandwidth to run a deferred renderer ... this is also true for the PS3 :-) ... so you lose large parts of the market, and that makes it rather unattractive to build games like this

Sure, but currently mid-high end PC cards (certainly 8800GT+, and that card is $150 now!!) can easily handle deferred rendering. Thus it's only a matter of time before even the "low end" can. Quite honestly, the graphics card in the PS3 was outdated before it even arrived (G80 came out a week earlier IIRC). Thus I have no problem with DR being unsuitable for implementation on current console hardware, but it will have absolutely no problems on next generation stuff, and is already more than feasible on PCs.

Quote:
Original post by wolf
3. MSAA on DX9 hardware is not really easy ... on console platforms neither ... costs additional cycles to get it done ... more expensive than with a Z pre-pass renderer

Fine, but it works okay on DX10 and there are other options for AA. It's a disadvantage, sure, but it's not crippling.

Quote:
Original post by wolf
4. There is no way to implement a halfway decent material system with a deferred renderer. A decent character setup with a skin, hair, cloth, eyes, eye-lashes shader is just not possible, so your games will look much worse.

I have to completely disagree with this. Look at STALKER: it looks great and has actually stood the "graphics test of time" better than most games so far. I don't see anything in DR that makes the effects that you mention unreasonable or even difficult to implement. Again, if you're actually considering a reasonable lighting model (i.e. all lights work properly on all materials), there really are no problems. Even if you aren't doing "proper lighting" there are ways to hack things, just like with forward rendering.

Quote:
Original post by wolf
Please do not forget that an NVIDIA 8800 GTS is probably 10 times faster than an 8200, or whatever they call their low-end model, but makes up only < 1% of the market, and probably of your target market as well. Even high-end console graphics chips are much slower than this card ... roughly comparable to an NVIDIA 7600 GT with a 128-bit bus. So from a financial and a quality standpoint it is not a good idea to do this :-)

As I said, while that may be true today (and your note about 8800 penetration is not true according to the Steam survey - 8800s already make up a full 10%!!), if you're starting to code a game today, you can reasonably afford to target 8800-class hardware as a baseline.

In any case this is all tangential... there is nothing inherent about deferred rendering that makes it unsuitable for use on current or future GPUs. Indeed it is a very GPU-friendly algorithm and as we move forward it has quite a number of performance complexity and code simplicity benefits.

If you want to talk about specific hardware being unsuitable, fine, but for the record I had a fairly full-featured deferred renderer happily running on GeForce 6's and there's no way that one can argue that it's unsuitable for an 8800 or similar. PS3/360 are so last-gen ;)

Quote:
Original post by wolf
2. any other game that ever shipped with a deferred renderer has or will have serious problems to have nice looking characters :-) ... watch out for it.

I don't see that at all. Indeed as I mentioned, STALKER (being the biggest DR example I can think of now) looks very good, especially considering when it came out! Furthermore I can easily make a counter-claim: "any other game that ever shipped with a forward renderer has or will have serious problems with its lighting". I don't think either statement is true really.

Anyways, in the end definitely use what works for your game - the thing that bugs me is people being put off something potentially useful because of misinformation. DR is really easy to code, so at least do yourself the favour of trying it before dismissing it, particularly if you're working with complex scenes and lighting environments.

[Edited by - AndyTX on March 13, 2008 9:32:14 PM]

#17 wolf   Members   -  Reputation: 823

Posted 13 March 2008 - 05:23 PM

Quote:
Deferred rendering really doesn't step on any toes here.
Let me start by saying that I always enjoy a discussion like this :-) ... the problem is that it consumes much more bandwidth than other renderer designs, and following the development in graphics hardware, bandwidth is becoming more expensive relative to arithmetic instructions. In other words, arithmetic instructions are becoming cheaper faster than bandwidth ... so the scenario for a deferred renderer becomes worse with faster cards, because the advantage of other renderer designs grows there; put differently, the disadvantage of a deferred renderer becomes bigger with newer and faster graphics cards.
In general we move away from memory-intensive calculations by replacing them with arithmetic instructions. This trend is even stronger on hardware platforms that have less memory "throughput", like consoles and maybe Larrabee :-) ... so a future renderer design will want to replace four large MRTs that consume lots of bandwidth with smaller or fewer render targets and do more calculations instead.
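
For a sense of scale, here is a back-of-the-envelope estimate of the traffic caused by four render targets. The resolution, texel size, number of light reads per pixel and frame rate are assumptions picked for the example, and the model ignores the depth buffer, overdraw, caches and any hardware compression.

```python
def gbuffer_bytes_per_frame(width, height, num_targets, bytes_per_texel,
                            reads_per_pixel):
    """Bytes moved per frame: one write of every target plus repeated reads."""
    texels = width * height
    write = texels * num_targets * bytes_per_texel
    read = write * reads_per_pixel
    return write + read

# Hypothetical setup: 1280x720, four 32-bit render targets, each pixel
# re-read by 4 overlapping lights on average, 30 frames per second.
per_frame = gbuffer_bytes_per_frame(1280, 720, 4, 4, 4)
per_second = per_frame * 30
print(per_frame, per_second)
```

With these made-up numbers the G-buffer alone accounts for roughly 74 MB per frame, or about 2.2 GB/s, before any texture or vertex traffic is counted.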

#18 AndyTX   Members   -  Reputation: 798

Posted 13 March 2008 - 06:46 PM

Quote:
Original post by wolf
the problem is that it consumes much more bandwidth than other renderer designs, and following the development in graphics hardware, bandwidth is becoming more expensive relative to arithmetic instructions.

While that trend is true, it's actually not that relevant to a discussion of deferred rendering. There are several reasons why:

* DR requires an effectively fixed amount of memory bandwidth per framebuffer pixel. It only scales with lighting complexity, and forward rendering scales at a worse rate here. Thus for complex lighting environments, DR uses *less* memory bandwidth than forward rendering. For simple lighting environments it may use more, but it uses only a constant factor more that is already covered by current hardware's available bandwidth. Thus it will not be a bottleneck down the road, if it even still is.

* The memory access patterns of DR are entirely predictable and coherent. It's primarily *random access* and gather/scatter operations that cause stress on the memory subsystem, as predictable access patterns can be entirely prefetched/cached/local stored depending on your architecture. Sure, they still consume bandwidth, but again only a constant factor multiple of the framebuffer pixels.

* DR is also quite compatible with bandwidth-saving techniques such as batching up similar light volumes to avoid redundant reads of the g-buffer (although these are cached so not really a problem, see above). However, since the whole light volume aspect of DR is really just a performance optimization, there is an entire axis on which to trade off various memory/computation metrics.

* DR uses *significantly* less memory bandwidth and computation compared to the equivalent forward renderer for complex lighting situations, because such situations require multipass lighting in the forward renderer, which is a huge waste of all GPU resources.

So I'm not too concerned about the memory bandwidth usage of DR in the long run, as it isn't going to get much worse than it is currently. Indeed as lighting environments get more complex, DR scales nicely with the number of onscreen lit pixels (this is the optimal complexity). Forward renderers on the other hand remain suboptimal until you're dicing polygons up to subpixel sizes and resolving light contributions at that resolution... something which starts to sound a lot like deferred rendering ;)
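
The scaling argument can be illustrated with a toy count of shaded fragments. The scene below (object sizes, light count, pixels covered per light) is invented for the example, and the forward model assumes one extra geometry pass per object and light, which is the multipass case discussed above.

```python
def forward_multipass_cost(object_fragments, lights_per_object):
    """Fragments shaded when every (object, light) pair needs its own pass."""
    return sum(fragments * lights_per_object[name]
               for name, fragments in object_fragments.items())

def deferred_cost(screen_pixels, light_pixel_coverage):
    """One G-buffer fill, then each light shades only the pixels it covers."""
    return screen_pixels + sum(light_pixel_coverage)

# Hypothetical scene: two large objects, each touched by 8 small lights
# that cover about 20,000 screen pixels apiece at 1280x720.
objects = {"floor": 500_000, "wall": 300_000}
lights = {"floor": 8, "wall": 8}

forward = forward_multipass_cost(objects, lights)
deferred = deferred_cost(1280 * 720, [20_000] * 8)
print(forward, deferred)
```

In this made-up scene the forward renderer shades every fragment of an object once per light that touches it, while the deferred cost grows only with the pixels each light actually covers.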

It should also be noted that all of the techniques such as light-indexed DR and similar that still use the rasterizer to resolve light contributions have exactly the same scaling characteristics as typical DR - they're just playing with the constant factors.

Certainly the trend towards ALU operations vs memory operations is relevant to things like computing the BRDFs... i.e. down the road it will be better to use dynamic branching than a BRDF lookup table as STALKER uses. However the same argument cannot be applied to DR vs FR as it's not a simple ALU/memory tradeoff there as discussed above.

#19 Bakura   Members   -  Reputation: 139

Posted 14 March 2008 - 01:08 AM

Quote:
Let me start by saying that I always enjoy a discussion like this :-) ... the problem is that it consumes much more bandwidth than other renderer designs, and following the development in graphics hardware, bandwidth is becoming more expensive relative to arithmetic instructions. In other words, arithmetic instructions are becoming cheaper faster than bandwidth ... so the scenario for a deferred renderer becomes worse with faster cards, because the advantage of other renderer designs grows there; put differently, the disadvantage of a deferred renderer becomes bigger with newer and faster graphics cards.
In general we move away from memory-intensive calculations by replacing them with arithmetic instructions. This trend is even stronger on hardware platforms that have less memory "throughput", like consoles and maybe Larrabee :-) ... so a future renderer design will want to replace four large MRTs that consume lots of bandwidth with smaller or fewer render targets and do more calculations instead.


And is your renderer design (Light Pre-Pass xxx) that scalable?

#20 wolf   Members   -  Reputation: 823

Posted 14 March 2008 - 06:59 AM

In real-world applications no one is really using a deferred renderer or a forward renderer as it is described in a lot of places. In practice you do a lot of things additionally.
When I am talking about a Z pre-pass renderer, this is also not a forward renderer :-) ... you already store depth data upfront. Then you usually also store shadow data upfront etc., sometimes normal data as well. The idea is to make the right number of rendering passes, reduce memory bandwidth, keep the pixel shader workload reasonable and let the GPU and CPU do the right amount of work. This is usually very game-specific, and you balance the game for it.
Everyone currently handles transparent objects separately in a forward renderer (this one really is a forward renderer). That breaks any consistent renderer design if you want to defer lots of data, because the Z pre-pass can't hold the transparent data. So there is no right or wrong, just in-between, and as I said the renderer design follows hardware, not software paradigms. Depending on what the hardware engineers had in mind, you want to build your renderer that way to squeeze out every cycle. Deferring lots of data does not fit any current hardware design, but might work better on future hardware.
I am now more motivated to write up my renderer design paper on the weekend.

Great discussion :-)



