Regarding MSAA in Killzone

Zipster

Deferred Rendering in Killzone (8.26 MB)

I'm guessing a lot of you have seen the above PDF (or maybe the original presentation). It goes through their deferred renderer implementation, nothing too new or fancy. But what caught my attention was slide 39, where they claim to have implemented MSAA. Naturally this piqued my interest, since Yann L posted a thread a few months ago asking about combining deferred rendering and MSAA, and the consensus was that it was either impossible to do without bad artifacts or less efficient than super-sampling.

On slide 21 they claim that a pro of their G-Buffer layout is that it allows MSAA, but my understanding is that you can't multisample data such as normals or depth because the lighting would be full of artifacts, and I don't see how their structure or layout offers any improvement. Secondly, on slide 39, they say that they read from the G-Buffer and run the lighting shader for "both samples"... both samples of what? I might just not be familiar with how graphics are done on PS3 hardware, but how are they actually accessing individual samples?

I'm at a slight disadvantage because I haven't looked at PS3 hardware in any real depth beyond its general architecture, and I wasn't at the original presentation where a lot of this might have been explained in more detail, so hopefully someone can help me out and clarify exactly what it is Guerrilla is doing with MSAA.

Quote:
Original post by Zipster
... Yann L posted a thread a few months ago asking about combining deferred rendering and MSAA, and the consensus was that it was either impossible to do without bad artifacts or less efficient than super-sampling


I haven't seen this discussion, but in my experience this indeed holds true on the PC platform. The problem stems from limitations in the API/hardware, though, and isn't inherent to the technique, so it might be possible to take advantage of the hardware's MSAA implementation on the PS3 in this scenario.

On slide 18, however, they mention they use something called "Quincunx MSAA", which might hint that they're using a software (shader) based solution. I'm not familiar with this technique, but a quick search on Google seems to indicate it's an older technique using 1x2 samples which NVIDIA patented for the GeForce3 (more here and here). I may be completely off-track, but I'd guess they implemented this 'simple' technique for the G-Buffer reads in the lighting shader, which would explain the "both samples" on slide 39.

The way it works is:
1. Render with MSAA into the G-Buffer ... this buffer is 2560x720.
2. Then render with MSAA into the light or accumulation buffer, which is also 2560x720 ... because the data in the G-Buffer is multisampled, you fetch two of those pixels at once by just averaging them or using Quincunx (described in Real-Time Rendering).

The reason MSAA works between the light buffer and the G-Buffer is that you render your lights as geometry. So what you get in the light buffer is "low" quality color data from the G-Buffer but high quality light data.
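To make that fetch concrete, here is a minimal CPU-side sketch of the averaging step, assuming the 2560x720 G-Buffer stores two horizontal samples per final 1280x720 pixel. The struct and function names are made up for illustration; a real implementation would do this in the lighting shader:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical G-Buffer texel; just the fields needed for this sketch.
struct GBufferTexel {
    float nx, ny, nz;   // view-space normal
    float depth;        // linear depth
    float r, g, b;      // albedo
};

// Average the two horizontal samples that make up one 1280x720 output pixel
// in a 2560x720 G-Buffer. A Quincunx resolve would additionally blend in
// samples shared with neighbouring pixels (1/2 weight on the centre sample,
// 1/8 on each of the four shared neighbours in NVIDIA's original scheme).
GBufferTexel FetchAveraged(const std::vector<GBufferTexel>& gbuffer,
                           std::size_t width2x,  // 2560
                           std::size_t x,        // 0..1279
                           std::size_t y)        // 0..719
{
    const GBufferTexel& s0 = gbuffer[y * width2x + 2 * x];
    const GBufferTexel& s1 = gbuffer[y * width2x + 2 * x + 1];
    return {
        0.5f * (s0.nx + s1.nx), 0.5f * (s0.ny + s1.ny), 0.5f * (s0.nz + s1.nz),
        0.5f * (s0.depth + s1.depth),
        0.5f * (s0.r + s1.r), 0.5f * (s0.g + s1.g), 0.5f * (s0.b + s1.b),
    };
}
```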
If you think about a deferred renderer on the PS3, do not forget that the RSX is actually a 7600 GT in PC graphics terms. It has a 128-bit memory bus and is very bandwidth limited. Additionally, you do not have much memory to spare on a G-Buffer and a light buffer ... so overall it is not the solution you want on hardware that was built for a Z pre-pass renderer. Here are the drawbacks:
- you lose memory
- you lose the ability to set up a really good material / meta-material system
- you lose performance because the hardware's sweet spot is a Z pre-pass renderer
- you can't use many lights anyway, because the memory bandwidth limitation keeps you from using many shadows, and what are many lights without shadows?

Just my 2 cents :-)

Zipster: you helped me quite often in the past, thanks for that. Ask me more via PM :-)

Nope, please keep it public. I think this is quite an interesting topic. Maybe something new will come up that hasn't been covered in previous threads. And thanks for the link, I didn't know of it before.

Interesting presentation, thanks for the link.

My guess is that they're actually doing some type of SSAA like wolf describes (i.e. rendering a double/quad-sized framebuffer), since people have just taken to using the term "MSAA" for all of these techniques nowadays. It's also possible that they're just not doing it correctly, but don't notice due to their particular lighting setup.

Quincunx generally refers to an AA method that shares samples between adjacent pixels (arguably conceptually similar to R600's "wide tent" filter). It's an efficient way to get some extra AA, but it has been criticized for blurring out high-frequency texture detail, which is true to some extent and is largely why it was removed from NVIDIA's newer drivers.

And wolf, I agree that deferred rendering seems a bit odd for the RSX, but perhaps it's still a win if they have a ton of local lights. I disagree that it limits your material system at all, though (I've actually found it easier to work with, if anything). The points about the rather weak memory subsystem of the RSX are certainly relevant though...

As an aside, "proper" MSAA is completely possible and efficient in DX10 since you can now grab the pre-resolved MSAA samples. This may also be possible on the 360, but I don't know for sure.

Quote:
My guess is that they're actually doing some type of SSAA like wolf describes (i.e. rendering double/quad-sized framebuffer) since people have just taken to using the term "MSAA" for all of these techniques nowadays. It's also possible that they're just not doing it correctly, but don't notice due to their particular lighting setup.
wolf knows that they do it :-) ... I'm currently working on a different PS3 and 360 title that uses deferred rendering ... guess which one :-)
Oh, and regarding deferred rendering on the 360: there are different arguments against it, but I would not advise using it there either :-) ...


High-end DX10 cards are the only place where you can really make deferred rendering look very good, and I assume that with a wider distribution of this target platform, deferred rendering will also become more common there.

Quote:
Original post by wolf
High-end DX10 cards are the only place where you can really make deferred rendering look very good, and I assume that with a wider distribution of this target platform, deferred rendering will also become more common there.

DX10 hardware and API do make deferred rendering even more attractive... but I'd argue that STALKER did a pretty decent job on DX9, and I believe "Tabula Rasa" also supports a DX9 path. Both actually have good chapters, in GPU Gems 2 and 3 respectively, describing the trade-offs in their approaches. My experience meshes pretty well with the latter chapter (indeed, there are parts of it that I could well have written, the arguments and structuring were so similar :)), and I certainly suspect that moving forward, deferred rendering will become pretty popular, particularly on DX10+ hardware.

Quote:
I believe "Tabla Rasa" also supports a DX9 path
no it does not. Check out the GPU Gems 3 article it says that they support deferred rendering only for DX10 ... I assume you have access to it.

Quote:
moving forward, deferred rendering will become pretty popular, particularly on DX10+ hardware

For people who need to make money with games, it won't be very popular for the next two years. By then the install base might be big enough to pay for it :-)

In general, most renderers are hybrids anyway ... most deferred renderers need to render transparent objects with a forward approach (Killzone 2, GTA IV, even Tabula Rasa ... yep, I read the article).
Most forward renderers use deferred elements like the Z pre-pass or deferred shadows ... so the lines are already blurry, and they will only get blurrier.

Quote:
Original post by wolf
No, it does not. Check out the GPU Gems 3 article: it says they support deferred rendering only on DX10 ... I assume you have access to it.

Ah yes, fair enough. Still, there's not much in the DX10 API itself (other than the MSAA stuff) that makes the API necessary, although the hardware certainly handles it more efficiently.

Quote:
For people who need to make money with games, it won't be very popular for the next two years. By then the install base might be big enough to pay for it :-)

Oh sure, but since I'm in research I can happily ignore all but the newest and upcoming hardware :) Two years may be a full game cycle to dev studios, but it's really not that far off overall.

And yes, many renderers are becoming hybrids, and I agree with that design whole-heartedly: do whatever is most efficient! However, I also firmly believe that the light volume rendering/compositing aspect of deferred rendering is one of the most compelling reasons to use it, so once consumer hardware catches up, I suspect many games will begin to use that design as well (particularly those that try to do realistic, dynamic lighting/GI).

Ah, OK, so they're basically doing super-sampling but just calling it multi-sampling, and getting away with it because that's the general term people use to describe these kinds of methods :) They had me going for a second there!

Quote:
Original post by wolf
The way it works is:
1. Render with MSAA into the G-Buffer ... this buffer is 2560x720.
2. Then render with MSAA into the light or accumulation buffer, which is also 2560x720 ... because the data in the G-Buffer is multisampled, you fetch two of those pixels at once by just averaging them or using Quincunx (described in Real-Time Rendering).
The reason MSAA works between the light buffer and the G-Buffer is that you render your lights as geometry. So what you get in the light buffer is "low" quality color data from the G-Buffer but high quality light data.

I was always under the impression that you didn't want to perform any kind of filtering on G-Buffer data, because attributes like normals and depth don't interpolate correctly across edges and produce artifacts. So how are they getting away with averaging?

Also, does 2560x720 (1280x720 without MSAA) mean Killzone only supports widescreen? Or is that just one resolution they use?

Quote:
Original post by AndyTX
As an aside, "proper" MSAA is completely possible and efficient in DX10 since you can now grab the pre-resolved MSAA samples. This may also be possible on the 360, but I don't know for sure.

I remember you mentioned in Yann's thread that even though it was possible to fetch the values, there wasn't a flag to indicate which ones were the same, so it would really boil down to super-sampling, on edge pixels at least?

Thanks everyone for the replies so far :-)

Quote:
Original post by Zipster
I was always under the impression that you didn't want to perform any kind of filtering on G-Buffer data, because attributes like normals and depth don't interpolate correctly across edges and produce artifacts. So how are they getting away with averaging?

The only way that they could be doing this "correctly" is to actually accumulate lighting on the "wide" framebuffer (super-sampled) and then average the resulting colours (MSAA "resolve"). They may, however, be doing it incorrectly and just not notice, but in my experience the artifacts from that approach are pretty obvious...
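As a toy illustration of why the order matters, here's a minimal sketch, assuming a simple Lambert term and made-up normals straddling a silhouette edge; lighting each sub-sample and then resolving gives a different (and correct) result from averaging the G-Buffer normals first:

```cpp
#include <cmath>
#include <cstdio>

struct Vec3 { float x, y, z; };

static Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static float lambert(Vec3 n, Vec3 l) { return std::fmax(0.0f, dot(n, l)); }

int main() {
    // Two sub-samples on an edge: one faces the light, one doesn't.
    Vec3 n0 = {0.0f, 0.0f, 1.0f};
    Vec3 n1 = {1.0f, 0.0f, 0.0f};
    Vec3 light = {0.0f, 0.0f, 1.0f};

    // Correct: light each sub-sample, then resolve (average) the colours.
    float litThenResolved = 0.5f * (lambert(n0, light) + lambert(n1, light));

    // Incorrect: average the normals first, then light once.
    Vec3 avg = normalize({0.5f * (n0.x + n1.x),
                          0.5f * (n0.y + n1.y),
                          0.5f * (n0.z + n1.z)});
    float resolvedThenLit = lambert(avg, light);

    // Prints 0.500 vs 0.707: the averaged normal brightens the edge.
    std::printf("light-then-resolve: %.3f, resolve-then-light: %.3f\n",
                litThenResolved, resolvedThenLit);
}
```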

Quote:
Original post by Zipster
I remember you mentioned in Yann's thread that even though it was possible to fetch the values, there wasn't a flag to indicate which ones were the same, so it would really boil down to super-sampling, on edge pixels at least?

Yes, it would be nice to get access to the hardware's "compressed" MSAA flags, but upon further reflection, comparing the sub-sample depths for equality should be sufficient (and quite cheap). And yes, you're effectively going to be super-sampling the edges, but that's actually the same cost as with standard MSAA (since you light and shade each sub-sample of those pixels separately). Indeed, it's exactly what you want to do and should be quite efficient :)
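Here's a conceptual sketch of that classification, assuming per-sample (linear) depths are available to the lighting pass; ShadeSample is a hypothetical stand-in for the real lighting shader (and note the caveat about hardware-interpolated depth in the reply below):

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical per-sample shading inputs.
struct SampleData { float depth; float nx, ny, nz; };

// Stand-in for the real lighting shader: trivial Lambert against a fixed light.
static float ShadeSample(const SampleData& s) {
    return std::fmax(0.0f, s.nz);
}

// Shade one output pixel from its n sub-samples. Interior pixels (all depths
// equal within eps) get a single shading invocation, as with standard MSAA;
// edge pixels are shaded per sub-sample and averaged, i.e. super-sampled
// only along the edges.
static float ShadePixel(const SampleData* samples, std::size_t n,
                        float eps = 1e-4f)
{
    bool edge = false;
    for (std::size_t i = 1; i < n && !edge; ++i)
        edge = std::fabs(samples[i].depth - samples[0].depth) > eps;

    if (!edge)
        return ShadeSample(samples[0]);   // one invocation for the whole pixel

    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i)   // per-sample shading on edges
        sum += ShadeSample(samples[i]);
    return sum / static_cast<float>(n);
}
```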

Quote:
comparing the sub-sample depths for equality should be sufficient (and quite cheap).

You don't have a guarantee of equality for non-edge samples, AFAIK; the hardware will always interpolate depth at full resolution to ensure AA along the boundary of intersecting/abutting triangles.

Quote:
Ah, OK, so they're basically doing super-sampling but just calling it multi-sampling, and getting away with it because that's the general term people use to describe these kinds of methods :) They had me going for a second there!

The idea is to run the pixel shader only 1280x720 times while you write into a 2560x720 light buffer ... so it is a kind of multisampling ... maybe only possible on a console platform where you are not limited by OpenGL or D3D and have direct access to a very thin software layer (libgcm) above the hardware.
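To illustrate the write side, here's a minimal sketch, with made-up names, of what that amounts to: essentially what MSAA hardware does anyway, one shader invocation per pixel, with the result replicated into every covered sample of the wide buffer:

```cpp
#include <cstddef>
#include <vector>

struct Color { float r, g, b; };

// Write one shader invocation's result into both sample slots of a
// 2560x720 light buffer (two horizontal samples per pixel), subject to
// per-sample triangle coverage. Pixel coordinates are in 1280x720 space.
void WriteMultisampled(std::vector<Color>& lightBuffer, // 2560x720 samples
                       std::size_t width2x,             // 2560
                       std::size_t x, std::size_t y,    // 0..1279, 0..719
                       Color shaded,                    // shaded once per pixel
                       bool covered0, bool covered1)    // coverage mask
{
    if (covered0) lightBuffer[y * width2x + 2 * x]     = shaded;
    if (covered1) lightBuffer[y * width2x + 2 * x + 1] = shaded;
}
```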

I would be interested to hear whether this works on DX10 as well ... if anyone gives it a try.

Quote:
Original post by SnowKrash
You don't have a guarantee of equality for non-edge samples, AFAIK; the hardware will always interpolate depth at full resolution to ensure AA along the boundary of intersecting/abutting triangles.

Ah yes, that's a good point. So it would work if you're writing out depth yourself, but probably not if you're reading it from the MSAA depth buffer... then again, I'm not totally sure how reading from an MSAA depth buffer would work, so maybe it's unsupported anyway.

For deferred shading, it'd probably be best to just write out view-space depth yourself and use that to compare.

[Edit] Humus comes through with all of the answers :) Indeed, you currently can't sample MSAA depth buffers, and he also suggests an API for checking the equality of sub-samples in a specific pixel.


Quote:
Original post by Zipster
Ah, Ok, so they're basically doing super-sampling but just calling it multi-sampling and getting away with it because that's the general term people use to describe these kinds of methods :) They had me going for a second there!

You're unlikely to get clarification about what they're doing on a public forum, but trust me on this one: they have some very smart guys over at Guerrilla, and they very much know the difference between multisampling and supersampling!

Quote:
Original post by Christer Ericson
Quote:
Original post by Zipster
Ah, OK, so they're basically doing super-sampling but just calling it multi-sampling, and getting away with it because that's the general term people use to describe these kinds of methods :) They had me going for a second there!

You're unlikely to get clarification about what they're doing on a public forum, but trust me on this one: they have some very smart guys over at Guerrilla, and they very much know the difference between multisampling and supersampling!

Oh, I don't doubt it - you have to be either insane or a genius to develop for the PS3, and they don't sound crazy to me! ;)

I just thought they might have been pulling a fast one by calling it multi-sampling when they were really doing super-sampling, because as AndyTX said, the term "multi-sampling" is sometimes used to refer to the whole class of methods rather than any particular one, and I wasn't sure if they were being sneaky in that regard. But you're right, it's unlikely I'd get any response on a public forum...

