
Member Since 29 Mar 2007

#5215989 Quick question about importance sampling

Posted by MJP on 12 March 2015 - 12:26 AM

Personally I have always followed the approach outlined in this paper by Walter et al., although I am by no means an expert in path tracing or Monte Carlo techniques. Basically you generate a microfacet normal based on your normal distribution, and then you reflect the view vector about that normal to compute the sampling direction. That paper has the formulas for generating the microfacet normals for 3 distributions (Beckmann, Blinn-Phong, and GGX), and also shows you how to roll the PDF and BRDF together into a single, simplified sample weight.
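To make the recipe concrete, here's a minimal Python sketch of that approach for the GGX distribution, using the inverse-CDF formulas from the Walter et al. paper (the function name and tangent-space conventions here are my own, not from the paper):

```python
import math
import random

def sample_ggx_direction(view, alpha):
    """Draw a GGX microfacet normal (Walter et al. 2007) and reflect the
    view vector about it to get a sampling direction. Everything is in
    tangent space with +Z as the macro surface normal; 'view' points away
    from the surface and should be normalized. 'alpha' is the GGX roughness."""
    u1, u2 = random.random(), random.random()
    # Inverse-CDF sampling of the GGX distribution:
    # theta_m = atan(alpha * sqrt(u1 / (1 - u1))), phi_m = 2*pi*u2
    theta_m = math.atan(alpha * math.sqrt(u1 / max(1.0 - u1, 1e-12)))
    phi_m = 2.0 * math.pi * u2
    sin_t, cos_t = math.sin(theta_m), math.cos(theta_m)
    m = (sin_t * math.cos(phi_m), sin_t * math.sin(phi_m), cos_t)
    # Reflect the view vector about the microfacet normal: l = 2(v.m)m - v
    v_dot_m = sum(v * n for v, n in zip(view, m))
    light = tuple(2.0 * v_dot_m * n - v for v, n in zip(view, m))
    return m, light
```

The sample weight (BRDF times cosine over the PDF) then simplifies as shown in the paper; I've left it out here since it depends on which geometry term you pair with the distribution.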


If you need to combine your specular BRDF with a diffuse term, you can use the technique outlined in Physically Based Rendering: for every sample, you randomly choose between the specular and diffuse lobes with equal probability. For Lambertian diffuse, you can just sample a cosine-weighted hemisphere to importance sample the BRDF.
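As a rough illustration of that scheme (the helper names are made up, and a real implementation would also evaluate the combined PDF for the final sample weight):

```python
import math
import random

def sample_cosine_hemisphere():
    """Cosine-weighted hemisphere sample in tangent space (+Z = normal).
    pdf(l) = cos(theta) / pi, which exactly importance-samples Lambert."""
    u1, u2 = random.random(), random.random()
    r, phi = math.sqrt(u1), 2.0 * math.pi * u2
    return (r * math.cos(phi), r * math.sin(phi), math.sqrt(max(0.0, 1.0 - u1)))

def sample_layered(sample_specular):
    """Pick the diffuse or specular lobe with probability 0.5 each.
    The returned selection probability must divide the chosen lobe's pdf
    (equivalently, the sample weight is multiplied by 2) to stay unbiased."""
    if random.random() < 0.5:
        return sample_cosine_hemisphere(), 0.5
    return sample_specular(), 0.5
```

Note that for the Lambert lobe, BRDF * cos(theta) / pdf collapses to just the albedo, which is what makes cosine-weighted sampling so convenient.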

#5215698 Allowed format in Create View

Posted by MJP on 10 March 2015 - 01:21 PM

That list from the DXGI programming guide is the same exact list given in the documentation that you linked to. The only difference is that the DESC documentation states that you can also specify DXGI_FORMAT_UNKNOWN, which is just a way of telling the API "use the same exact format that was used when creating the texture".


You can't create a resource with one format, and then create a resource view with a format that has a different size. See the docs here about strong vs. weak typing. Basically if you create a resource with a TYPELESS format, then you can only create a resource view from the same format "family".

#5215590 What are your opinions on DX12/Vulkan/Mantle?

Posted by MJP on 10 March 2015 - 12:10 AM

Overall, I think it's great to see renewed focus on lower CPU overhead. It's sometimes ridiculous just how much more efficient it can be to build command buffers on console vs. PC, and the new APIs look poised to close that gap a bit. Mobile in particular is the real winner here: they desperately need an API that is more efficient so that they can save more battery power, and/or cram in more draw calls (from what I've heard, draw call counts in the hundreds are pretty common on mobile). So far I haven't seen any major screw-ups from the GL camp, so if they keep going this way they have a real shot at dislodging D3D as the de facto API for Windows development. However, I think I still have more trust in MS to actually deliver exactly what they've promised, so I will reserve full judgement until Vulkan is actually released.


Personally, I'm mostly just looking forward to having a PC version of our engine that's more in line with our console version. Bindless is just fantastic in every way, and it's really painful having to go back to the old "bind a texture at this slot" stuff when working on Windows (not to mention it makes our code messy having to support both). Manual memory management and synchronization can also be really powerful, not to mention more efficient. Async compute is also huge if you use it correctly, and hopefully there will be much more discussion about it now that it will be exposed in public APIs.


On the flip side, I am a bit concerned about sync issues. Sync between CPU and GPU (or even the GPU with itself) can lead to some really awful, hard-to-track-down bugs. It's bad because you might think that you're doing it right, but then you make a small tweak to a shader and suddenly you have artifacts. It's hard enough dealing with that for one hardware configuration, so it's a little scary to imagine what could happen for PC games that have to run on everything. Hopefully there will be some good debugging/validation functionality available for tracking this down, otherwise we will probably end up with drivers automatically inserting sync points to prevent corruption (and/or removing unnecessary syncs for better performance). Either way, beginners are probably in for a rough time.

#5215587 Questions about GPGPU

Posted by MJP on 09 March 2015 - 11:35 PM

I should also point out that CUDA is basically C++, but missing a few things.

#5215584 Allowed format in Create View

Posted by MJP on 09 March 2015 - 11:30 PM

If you want to know which formats are supported for different uses, then go here and browse to the appropriate page for the feature level that you're targeting.

#5215073 What physical phenomenon does ambient occlusion approximate?

Posted by MJP on 06 March 2015 - 08:23 PM

I feel like it's a little unfair to just say that AO is fundamentally broken or not useful, even in the context of physically-based rendering. PBR in general is still full of approximations and weak assumptions, but the techniques are still useful and can produce more "correct" results compared to alternatives since they're still at least attempting to model real-world physical phenomena. A particularly relevant example is the geometry/shadowing term in microfacet BRDFs: it's been pointed out many times how they make the (incorrect) assumption that neighboring microfacets only occlude light instead of having light bounce off them, and yet it's still better than having no occlusion term at all. It's also a much more feasible solution compared to something crazy like path tracing at a microfacet level, so you can also just look at it as a reasonable trade-off given performance constraints. I feel like you can look at AO the same way: sure it makes assumptions that are usually wrong, but if used judiciously it could absolutely give you better results as opposed to not having it. For instance, representing occlusion from dynamic objects and characters. Sure you'd get better results if you simulated local light bouncing off those objects, but it's a heck of a lot cheaper to just try to capture their occlusion in AO form, and it can still be a lot better than having no occlusion whatsoever.

#5214400 What physical phenomenon does ambient occlusion approximate?

Posted by MJP on 04 March 2015 - 02:10 AM

The short version: AO approximates the occlusion of direct lighting from a distant hemispherical light source


The long version:


AO essentially approximates the shadowing/visibility that you would use for computing diffuse lighting from a distant light source that covers the entire hemisphere that's visible for a surface (hence Hodgman's comment about sky lighting). It actually works fairly well for the case where you have an infinitely distant light source that has a constant incoming lighting for every direction on the hemisphere. To understand why, let's look at the integral for computing AO:


AO = IntegrateAboutHemisphere(Visibility(L) * (N dot L)) / Pi


where "Visibility(L)" is a visibility term that returns 0.0 if a ray intersects geometry in the direction of 'L', and 1.0 otherwise. Note that the "1 / Pi" bit is there because the integral of the cosine term (the N dot L part) comes out to Pi, and so we must divide by Pi to get a result that's in the range [0, 1].
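As a sanity check, here's a small Python estimator for that integral using uniform hemisphere sampling (the visibility callable is a stand-in for a real ray cast):

```python
import math
import random

def estimate_ao(visibility, num_samples=10000):
    """Monte Carlo estimate of
    AO = IntegrateAboutHemisphere(Visibility(L) * (N dot L)) / Pi
    in tangent space (+Z = N). Uniform hemisphere sampling has
    pdf = 1 / (2*pi), so each sample is weighted by 2*pi."""
    total = 0.0
    for _ in range(num_samples):
        u1, u2 = random.random(), random.random()
        z = u1                          # cos(theta), uniform in [0, 1)
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = 2.0 * math.pi * u2
        l = (r * math.cos(phi), r * math.sin(phi), z)
        total += visibility(l) * z * 2.0 * math.pi   # V(L) * (N.L) / pdf
    return total / num_samples / math.pi
```

With no occluders (visibility always 1) the estimate converges to 1, and with everything occluded it's 0, matching the [0, 1] range described above.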


Now let's look at the integral for computing direct diffuse lighting (no indirect bounce) from an infinitely distant light source (like the sky), assuming a diffuse albedo of 1:


Diffuse = IntegrateAboutHemisphere(Visibility(L) * Lighting(L) * (N dot L)) / Pi


where "Lighting(L)" is the sky lighting in that direction. 


When we use AO for sky light, we typically do the equivalent of this:


Diffuse = AO * IntegrateAboutHemisphere(Lighting(L) * (N dot L)) / Pi,


which if we substitute in the AO equation gives us this:


Diffuse = (IntegrateAboutHemisphere(Visibility(L) * (N dot L)) / Pi) * (IntegrateAboutHemisphere(Lighting(L) * (N dot L)) / Pi)


Unfortunately this is not the same integral that we used earlier for computing direct lighting, since you generally can't just pull out terms from an integral like that and get the correct result. The only time we can do this is if the Lighting term is constant for the entire hemisphere, in which case we can pull it out like this:


Diffuse = Lighting * IntegrateAboutHemisphere(Visibility(L) * (N dot L)) / Pi


Which means that we can plug in the AO like so:


Diffuse = Lighting * AO


and it works! Unfortunately, the case we've constructed here isn't a very realistic one for two reasons:

  1. Very rarely is the incoming lighting constant in all directions surrounding a surface, unless perhaps you live in The Matrix. Even for the case of a skydome you have some spots that are brighter with different hue due to sun scattering, or cloud coverage. For such cases you need to perform Visibility * Lighting inside of the integral in order to get the correct direct lighting. However, the approximation can still be fairly close to the ground truth as long as the lighting is pretty low frequency (in other words, the intensity/hue doesn't rapidly change from one direction to another). For high-frequency lighting, the approximation will be pretty poor. The absolute worst case is an infinitely small point light, which should produce perfectly hard shadows.
  2. In reality, indirect lighting will be bouncing off of the geometry surrounding the surface. For AO we basically assume that all of the surrounding geo is totally black and has no light reflecting off of it, which is of course never the case. When considering purely emissive light sources it's okay to have a visibility term like we had in our integral; however, you then need to have a second integral that integrates over the hemisphere and computes indirect lighting from all other visible surfaces.

Reason #1 is why you see the general advice to only apply it to "ambient" terms, since games often partition lighting into low-frequency indirect lighting (often computed offline) that's combined with high-frequency direct lighting from analytical light sources. The "soft" AO occlusion simply doesn't look right when applied to small light sources, since the shadows should be "harder" for those cases. You also don't want to "double darken" your lighting for cases where the dynamic lights also have a dynamic shadowing term from shadow maps. 


As for #2, that's tougher. Straightforward AO will always over-darken compared to ground truth, since it doesn't account for bounce lighting. It's possible to do a PRT-style computation where you try to precompute how much light bounces off of neighboring surfaces, but that can exacerbate issues caused by non-constant lighting about the hemisphere, and also requires having access to the material properties of the surfaces. It's also typically not possible to do this very well for real-time AO techniques (like SSAO), and so you generally don't see anyone doing that. Instead it's more common to have hacks like only considering close-by surfaces as occluders, or choosing a fixed AO color to fake bounced lighting.
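To see the effect of Reason #1 numerically, here's a small Python experiment (the occluder and sky functions are arbitrary stand-ins): factoring AO out of the integral is exact for a constant sky, but overestimates the lighting when a bright horizon is the part being occluded:

```python
import math
import random

def hemisphere_integral(f, n=50000):
    """Estimate IntegrateAboutHemisphere(f(L) * (N dot L)) / Pi via
    uniform hemisphere sampling (pdf = 1 / (2*pi)), tangent space, +Z = N."""
    total = 0.0
    for _ in range(n):
        u1, u2 = random.random(), random.random()
        z, phi = u1, 2.0 * math.pi * u2
        r = math.sqrt(max(0.0, 1.0 - z * z))
        l = (r * math.cos(phi), r * math.sin(phi), z)
        total += f(l) * z * 2.0 * math.pi
    return total / n / math.pi

# Hypothetical occluder: everything below 45 degrees elevation is blocked.
vis = lambda l: 1.0 if l[2] > math.sqrt(0.5) else 0.0
ao = hemisphere_integral(vis)                      # comes out near 0.5

# Constant sky: the factored version matches the true integral.
exact_const = hemisphere_integral(lambda l: vis(l) * 1.0)
approx_const = ao * hemisphere_integral(lambda l: 1.0)

# Sky that is brighter toward the horizon: Lighting * AO now overshoots,
# because AO can't know that the occluded directions were the bright ones.
sky = lambda l: 1.0 - l[2]
exact = hemisphere_integral(lambda l: vis(l) * sky(l))
approx = ao * hemisphere_integral(sky)
```

Swapping in a sky that is brighter at the zenith flips the error the other way, which is exactly the low-frequency-lighting caveat from the list above.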

#5214396 Litterature about GPU architecture ?

Posted by MJP on 04 March 2015 - 01:40 AM

I know that you said that you don't like slides, but have you had a look through this presentation? It's a bit out of date at this point, but still mostly relevant. The author of that presentation also wrote an article that you can read, and teaches a course on Parallel Compute Architectures that you can follow along with.

#5214395 Moving Data from CPU to a Structured Buffer

Posted by MJP on 04 March 2015 - 01:35 AM

You can't use D3D11_BIND_UNORDERED_ACCESS, since that implies that the GPU can write to the buffer.

#5214388 Moving Data from CPU to a Structured Buffer

Posted by MJP on 04 March 2015 - 12:28 AM

However, apparently dynamic resources cannot be directly bound to the pipeline as shader resources.  


What makes you say that? You can certainly do this, I've done it myself many times. See the docs here for confirmation (scroll down to the table at the bottom).

#5214386 Compute Shader - Reading and writing to the same RGBA16 texture?

Posted by MJP on 04 March 2015 - 12:23 AM

Here you go. (scroll down to the section entitled "UAV typed load")

#5213223 General Question: How Do "Kids Apps" Handle Graphics and Interaction?

Posted by MJP on 26 February 2015 - 05:48 PM

I have no particular insight into those sorts of games, but you could probably do almost everything in Flash by using Iggy or Scaleform.

#5212582 Is Unreal Engine a good start to learn rendering ?

Posted by MJP on 23 February 2015 - 06:26 PM

I'm not really familiar at all with the details of their engine, but I would imagine that it's probably a great reference for learning how to integrate rendering techniques into an engine, and probably not such a great reference for understanding the techniques themselves. I'm sure if you look at their code, you're going to find tons of code dedicated to handling the various details of their content authoring and packaging pipeline, adding gameplay/scripting hooks, and making the output of a technique work in harmony with the other rendering features.

#5212580 Technical papers on real time rendering

Posted by MJP on 23 February 2015 - 06:22 PM

Regarding DOF/Bokeh: you should most definitely check out this presentation from Crytek (from 2013), and possibly also this 2014 presentation about COD: AW. In my opinion, the purely gather-based approaches are more scalable in terms of performance, and can give some really great results. The first presentation also has some good references to earlier papers that you can check out.

#5212394 Max size of Matrix4x4 Array in HLSL

Posted by MJP on 22 February 2015 - 11:55 PM

Isn't it a performance hit to update a texture buffer every frame with bone data?


You can request a dynamic texture from the driver, which will be optimized for the case where the CPU frequently updates the contents of the texture.



And is it a good idea to keep a texture for each model with bones?


It should be fine if you do it that way. There will definitely be some driver overhead every time that you need to update a texture, since internally the driver will use buffer renaming techniques which will require allocating you a new region of memory.