


#5215587 Questions about GPGPU

Posted by MJP on 09 March 2015 - 11:35 PM

I should also point out that CUDA is basically C++, but missing a few things.

#5215584 Allowed format in Create View

Posted by MJP on 09 March 2015 - 11:30 PM

If you want to know which formats are supported for different uses, then go here and browse to the appropriate page for the feature level that you're targeting.

#5215073 What physical phenomenon does ambient occlusion approximate?

Posted by MJP on 06 March 2015 - 08:23 PM

I feel like it's a little unfair to just say that AO is fundamentally broken or not useful, even in the context of physically-based rendering. PBR in general is still full of approximations and weak assumptions, but the techniques are still useful and can produce more "correct" results compared to alternatives since they're still at least attempting to model real-world physical phenomena. A particularly relevant example is the geometry/shadowing term in microfacet BRDFs: it's been pointed out many times that they make the (incorrect) assumption that neighboring microfacets only occlude light instead of having light bounce off them, and yet it's still better than having no occlusion term at all. It's also a much more feasible solution compared to something crazy like path tracing at a microfacet level, so you can also just look at it as a reasonable trade-off given performance constraints. I feel like you can look at AO the same way: sure it makes assumptions that are usually wrong, but if used judiciously it can absolutely give you better results than not having it. For instance, representing occlusion from dynamic objects and characters. Sure you'd get better results if you simulated local light bouncing off those objects, but it's a heck of a lot cheaper to just try to capture their occlusion in AO form, and that can still be a lot better than having no occlusion whatsoever.

#5214400 What physical phenomenon does ambient occlusion approximate?

Posted by MJP on 04 March 2015 - 02:10 AM

The short version: AO approximates the occlusion of direct lighting from a distant hemispherical light source


The long version:


AO essentially approximates the shadowing/visibility that you would use for computing diffuse lighting from a distant light source that covers the entire hemisphere visible to a surface (hence Hodgman's comment about sky lighting). It actually works fairly well for the case where you have an infinitely distant light source with constant incoming lighting for every direction on the hemisphere. To understand why, let's look at the integral for computing AO:


AO = IntegrateAboutHemisphere(Visibility(L) * (N dot L)) / Pi


where "Visibility(L)" is a visibility term that returns 0.0 if a ray intersects geometry in the direction of 'L', and 1.0 otherwise. Note that the "1 / Pi" bit is there because the integral of the cosine term (the N dot L part) comes out to Pi, and so we must divide by Pi to get a result in the range [0, 1].
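As a quick sanity check on that Pi factor, here's a Monte Carlo estimate in Python (my own illustration, not part of the original post) showing that the hemisphere integral of the cosine term does come out to Pi:

```python
import math
import random

def cosine_integral(num_samples=200000, seed=0):
    """Monte Carlo estimate of the hemisphere integral of (N dot L)."""
    rng = random.Random(seed)
    # With uniform solid-angle sampling of the hemisphere, cos(theta)
    # is uniformly distributed in [0, 1), and the pdf is 1 / (2 * Pi).
    pdf = 1.0 / (2.0 * math.pi)
    total = 0.0
    for _ in range(num_samples):
        cos_theta = rng.random()
        total += cos_theta / pdf
    return total / num_samples

print(cosine_integral())  # comes out close to Pi (~3.14159)
```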


Now let's look at the integral for computing direct diffuse lighting (no indirect bounce) from an infinitely distant light source (like the sky), assuming a diffuse albedo of 1:


Diffuse = IntegrateAboutHemisphere(Visibility(L) * Lighting(L) * (N dot L)) / Pi


where "Lighting(L)" is the sky lighting in that direction. 


When we use AO for sky light, we typically do the equivalent of this:


Diffuse = AO * IntegrateAboutHemisphere(Lighting(L) * (N dot L)) / Pi,


which if we substitute in the AO equation gives us this:


Diffuse = (IntegrateAboutHemisphere(Visibility(L) * (N dot L)) / Pi) * (IntegrateAboutHemisphere(Lighting(L) * (N dot L)) / Pi)


Unfortunately this is not the same integral that we used earlier for computing direct lighting, since you generally can't just pull out terms from an integral like that and get the correct result. The only time we can do this is if the Lighting term is constant for the entire hemisphere, in which case we can pull it out like this:


Diffuse = Lighting * IntegrateAboutHemisphere(Visibility(L) * (N dot L)) / Pi


Which means that we can plug in the AO like so:


Diffuse = Lighting * AO


and it works! Unfortunately, the case we've constructed here isn't a very realistic one for two reasons:

  1. Very rarely is the incoming lighting constant in all directions surrounding a surface, unless perhaps you live in The Matrix. Even for the case of a skydome you have some spots that are brighter or differ in hue due to sun scattering or cloud coverage. For such cases you need to compute Visibility * Lighting inside of the integral in order to get the correct direct lighting. However, the approximation can still be fairly close to the ground truth as long as the lighting is pretty low-frequency (in other words, the intensity/hue doesn't rapidly change from one direction to another). For high-frequency lighting, the approximation will be pretty poor. The absolute worst case is an infinitely small point light, which should produce perfectly hard shadows.
  2. In reality, indirect lighting will be bouncing off of the geometry surrounding the surface. For AO we basically assume that all of the surrounding geo is totally black and has no light reflecting off of it, which is of course never the case. When considering purely emissive light sources it's okay to have a visibility term like we had in our integral; however, you then need a second integral that integrates over the hemisphere and computes indirect lighting from all other visible surfaces.

Reason #1 is why you see the general advice to only apply it to "ambient" terms, since games often partition lighting into low-frequency indirect lighting (often computed offline) that's combined with high-frequency direct lighting from analytical light sources. The "soft" AO occlusion simply doesn't look right when applied to small light sources, since the shadows should be "harder" for those cases. You also don't want to "double darken" your lighting for cases where the dynamic lights also have a dynamic shadowing term from shadow maps. 


As for #2, that's tougher. Straightforward AO will always over-darken compared to the ground truth, since it doesn't account for bounce lighting. It's possible to do a PRT-style computation where you try to precompute how much light bounces off of neighboring surfaces, but that can exacerbate the issues caused by non-constant lighting about the hemisphere, and it also requires access to the material properties of those surfaces. It's also typically not possible to do this very well with real-time AO techniques (like SSAO), and so you generally don't see anyone doing it. Instead it's more common to use hacks like only considering close-by surfaces as occluders, or choosing a fixed AO color to fake bounced lighting.
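The factorization argument above is easy to check numerically. Here's a Monte Carlo sketch in Python (my own illustration, not from the original post): it compares the true occluded diffuse integral against the "AO times unoccluded lighting" factorization, once for a constant sky and once for a sky with a bright patch. The wall occluder and the patch placement are made-up examples:

```python
import math
import random

def sample_hemisphere(rng):
    # Uniform solid-angle sampling: cos(theta) uniform, phi uniform.
    cos_t = rng.random()
    sin_t = math.sqrt(max(0.0, 1.0 - cos_t * cos_t))
    phi = 2.0 * math.pi * rng.random()
    return (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)

def integrate(f, num_samples=200000, seed=1):
    """Estimate IntegrateAboutHemisphere(f(L) * (N dot L)) / Pi, with N = +z."""
    rng = random.Random(seed)
    pdf = 1.0 / (2.0 * math.pi)
    total = 0.0
    for _ in range(num_samples):
        L = sample_hemisphere(rng)
        total += f(L) * L[2] / pdf  # N dot L is just L.z here
    return total / num_samples / math.pi

def visibility(L):
    # Hypothetical occluder: a wall blocking every direction on the +x side.
    return 0.0 if L[0] > 0.0 else 1.0

constant_sky = lambda L: 1.0
# A sky with a bright patch concentrated in the (occluded) +x directions:
patchy_sky = lambda L: 10.0 if L[0] > 0.8 else 0.1

ao = integrate(visibility)
for sky in (constant_sky, patchy_sky):
    exact = integrate(lambda L: visibility(L) * sky(L))
    approx = ao * integrate(sky)
    print(exact, approx)
# For the constant sky the two agree (within Monte Carlo noise); for the
# patchy sky the AO factorization overestimates, because AO averages away
# the fact that the occluded directions were exactly the brightest ones.
```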

#5214396 Literature about GPU architecture ?

Posted by MJP on 04 March 2015 - 01:40 AM

I know that you said that you don't like slides, but have you had a look through this presentation? It's a bit out of date at this point, but still mostly relevant. The author of that presentation also wrote an article that you can read, and teaches a course on Parallel Compute Architectures that you can follow along with.

#5214395 Moving Data from CPU to a Structured Buffer

Posted by MJP on 04 March 2015 - 01:35 AM

You can't use D3D11_BIND_UNORDERED_ACCESS, since that implies that the GPU can write to the buffer.

#5214388 Moving Data from CPU to a Structured Buffer

Posted by MJP on 04 March 2015 - 12:28 AM

However, apparently dynamic resources cannot be directly bound to the pipeline as shader resources.  


What makes you say that? You can certainly do this, I've done it myself many times. See the docs here for confirmation (scroll down to the table at the bottom).

#5214386 Compute Shader - Reading and writing to the same RGBA16 texture?

Posted by MJP on 04 March 2015 - 12:23 AM

Here you go. (scroll down to the section entitled "UAV typed load")

#5213223 General Question: How Do "Kids Apps" Handle Graphics and Interaction?

Posted by MJP on 26 February 2015 - 05:48 PM

I have no particular insight into those sorts of games, but you could probably do almost everything in Flash by using Iggy or Scaleform.

#5212582 Is Unreal Engine a good start to learn rendering ?

Posted by MJP on 23 February 2015 - 06:26 PM

I'm not really familiar at all with the details of their engine, but I would imagine that it's probably a great reference for learning how to integrate rendering techniques into an engine, and probably not such a great reference for understanding the techniques themselves. I'm sure if you look at their code, you're going to find tons of code dedicated to handling the various details of their content authoring and packaging pipeline, adding gameplay/scripting hooks, and making the output of a technique work in harmony with the other rendering features.

#5212580 Technical papers on real time rendering

Posted by MJP on 23 February 2015 - 06:22 PM

Regarding DOF/Bokeh: you should most definitely check out this presentation from CryTek (from 2013), and possibly also this 2014 presentation about COD: AW. In my opinion, the purely gather-based approaches are more scalable in terms of performance, and can give some really great results. The first presentation also has some good references to earlier papers that you can check out.

#5212394 Max size of Matrix4x4 Array in HLSL

Posted by MJP on 22 February 2015 - 11:55 PM

Isn't it a performance hit to update a texture buffer every frame with bone data?


You can request a dynamic texture from the driver, which will be optimized for the case where the CPU frequently updates the contents of the texture.



And is it a good idea to keep a texture for each model with bones?


It should be fine if you do it that way. There will definitely be some driver overhead every time that you need to update a texture, since internally the driver will use buffer renaming techniques which will require allocating you a new region of memory.

#5212348 Deferred Shading

Posted by MJP on 22 February 2015 - 05:40 PM

In the console and mid-to-high-end PC space, deferred rendering is ubiquitous. Almost every game uses some variant of deferred rendering, of which there are a few to choose from. They all make different tradeoffs in terms of how big of a G-Buffer is used, how lights are assigned, etc. But in pretty much all cases they follow the general pattern of letting the GPU decide how lights get assigned to surfaces, as opposed to doing it on the CPU or offline.

#5212234 Max size of Matrix4x4 Array in HLSL

Posted by MJP on 22 February 2015 - 03:11 AM

If I recall correctly, vs_2_0 only guarantees a minimum of 256 constant registers. So if each float4x4 uses 4 constant registers, then you would be able to safely use a maximum of 64 bone matrices if you have no other constants in your shader. But since you mentioned that you have some other variables, I'm guessing you do have some other constants, and that's why you need to use fewer than 64 matrices.
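The register arithmetic works out like this (a quick sketch; the 16-register figure for "other constants" is just a made-up example):

```python
# vs_2_0 guarantees a minimum of 256 float4 constant registers,
# and a float4x4 occupies 4 of them.
MAX_REGISTERS = 256
REGISTERS_PER_MATRIX = 4

# With no other constants in the shader:
max_bones = MAX_REGISTERS // REGISTERS_PER_MATRIX
print(max_bones)  # 64

# Any other shader constants eat into the same budget. For example, if
# 16 registers go to transforms and material parameters (an illustrative
# number), the safe bone count drops accordingly:
other_constants = 16
max_bones_remaining = (MAX_REGISTERS - other_constants) // REGISTERS_PER_MATRIX
print(max_bones_remaining)  # 60
```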


FYI, the reason that the compiler doesn't complain about you going over the limit is because it's actually a runtime issue. D3D9 lets the driver specify the actual maximum number of vertex shader constants, which you can query via D3DCAPS9.MaxVertexShaderConst. However in practice, I'm pretty sure that almost every driver just sets that to 256.


If you want to get around that limit and you're targeting semi-recent (DX10 era or newer) hardware, then you can read your bone matrices from a texture instead of using shader constants.

#5211063 BRDF Shading Diffuse + Specular

Posted by MJP on 16 February 2015 - 04:32 PM

Also, your "Diffuse" variable represents the diffuse albedo of your surface. So you don't want to just add it to your lighting result, you want to multiply it by the irradiance incident on the surface. Your final Lambertian diffuse reflectance term should be this:


float3 DiffuseReflectance = (Diffuse / Pi) * Irradiance;


That "Irradiance" term can either be the irradiance computed from an analytical light source, or it could be irradiance from the entire environment. In your case you're using image based lighting from a cubemap, and so to get the irradiance you'll want to integrate irradiance for all possible normal directions. It's possible to do this and store the results in a cubemap, or alternatively you can use spherical harmonics.