

Member Since 29 Mar 2007

#5076706 Global illumination techniques

Posted by on 10 July 2013 - 03:06 PM


Hmm, you're right. It looks like they're using cascaded shadow maps for both the static and dynamic geometry, which is interesting. I assume they bake only the indirect lighting and then just add in the direct lighting on the fly. If nothing else, it's probably easier to implement than storing the contribution of direct light onto static geometry.


Guys, I understand the part with shadows. It's not interesting if they are using static shadow maps for static level geometry. I don't think they just bake the indirect lighting and that's it. The actors and other objects moving through the level receive indirect lighting as well. I have a feeling they have some sort of lightmap on static levels, and also have some "fill lights" placed here and there to simulate bounced light and to illuminate the dynamic objects that move around.



It's fairly common to bake ambient lighting into probes located throughout the level, and then have dynamic objects sample from those probes as they move through the level.
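To make the probe idea concrete, here's a minimal sketch (with hypothetical names and structures): ambient lighting baked at the corners of a grid cell, trilinearly interpolated at a dynamic object's position. Real engines usually store spherical harmonics or similar per probe; a single RGB value keeps the idea visible.

```cpp
#include <cassert>

// Hypothetical minimal probe data: one RGB ambient value per probe.
struct RGB { float r, g, b; };

RGB lerp(const RGB& a, const RGB& b, float t)
{
    return { a.r + (b.r - a.r) * t,
             a.g + (b.g - a.g) * t,
             a.b + (b.b - a.b) * t };
}

// probes[z][y][x] are the 8 corners of the grid cell containing the object;
// (tx, ty, tz) in [0,1] is the object's normalized position inside the cell.
RGB SampleAmbientProbes(const RGB probes[2][2][2], float tx, float ty, float tz)
{
    RGB x00 = lerp(probes[0][0][0], probes[0][0][1], tx);
    RGB x10 = lerp(probes[0][1][0], probes[0][1][1], tx);
    RGB x01 = lerp(probes[1][0][0], probes[1][0][1], tx);
    RGB x11 = lerp(probes[1][1][0], probes[1][1][1], tx);
    RGB y0 = lerp(x00, x10, ty);
    RGB y1 = lerp(x01, x11, ty);
    return lerp(y0, y1, tz);
}
```

As the object moves through the level you re-select the cell and re-interpolate each frame, which is cheap enough to do per object.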

#5076541 Are the D3DX Functions Faster ?

Posted by on 10 July 2013 - 01:13 AM

If you're not experienced at writing a math library, then don't expect to write one that's going to be better than D3DX. It's more likely you'll end up with something slower and buggier.

I'll also point out that there's a newer replacement for D3DX math, called DirectXMath.

#5076283 C++ DX API, help me get it?

Posted by on 08 July 2013 - 11:17 PM

1) A lot of the setup in the simple tutorials is just getting a window going. That's just the way it is in Win32: it's a crusty old C API that takes a lot of boilerplate just for basic functionality. There's no reason not to use a framework to handle this for you, whether it's an existing one like SDL or one of your own design.


2) It's inherited from the Windows API. Almost all APIs do something like this, since C/C++ only makes loose guarantees about the size of types. Using typedefs can ensure that the API is always working with the same size type. This isn't really much of an issue for x86 Windows platforms, so you can usually ignore them and use native types or the types from stdint.h.


3) D3D uses a light-weight form of COM. Supporting COM requires C compatibility, which means you're mostly limited to functionality exposed in C (before C99). This is why you have things like pointers instead of references for function parameters, and structs with no methods. However, there are actually "C++" versions of some of the structures that have constructors defined. They're named the same with a "C" in front of them, for instance CD3D11_TEXTURE2D_DESC. It's also possible to make your own extension structures if you want.

4) Mostly because the older samples are written in more of a "C with classes" style as opposed to modern C++. The newer samples written for the Windows 8 SDK actually make heavy use of exceptions, smart pointers, and C++11 features. In my home codebase I just made a wrapper macro for D3D calls that checks the HRESULT for failure, and if it fails converts it to a string error message and stuffs it in an exception.

5) This is indeed the same reasoning as #3. It can definitely be pretty ugly at times.

6) Also the same as #3

7) Yeah that stuff is rooted in the Win32 API, and it's seriously ugly. I never use the typedefs in my own code.

8) This comes from D3D being a COM API. DXGI actually happens to be even heavier on the COM compared to D3D, hence it taking the interface GUID as a function parameter. However I'm pretty sure you don't have to use __uuidof if you don't want to; it's just a convenience.

The reason SharpDX doesn't "feel" the same is because they wrap all of this in types and functions that convert the COM/Win32 idioms into patterns that are typical to C#. You can certainly do the same except with modern C++ concepts, if that's how you'd like to do it.
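On the fixed-size-type point in #2, the same guarantee the Win32 typedefs provide is available portably through the standard fixed-width types:

```cpp
#include <cassert>
#include <cstdint>

// The Win32 typedefs (DWORD, WORD, BYTE, etc.) exist to pin down type sizes
// across compilers. <cstdint> gives the same guarantee portably; on x86/x64
// Windows these line up with the corresponding Win32 typedefs.
static_assert(sizeof(std::uint32_t) == 4, "same size as Win32 DWORD");
static_assert(sizeof(std::uint16_t) == 2, "same size as Win32 WORD");
static_assert(sizeof(std::uint8_t)  == 1, "same size as Win32 BYTE");
```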
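The HRESULT-checking wrapper macro mentioned in #4 might look something like this sketch. All the names here are made up for illustration, and a stand-in typedef replaces the real HRESULT (which is just a 32-bit code where negative values mean failure) so the snippet stays self-contained:

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Stand-in for the Windows HRESULT type and FAILED() macro.
typedef long HResult;
inline bool Failed(HResult hr) { return hr < 0; }

// Exception carrying the failed call and its error code as a string message.
class DXException : public std::runtime_error
{
public:
    DXException(HResult hr, const char* call)
        : std::runtime_error(Format(hr, call)) { }
private:
    static std::string Format(HResult hr, const char* call)
    {
        std::ostringstream msg;
        msg << call << " failed with HRESULT 0x" << std::hex
            << static_cast<unsigned long>(hr);
        return msg.str();
    }
};

// Wrap every API call that returns an HResult, e.g.:
//   DXCall(device->CreateBuffer(&desc, &initData, &buffer));
#define DXCall(x)                       \
    do {                                \
        HResult hr_ = (x);              \
        if (Failed(hr_))                \
            throw DXException(hr_, #x); \
    } while (0)
```

The `#x` stringization means the exception message names the exact expression that failed, which makes the errors reasonably self-describing without any per-call boilerplate.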

#5075565 What is the difference between DXGI_SWAP_EFFECT_DISCARD and DXGI_SWAP_EFFECT_...

Posted by on 05 July 2013 - 04:05 PM

It's pretty simple. If you use DISCARD, then as soon as you call Present the contents of the backbuffer are wiped away. If you use SEQUENTIAL, then the contents of the back buffer remain after calling Present. The order of what you see on the screen is the same in both modes, it's always the same order in which you call Present.

As for your refresh rate question, that depends on what you pass as the SyncInterval parameter of IDXGISwapChain::Present. If you pass 0, then the device never waits and always presents to the screen as soon as the GPU is ready to do so. If you happen to present outside of the VBLANK period then you will get tearing. If you pass 1, then the device waits until the next VBLANK period to flip buffers. So in your 90fps scenario, the device would then effectively be locked at 60Hz since that's the fastest that the display can output to the screen. If you pass 2, then the device waits for the 2nd VBLANK period which would cap you at 30Hz.
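The arithmetic above can be captured in a back-of-envelope model (hypothetical function, not a DXGI API): with SyncInterval 0 frames go out as fast as they're rendered, and with interval N the swap waits for every Nth VBLANK, capping the presented rate at refreshRate / N:

```cpp
#include <algorithm>
#include <cassert>

// Rough model of IDXGISwapChain::Present's SyncInterval behavior:
// the presented frame rate is the render rate, clamped by how often
// the swap is allowed to happen.
double EffectivePresentRate(double renderRate, double refreshRate, int syncInterval)
{
    if (syncInterval == 0)
        return renderRate; // unthrottled; tearing possible outside VBLANK
    return std::min(renderRate, refreshRate / syncInterval);
}
```

Plugging in the 90fps-on-a-60Hz-display scenario: interval 1 gives 60Hz, interval 2 gives 30Hz, and interval 0 presents all 90 frames (with tearing).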

#5075108 HLSL Shader Library

Posted by on 03 July 2013 - 02:42 PM

At this point shaders are generic programs that you run on the GPU. It's like asking for a C++ code library.

For any non-simple renderer the shader code will completely depend on the overall architecture of the renderer, and not just the visual appearance of whatever you're drawing. In fact a lot of modern shaders don't draw anything at all!

#5075099 Yet another Deferred Shading / Anti-aliasing discussion...

Posted by on 03 July 2013 - 02:26 PM

FXAA is good in that it's really easy to implement and it's really cheap, but the quality is not great. It has limited information to work with, and is completely incapable of handling temporal issues due to lack of sub-pixel information. If you use it you definitely want to do as ic0de recommends and grab the shader code and insert it into your post-processing chain as opposed to letting the driver do it, so that you can avoid applying it to things like text and UI. There's also MLAA which has similar benefits and problems.

You are correct that the "running the shader per-pixel" bit of MSAA only works for writing out your G-Buffer. The trick is to use some method of figuring out which pixels actually have different G-Buffer values in them, and then apply per-sample lighting only to those pixels while applying per-pixel lighting to the rest. For deferred renderers that use fragment shaders and lighting volumes, the typical way to do this is to generate a stencil mask and draw each light twice: once with a fragment shader that uses per-pixel lighting, and once with a fragment shader that uses per-sample lighting. For tiled compute shader deferred renderers you can instead "bucket" per-sample pixels into a list that you build in thread group shared memory, and handle them separately after shading the first sample of all pixels.
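The classification step can be sketched on the CPU like this (hypothetical structure; a real implementation would run in a shader and compare whichever G-Buffer channels diverge across geometric edges). A pixel only needs per-sample lighting if its MSAA samples actually contain different surfaces:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// A G-Buffer sample reduced to depth plus a normal, for illustration.
struct GBufferSample { float depth; float nx, ny, nz; };

// Returns true if the pixel's samples differ enough that shading only the
// first sample would produce visible errors along the edge.
bool NeedsPerSampleLighting(const std::vector<GBufferSample>& samples,
                            float depthThreshold = 0.01f,
                            float normalThreshold = 0.99f)
{
    for (size_t i = 1; i < samples.size(); ++i)
    {
        const GBufferSample& a = samples[0];
        const GBufferSample& b = samples[i];
        if (std::fabs(a.depth - b.depth) > depthThreshold)
            return true; // depth discontinuity: geometric edge
        float cosAngle = a.nx * b.nx + a.ny * b.ny + a.nz * b.nz;
        if (cosAngle < normalThreshold)
            return true; // normals diverge: different surface orientations
    }
    return false; // all samples match, per-pixel lighting is sufficient
}
```

In the stencil-mask approach this predicate decides which pixels pass the per-sample light pass; in the tiled compute approach it decides which pixels get appended to the shared-memory list.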

I also wrote quite a bit about this in the deferred rendering chapter of the book that I worked on, and wrote some companion samples that you can find on CodePlex.


Deferred lighting, AKA light pre-pass, is basically dead at this point. It's only really useful if you want to avoid using multiple render targets, which was desirable on a particular current-gen console. If MRT isn't an issue then it will only make things worse for you, especially with regards to MSAA.

TXAA is just an extension of MSAA, so you need to get MSAA working before considering a similar approach. Same with SMAA, which basically combines MSAA and MLAA.

Forward rendering is actually making a big comeback in the form of "Forward+", which is essentially a modern variant of light indexed deferred rendering. Basically you use a compute shader to write out a list of lights that affect each screen-space tile (usually 16x16 pixels or so) and then during your forward rendering pass each pixel walks the list and applies each light. When you do this MSAA still works the way it's supposed to, at least for the main rendering pass. If you search around you'll find some info and some sample code.
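The light-binning step can be sketched like this (hypothetical structures, and 2D screen-space culling for brevity; real implementations do the culling in a compute shader against the tile's frustum). Each light is tested against every 16x16-pixel tile it might touch, producing the per-tile index lists that the forward pass walks:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// A light reduced to its screen-space position and extent, for illustration.
struct ScreenLight { float x, y, radius; };

// Returns one light-index list per screen tile, in row-major tile order.
std::vector<std::vector<uint32_t>> BuildTileLightLists(
    const std::vector<ScreenLight>& lights,
    int screenWidth, int screenHeight, int tileSize = 16)
{
    int tilesX = (screenWidth  + tileSize - 1) / tileSize;
    int tilesY = (screenHeight + tileSize - 1) / tileSize;
    std::vector<std::vector<uint32_t>> tiles(tilesX * tilesY);

    for (uint32_t i = 0; i < lights.size(); ++i)
    {
        const ScreenLight& l = lights[i];
        // Conservative bounds of the light's screen-space extent, in tiles.
        int minX = std::max(0, static_cast<int>((l.x - l.radius) / tileSize));
        int maxX = std::min(tilesX - 1, static_cast<int>((l.x + l.radius) / tileSize));
        int minY = std::max(0, static_cast<int>((l.y - l.radius) / tileSize));
        int maxY = std::min(tilesY - 1, static_cast<int>((l.y + l.radius) / tileSize));
        for (int ty = minY; ty <= maxY; ++ty)
            for (int tx = minX; tx <= maxX; ++tx)
                tiles[ty * tilesX + tx].push_back(i);
    }
    return tiles; // each pixel's shader then walks tiles[its tile index]
}
```

The payoff is that the geometry pass stays a plain forward pass, so MSAA resolves normally while each pixel only pays for the lights that actually overlap its tile.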

As for the G-Buffer, "as small as you can make it" is still the rule of thumb. In general, a bit of packing/unpacking shader code is worth it if it lets you use a smaller texture format. Reconstructing position from depth is absolutely the way to go, since it saves you 3 G-Buffer channels. Storing position in a G-Buffer can also give you precision problems, unless you go for full 32-bit floats.

#5075093 Bilateral Blur with linear depth?

Posted by on 03 July 2013 - 02:02 PM

You can linearize a sample from a depth buffer with just a tiny bit of math, using some values from your projection matrix:


// zw is the raw sample from the depth buffer, Projection is the camera's projection matrix
float linearZ = Projection._43 / (zw - Projection._33);


That's using HLSL matrix syntax, you would have to convert that to the appropriate GLSL syntax.
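As a sanity check, here's the same formula in C++ verified against the standard D3D perspective projection terms, where _33 = f/(f-n) and _43 = -n*f/(f-n) (row-major, depth mapped to [0,1]). For a view-space depth z, the depth buffer stores zw = _33 + _43/z, and the formula recovers z exactly:

```cpp
#include <cassert>
#include <cmath>

// Recover linear view-space depth from a post-projection depth-buffer value,
// given the relevant two terms of the projection matrix.
double LinearizeDepth(double zw, double m33, double m43)
{
    return m43 / (zw - m33);
}
```

Substituting zw = m33 + m43/z gives m43 / (m43/z) = z, which is why only those two matrix elements are needed.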

#5074853 Shouldn't the vector that is multipled with the projection matrix be 4D?

Posted by on 02 July 2013 - 03:16 PM


However, if I pass a 3D vector, will DirectX just add the 4th component to it or...? If there is no w, where is the z copied?


It depends on the math library that you're using, and which function you're using. Both the D3DX and DirectXMath libraries have 2 different vector/matrix transformation functions: one that uses 0 as the W component, and one that uses 1. The functions that end with "Coord" use 1 as the W component, and the functions that end with "Normal" use 0 as the W component.

EDIT: actually let me correct that, there are 3 functions:


D3DXVec3Transform/XMVector3Transform - this uses 1 as the W component, and returns a 4D vector containing the result of the multiplication

D3DXVec3TransformCoord/XMVector3TransformCoord - this uses 1 as the W component, and returns a 3D vector containing the XYZ result divided by the W result

D3DXVec3TransformNormal/XMVector3TransformNormal - this uses 0 as the W component, and returns a 3D vector containing the XYZ result
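The three behaviors can be sketched with a plain row-vector times row-major 4x4 multiply (the D3DX/DirectXMath convention). These are minimal stand-in types for illustration, not the library's actual implementations:

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };

// Row vector (x, y, z, w) times row-major matrix M.
static Vec4 Mul(float x, float y, float z, float w, const Mat4& M)
{
    return { x*M.m[0][0] + y*M.m[1][0] + z*M.m[2][0] + w*M.m[3][0],
             x*M.m[0][1] + y*M.m[1][1] + z*M.m[2][1] + w*M.m[3][1],
             x*M.m[0][2] + y*M.m[1][2] + z*M.m[2][2] + w*M.m[3][2],
             x*M.m[0][3] + y*M.m[1][3] + z*M.m[2][3] + w*M.m[3][3] };
}

// Like D3DXVec3Transform: W = 1, full 4D result.
Vec4 Transform(const Vec3& v, const Mat4& M)
{
    return Mul(v.x, v.y, v.z, 1.0f, M);
}

// Like D3DXVec3TransformCoord: W = 1, then XYZ divided by the resulting W.
Vec3 TransformCoord(const Vec3& v, const Mat4& M)
{
    Vec4 r = Mul(v.x, v.y, v.z, 1.0f, M);
    return { r.x / r.w, r.y / r.w, r.z / r.w };
}

// Like D3DXVec3TransformNormal: W = 0, so the translation row has no effect.
Vec3 TransformNormal(const Vec3& v, const Mat4& M)
{
    Vec4 r = Mul(v.x, v.y, v.z, 0.0f, M);
    return { r.x, r.y, r.z };
}
```

With a matrix containing a translation, Transform and TransformCoord pick the translation up (because W = 1 multiplies the last row in), while TransformNormal ignores it, which is exactly the behavior you want for direction vectors.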

#5074373 Changing LoD on instancing and alpha testing

Posted by on 01 July 2013 - 01:51 AM

What you basically want is the alpha to be a probability that a pixel is opaque. I would do it like this:

float random = tex2D(g_dissolvemap_sampler, a_texcoord0).r;
if(alpha < random)
    discard;

You can also use alpha-to-coverage, which will basically accomplish the same thing using a tiled screen-space dither pattern instead of a pure random pattern. You can also encode a set of dither patterns directly into a texture, and then look up into that texture based on screen position and alpha value.
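The screen-space dither variant can be sketched like this (a CPU model of what the fragment shader would do): a 4x4 Bayer matrix gives each pixel a stable threshold in [0,1), and a fragment survives when its alpha exceeds that threshold, so coverage averages out to the alpha value over the tile:

```cpp
#include <cassert>

// Standard 4x4 Bayer ordered-dither matrix, values 0..15.
static const int kBayer4x4[4][4] = {
    {  0,  8,  2, 10 },
    { 12,  4, 14,  6 },
    {  3, 11,  1,  9 },
    { 15,  7, 13,  5 }
};

// Returns whether the fragment at (pixelX, pixelY) should be kept for the
// given alpha; returning false corresponds to discard/clip in the shader.
bool DitheredOpaque(float alpha, int pixelX, int pixelY)
{
    float threshold = (kBayer4x4[pixelY & 3][pixelX & 3] + 0.5f) / 16.0f;
    return alpha > threshold;
}
```

Because the pattern is fixed in screen space it doesn't swim under motion the way a per-frame random threshold would, which is the same property that makes alpha-to-coverage look stable.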

#5074361 Global illumination techniques

Posted by on 01 July 2013 - 01:09 AM

I believe that voxel cone tracing is state of the art if you want to do real time GI. I think Crytek and Unreal 4 have it, though I'm not sure if anybody's shipped an actual game with it yet.


Epic has since moved away from it; they're using pre-baked lightmaps and specular probes now. Crytek was using an entirely different technique (Cascaded Light Propagation Volumes) which has a different set of tradeoffs. They shipped it for the PC version of Crysis 2 but not the console version, and I'm not sure if they used it in Crysis 3.

#5074359 Decals with deferred renderer

Posted by on 01 July 2013 - 01:04 AM

DX10 feature level requires full, independent blending support for multiple render targets. So any DX10-capable GPU should support blending and color write control for MRTs, assuming that the driver enables it for D3D9.

#5074208 Global illumination techniques

Posted by on 30 June 2013 - 12:45 PM

Just about all games with GI bake it to lightmaps, and this includes The Last of Us (although The Last of Us does have dynamic GI from your flashlight that's only enabled in a few indoor locations). Very few games have a runtime GI component, since existing techniques are typically expensive and don't end up looking as good as a static-baked result. Some games with deferred renderers try to get away with no GI at all, and just use some runtime or baked AO combined with lots of artist-placed "bounce lights" or "ambient lights" that try to fake GI.

#5073153 So, Direct3D 11.2 is coming :O

Posted by on 27 June 2013 - 12:19 AM

Input assembler moved completely into the vertex shader. You bind resources of pretty much any type to the vertex shader, and access them directly via texture look-ups. Would make things a lot simpler and more flexible IMHO. Granted you sort-of can do this already, but it'd be nice if the GPUs/drivers were optimized for it.


GPUs already work this way. The driver generates a small bit of shader code that runs before the vertex shader (AMD calls it a fetch shader), and all it does is load data out of the vertex buffer and dump it into registers. If you did it all yourself in the vertex shader there's no real reason for it to be any slower.


Depth/stencil/blend stage moved completely into the pixel shader. Sort of like UAVs but not necessarily with the ability to do 'scatter' operations. Could be exposed by allowing 'SV_Target0', 'SV_Target1' etc. to be read and written. So initially it's loaded with the value of the target, and it can be read, compared, operated on, and then if necessary written.


Programmable blending isn't happening without completely changing the way desktop GPUs handle pixel shader writes. TBDRs (tile-based deferred renderers) can do it since they work with an on-chip cache, but they can't really do arbitrary numbers of render targets.

Doing depth/stencil in the pixel shader deprives you of a major optimization opportunity. It would be like always writing to SV_Depth.

#5072884 DX11 - Multiple Render Targets

Posted by on 25 June 2013 - 06:09 PM

Use PIX or the VS 2012 Graphics Debugger to inspect the device state at the time of the draw call; it will tell you which render targets are currently bound.

#5072305 about Shader Arrays, Reflection and BindCount

Posted by on 23 June 2013 - 02:14 PM

Arrays of textures in shaders are really just a syntactical convenience; the underlying shader assembly doesn't actually have any support for them, so the compiler just turns the array into N separate resource bindings. So it doesn't surprise me that the reflection interface would report it as N separate resource bindings, since that's essentially how you have to treat it on the C++ side of things.

It does seem weird that the BIND_DESC structure has a BindCount field that suggests it would be used for cases like these, but I suppose it doesn't actually work that way. I wonder if that field actually gets used for anything.