
#5315242 Multiple Point lights in Deferred Rendering

Posted on 14 October 2016 - 01:10 PM

Just pass a light count as a variable in your constant buffer:


cbuffer LightCBuffer : register(b0)
{
    PointLight PointLights[MaxLights];
    uint NumPointLights;
};

float4 PSMain() : SV_Target0
{
    float3 output = 0.0f;
    for(uint i = 0; i < NumPointLights; ++i)
        output += ProcessPointLight(PointLights[i]);
    return float4(output, 1.0f);
}


The cost of "sending" data to the GPU through a constant buffer is almost entirely on the CPU. The expensive part is usually mapping the buffer, which requires the driver to do some work behind the scenes.
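As a rough illustration, here's a minimal C++ sketch of the CPU side (the LightConstants struct, buffer, and function names are assumptions, not code from this thread):

#include <d3d11.h>
#include <cstring>

// Hypothetical CPU-side mirror of the cbuffer above; the layout must
// follow HLSL packing rules (16-byte alignment for array elements)
struct LightConstants
{
    PointLight PointLights[MaxLights];
    UINT NumPointLights;
};

void UpdateLightCBuffer(ID3D11DeviceContext* context,
                        ID3D11Buffer* lightCBuffer,
                        const LightConstants& lightData)
{
    // Map with WRITE_DISCARD so the driver can hand back fresh memory
    // instead of stalling on the GPU's previous use of the buffer
    D3D11_MAPPED_SUBRESOURCE mapped = { };
    context->Map(lightCBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
    memcpy(mapped.pData, &lightData, sizeof(lightData));
    context->Unmap(lightCBuffer, 0);

    // Bind to slot b0 to match the shader
    context->PSSetConstantBuffers(0, 1, &lightCBuffer);
}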

#5315237 d3d11 soa vertex buffer

Posted on 14 October 2016 - 12:32 PM



1. Create 1 vertex buffer per element

2. When creating your input layout, set "InputSlot" to the index that corresponds to the vertex buffer containing that element. 

3. When it's time to draw, bind all of your vertex buffers by passing them as an array to IASetVertexBuffers (see the sketch below).
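Here's a minimal C++ sketch of those three steps in D3D11, assuming the vertex data is split into separate position and normal buffers (all names are illustrative):

// One element per vertex buffer; InputSlot selects which buffer feeds it
D3D11_INPUT_ELEMENT_DESC elements[] =
{
    { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D11_INPUT_PER_VERTEX_DATA, 0 },
    { "NORMAL",   0, DXGI_FORMAT_R32G32B32_FLOAT, 1, 0, D3D11_INPUT_PER_VERTEX_DATA, 0 },
};

// At draw time, bind both buffers to consecutive input slots
ID3D11Buffer* buffers[] = { positionBuffer, normalBuffer };
UINT strides[] = { sizeof(float) * 3, sizeof(float) * 3 };
UINT offsets[] = { 0, 0 };
context->IASetVertexBuffers(0, 2, buffers, strides, offsets);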

#5314462 HLSL: mad vs separate multiply add

Posted on 09 October 2016 - 04:30 PM

It's similar to an SSE intrinsic, where it's more explicit than using standard arithmetic operators. In other words, if you use the intrinsic you're explicitly telling the compiler "please emit a mad instruction", whereas if you don't use the intrinsic it's more like "I just need you to multiply and add these numbers, and I don't really care how you do it".


The intrinsic is useful if you want to ensure the same results from two different pieces of code. For instance, if you have one vertex shader for a depth prepass and another vertex shader for a full forward pass, you'll want to make sure both shaders produce exactly the same transformed positions from the same input positions (otherwise you'll get Z-fighting).
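As a hypothetical illustration (the variable names are made up):

// Explicit: asks the compiler to emit a fused multiply-add
float3 resultA = mad(position, scale, offset);

// Implicit: the compiler is free to contract this into a mad, or not
float3 resultB = position * scale + offset;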

#5314460 HLSL Function Problem

Posted on 09 October 2016 - 04:24 PM

The first thing you need to do is find out where your code is crashing, and then you need to figure out why it's crashing. The first question is very easy to answer with a debugger, since it will tell you the line of code (or assembly instruction, if you choose to go that deep) where the crash occurred. The second question is also easy to answer with a debugger in many cases, although sometimes the root cause won't be obvious and will require more investigation. In either case you should start by reading the message reported by the debugger and understanding what it means. Once you've done that, look at the flow of your program so that you can determine the sequence of events that led to the error condition.


These are extremely important and basic skills for a programmer, so now is a good time to start learning them and putting them to use. :)

#5313649 What's it called when you resize an image to dimensions that are not a fa...

Posted on 02 October 2016 - 07:01 PM

The technical term for this is "resampling", also known as "scaling". The basic process goes like this:


for each destination pixel:
   for each source pixel in some radius around the sample location:
       read source pixel
       compute reconstruction filter weight
       multiply source pixel with filter weight
       add result to running sum
   output sum as destination pixel


As the above poster mentioned, most CPU-based graphics frameworks/libraries will have functionality for doing this. GPUs can also do it in hardware, but they only support two types of reconstruction filters (box filter AKA "point sampling", and triangle filter AKA "bilinear filtering").
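To make the pseudocode concrete, here's a minimal C++ sketch of the 1D case with a triangle (bilinear) reconstruction filter. It assumes grayscale float pixels, and the fixed 1-texel filter radius is only appropriate for upscaling (downscaling needs a filter widened by the scale factor):

#include <cmath>

void Resample1D(const float* src, int srcWidth, float* dst, int dstWidth)
{
    for(int x = 0; x < dstWidth; ++x)
    {
        // Map the destination pixel center back into source space
        float srcPos = (x + 0.5f) * float(srcWidth) / float(dstWidth) - 0.5f;
        int first = int(std::floor(srcPos));

        float sum = 0.0f;
        float weightSum = 0.0f;
        for(int s = first; s <= first + 1; ++s)
        {
            if(s < 0 || s >= srcWidth)
                continue;
            float weight = 1.0f - std::abs(srcPos - float(s));  // triangle filter
            sum += src[s] * weight;   // weighted source pixel
            weightSum += weight;      // running filter weight
        }

        // Renormalize at the borders, where part of the filter falls outside
        dst[x] = (weightSum > 0.0f) ? (sum / weightSum) : 0.0f;
    }
}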

#5313647 Converting resources GPU/CPU

Posted on 02 October 2016 - 06:54 PM

FYI the Windows video memory manager will do this for you automatically. Every time the driver submits a command buffer to the GPU, it includes a list of all resources that are referenced by those commands. The OS will then try to make sure that those resources are available in GPU memory, potentially moving things to or from system memory in order to make this happen. WDDM calls this concept "residency", and in D3D12 it's actually handled manually by the application.
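For the D3D12 case, here's a minimal sketch of the manual residency API ("device" and "heap" are assumed to already exist):

// Heaps and committed resources derive from ID3D12Pageable, so the
// application can evict them and bring them back explicitly
ID3D12Pageable* pageables[] = { heap };

// Tell the video memory manager this memory won't be referenced soon
device->Evict(1, pageables);

// ...and make it resident again before submitting command lists that
// reference it
device->MakeResident(1, pageables);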

#5313532 What's the benefit of using specific resource format vs using DXGI_FORMAT...

Posted on 01 October 2016 - 05:38 PM

To be clear, are you talking about creating texture resources with DXGI_FORMAT_UNKNOWN, or are you talking about buffers? To my knowledge the format is required to be DXGI_FORMAT_UNKNOWN for buffer resources, and must be a valid supported format for texture resources.


An SNORM format will use a fixed-point integer representation where every bit combination represents a valid value within the [-1.0, 1.0] range. In other words, you have 16 bits where every bit of precision is in your target range. This is not true for FLOAT formats, where only a subset of the bit patterns represent values in the [-1.0, 1.0] range. So essentially some of your bits are "wasted" on values you'll never store.


There's also the issue of how the precision is distributed across the [-1, 1] range. With a 16-bit SNORM format, the precision is evenly distributed across the representable range: increasing the integer value by 1 always changes the result by 1/32767. With floating-point formats this is not the case, due to the use of the exponent: your "steps" will be much smaller close to 0, and will grow as you move toward 1. This effectively means that you'll have higher precision closer to 0 than you will closer to 1. For 16-bit floats the step size is 1/1024 at 1.0, which is equivalent to 10-bit fixed point.
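For reference, decoding a 16-bit SNORM value follows the D3D conversion rules, roughly like this C++ sketch:

#include <cstdint>

// Divide by 32767; the one extra negative code (-32768) clamps to -1.0
float DecodeSnorm16(int16_t bits)
{
    float value = float(bits) / 32767.0f;
    return (value < -1.0f) ? -1.0f : value;
}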

#5313515 d3d11 and vsync

Posted on 01 October 2016 - 02:05 PM

VSYNC is still relevant for windowed mode. If you use a sync interval of 1, then the GPU will wait until the display's vertical refresh to present the back buffer. If you pass 0, then it will present as quickly as possible. The one difference vs. exclusive fullscreen is that the GPU presents the back buffer to the desktop composition engine (DWM) instead of directly to the screen. As a result, if you use a sync interval of 0, your game will run at an uncapped framerate but won't show any tearing. This is because the composition engine still presents with VSYNC enabled, at a rate that's decoupled from when your app presents. In exclusive fullscreen, disabling VSYNC will result in horizontal tearing.


EDIT: I should add that things changed a bit with Win 10 and the addition of the FLIP swap modes, which don't work exactly the same way as I described above. See the docs for DXGI_SWAP_EFFECT for more details.
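For reference, the sync interval is just the first argument to IDXGISwapChain::Present (assuming a swap chain pointer named swapChain):

swapChain->Present(1, 0);  // sync interval 1: wait for the vertical refresh
swapChain->Present(0, 0);  // sync interval 0: present as soon as possible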

#5312730 Direction to point for lighting purposes

Posted on 26 September 2016 - 03:16 PM

That will give you a normalized direction vector pointing from the camera to the surface, in view space. You actually want a vector pointing from the surface -> light source, so you'd want to negate the vector. You would also need to make sure that the normal is converted to view space, otherwise you'll be working with vectors in two different coordinate spaces which wouldn't work. So you could do this:


// .xyz truncates the float4 result of mul back down to float3
float3 surfacePosVS = mul(float4(surfacePosWS, 1.0f), ViewMatrix).xyz;
float3 normalVS = mul(float4(normalWS, 0.0f), ViewMatrix).xyz;
// The camera sits at the origin in view space, so this is also the
// surface -> light direction when the light is at the camera
float3 surfaceToLightVS = normalize(-surfacePosVS);
float lighting = saturate(dot(normalVS, surfaceToLightVS));


Or if you'd rather do everything in world space, you can do it like this:


float3 surfaceToLightWS = normalize(CameraPosWS - surfacePosWS);
float lighting = saturate(dot(normalWS, surfaceToLightWS));


For the second version, you just need to somehow pass your world space camera position through a constant buffer.

#5312603 Draw normal geometry and wireframe geometry in the same pass

Posted on 25 September 2016 - 11:10 PM

This technique is pretty straightforward to implement, has high quality for interior edges, and can be done in a single pass. There's also a working demo with source code.

#5311232 Set vertex position in Domain Shader with Height Map texture

Posted on 17 September 2016 - 05:01 PM

How the GPU interprets the texture data depends on the DXGI format used for the shader resource view. Specifically, the suffix at the end of the format (UNORM, FLOAT, etc.). You can see a full list and a description for each towards the end of this page (scroll down to "Format Modifiers"). In your case you were likely creating your texture with a UNORM format, which means that the GPU will interpret the 0-255 integer data as a 0.0-1.0 floating point value. 
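For example, in the domain shader the sampled value comes back already normalized (a sketch; the texture, sampler, and scale names are made up):

// With a UNORM format, the raw 0-255 texel data arrives as 0.0-1.0
float height = HeightMap.SampleLevel(LinearSampler, uv, 0.0f).x;

// Remap to world units with an artist-controlled scale
position.y = height * HeightScale;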

#5311139 Set vertex position in Domain Shader with Height Map texture

Posted on 16 September 2016 - 05:30 PM

You definitely want to do this in the domain shader, not the pixel shader. Have you tried using a debugging tool like RenderDoc to make sure that you've correctly bound the height map texture to the domain shader stage? RenderDoc also supports shader debugging (although not for tessellation shaders unfortunately), which you may find useful.

#5310839 How to calculate normals for deferred rendering + skinned mesh?

Posted on 14 September 2016 - 03:20 PM

You need to apply both transforms:


float3 normal = input.Normal;
// Apply the skinning transform first, then the world transform
normal = mul(normal, (float3x3)SkinTransform);
normal = mul(normal, (float3x3)InverseTransposeWorld);
// Renormalize, since the transforms can change the vector's length
normal = normalize(normal);

#5310382 Color Correction

Posted on 11 September 2016 - 05:15 PM

Usually the 3D LUT is generated by "stacking" (combining) a set of color transforms, such that applying the LUT gives you the result of applying that whole set of transforms to an input color. So if you don't want to use a LUT, you could implement the transforms directly in a pixel or compute shader and apply them to the input color. However a LUT may be quite a bit cheaper, depending on how many transforms you use. Another nice thing about LUTs is that your engine doesn't necessarily need to know or care about how the LUT is generated. So you might have in-engine UI for generating the LUT (like Source Engine), or have your own external tool that generates the LUT, or you might use an off-the-shelf third-party tool like Fusion, Nuke, or OpenColorIO to generate your LUT (like Uncharted 4).
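Applying the LUT at runtime is just a single 3D texture fetch. Here's a sketch for a 32x32x32 LUT (the texture and sampler names are assumptions):

// Remap so the unit color cube lands on texel centers of the 32^3 LUT
float3 uvw = saturate(color) * (31.0f / 32.0f) + (0.5f / 32.0f);
float3 graded = ColorLUT.SampleLevel(LinearSampler, uvw, 0.0f).xyz;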

#5310149 Particles in idTech 666

Posted on 09 September 2016 - 01:16 PM

Sorry, I forgot to follow up on this. I talked to our lead FX artist, and at our studio they often just generate normal maps directly from their color maps using a tool like CrazyBump. There are a few cases where they'll run a simulation in Maya, in which case they may have Maya render out both color and normal maps to use for the particle system.