Member Since 29 Mar 2007

#5060758 Phong shading issue in directx

Posted by MJP on 09 May 2013 - 10:23 PM

Are you sure that your vertex normals are smooth? Try outputting the vertex normal as the RGB value of the color returned by your pixel shader, and make sure that the normals look smooth.

#5060408 Problem with D3DXComputeTangentFrame

Posted by MJP on 08 May 2013 - 03:59 PM

In certain cases vertices need to be split in order to generate consistent tangent frames for all triangles. The most common and easiest-to-understand case is when UVs are mirrored across a vertex. Here's a crude diagram showing what I mean:


(0, 0) ============ (1, 0) ============= (0, 0)
  |                    |                     |
  |                    |                     |
  |                    |                     |
  |                    |                     |
(0, 1) ============ (1, 1) ============= (0, 1)


So here we have two quads using 6 verts, with the UV coordinates shown for each vertex. To generate tangents and bitangents for these vertices, we look at how the UV coordinates change going from one vertex to the next. The bitangent is simple in this case: V always increases in the downward direction, so the bitangent will be (0, -1, 0) for all 6 verts. The tangent, however, is not so simple. The U coordinate increases going from the left verts to the middle verts, but then decreases again going from the middle verts to the right verts. So for the leftmost verts you want a tangent of (1, 0, 0), and for the rightmost verts you want a tangent of (-1, 0, 0). For the middle verts, however, there is no single tangent value that will work correctly with both the left quad and the right quad. To deal with this we have to split the middle verts, so that you have one with a tangent of (1, 0, 0) for the left quad and one with a tangent of (-1, 0, 0) for the right quad.
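To make the math concrete, here's a sketch of the standard per-triangle tangent/bitangent computation from UV derivatives, in plain Python rather than shader code. The triangle positions and UVs are example values matching the two quads above; the function itself is illustrative, not the actual D3DX implementation.

```python
def tangent_bitangent(p0, p1, p2, uv0, uv1, uv2):
    # Position and UV deltas along the two triangle edges.
    e1 = [p1[i] - p0[i] for i in range(3)]
    e2 = [p2[i] - p0[i] for i in range(3)]
    du1, dv1 = uv1[0] - uv0[0], uv1[1] - uv0[1]
    du2, dv2 = uv2[0] - uv0[0], uv2[1] - uv0[1]
    r = 1.0 / (du1 * dv2 - du2 * dv1)  # determinant's sign flips when UVs mirror
    tangent   = [(e1[i] * dv2 - e2[i] * dv1) * r for i in range(3)]
    bitangent = [(e2[i] * du1 - e1[i] * du2) * r for i in range(3)]
    return tangent, bitangent

# Triangle from the left quad: U increases to the right, tangent is (1, 0, 0)
# and the bitangent comes out as (0, -1, 0), matching the discussion above.
t, b = tangent_bitangent((0, 0, 0), (1, 0, 0), (0, -1, 0),
                         (0, 0), (1, 0), (0, 1))

# Same geometry from the mirrored right quad: U now decreases across the edge,
# so the tangent flips to (-1, 0, 0). Both triangles share the middle verts,
# which is exactly why those verts must be split.
t2, _ = tangent_bitangent((1, 0, 0), (2, 0, 0), (1, -1, 0),
                          (1, 0), (0, 0), (1, 1))
```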

D3DXComputeTangentFrame will not split vertices, since it puts the tangent/bitangent data into the same mesh that it uses as input. If you need to split verts, you need to use D3DXComputeTangentFrameEx instead, which can output a new mesh with a different number of vertices than the input mesh.

#5060155 StructuredBuffer vs Buffer

Posted by MJP on 07 May 2013 - 06:43 PM

For your second example, it depends on your access patterns. If you frequently access f without using i, then it's probably best to separate them. If you always access them together, then it's probably best to keep them together.
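A tiny CPU-side analogy of that trade-off (plain Python, not HLSL, and the two-field element is just a stand-in for the struct being discussed): reading f out of an array-of-structs drags the unused i field along with it, while separate per-field buffers stream only what you touch.

```python
# Hypothetical two-field element, laid out both ways.
aos = [{"f": 1.0, "i": 7}, {"f": 2.0, "i": 8}]   # one buffer of {f, i} structs
soa_f = [1.0, 2.0]                               # two separate buffers,
soa_i = [7, 8]                                   # one per field

def sum_f_aos(buf):
    # Every element fetch also pulls the unused 'i' field through the cache.
    return sum(e["f"] for e in buf)

def sum_f_soa(f_only):
    # Only the field you actually need gets read.
    return sum(f_only)
```

Both give the same answer, of course; the difference is purely in how much memory traffic each access pattern generates.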

#5059369 Why is windows 7 deleting my files?

Posted by MJP on 05 May 2013 - 12:34 AM

On Vista/Win7/Win8 users normally run with lowered privileges, which can be temporarily elevated using a UAC prompt. This changed from XP, where by default users had full admin privileges for everything. This was done so that programs don't gain full admin access unless the user specifically allows it, in order to prevent malicious programs from doing whatever they want. Since you're complaining about this, I'll note that this is standard on Unix-like systems, where users don't normally log in as root and will instead use the su command to temporarily elevate privileges.

Anyway the point is that you can't write to the Program Files directory without admin-level privileges, which normally requires a UAC elevation prompt. So you don't want to use it to store settings or temporary files. Instead you're supposed to store per-user data in FOLDERID_RoamingAppData (usually located in C:\Users\username\AppData\Roaming), and non-user-specific data in FOLDERID_ProgramData (usually located in C:\ProgramData). You can read about these special folders here.
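As a quick sketch of where those paths end up, here's how you'd resolve them from environment variables in Python. "MyGame" is a hypothetical application name; a native app would instead call SHGetKnownFolderPath with FOLDERID_RoamingAppData / FOLDERID_ProgramData, and the fallback strings below are only the usual defaults.

```python
import os

# Per-user settings and saves (FOLDERID_RoamingAppData).
roaming = os.environ.get("APPDATA", r"C:\Users\username\AppData\Roaming")
settings_dir = os.path.join(roaming, "MyGame")

# Non-user-specific shared data (FOLDERID_ProgramData).
program_data = os.environ.get("ProgramData", r"C:\ProgramData")
shared_dir = os.path.join(program_data, "MyGame")
```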

#5059368 Pixel shader conditional if

Posted by MJP on 05 May 2013 - 12:17 AM

HLSL actually has the "any" intrinsic, which returns true if any of the components of a vector are non-zero. So you could use it like this:


float4 color = tex2D(DiffuseSampler, IN.tex0);
if (any(color))
    return IN.shade * color * materialColor;
return IN.shade * materialColor;

#5059358 Simple shadow map antialiasing?

Posted by MJP on 04 May 2013 - 11:48 PM

Well, it's actually a form of filtering but it will have the result of "softening" your shadows by giving them a penumbra. Filtering is definitely a good way to reduce aliasing (the jagged, stair-step artifacts that you're talking about), but it will also reduce the sharp details. If you want really sharp details without aliasing, then the only good solution is to increase the effective resolution of your shadow map (either by increasing the size of the shadow map, or reducing the amount of screen space that it covers). High shadow map resolution + filtering will give you sharp, unaliased shadows.

#5059307 Simple shadow map antialiasing?

Posted by MJP on 04 May 2013 - 05:25 PM

The standard technique is called Percentage Closer Filtering, or PCF for short. It basically amounts to sampling the shadow map multiple times in a small radius, performing the shadow comparison for each sample, and averaging the result. Which version of D3D are you using?
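The idea is easy to sketch outside of shader code: treat the shadow map as a 2D array of depths, do the depth comparison at each tap in a small neighborhood, and average the binary results. This is plain Python for illustration; a real shader would do the taps with comparison sampling (e.g. SampleCmp in D3D10+) instead.

```python
def pcf(shadow_map, x, y, receiver_depth, radius=1):
    """Average the shadow comparison over a (2*radius+1)^2 neighborhood."""
    total, count = 0.0, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            occluder_depth = shadow_map[y + dy][x + dx]
            total += 1.0 if receiver_depth <= occluder_depth else 0.0  # 1 = lit
            count += 1
    return total / count  # fractional visibility -> soft penumbra

# A 3x3 neighborhood straddling a shadow edge: some taps pass the depth
# test and some fail, so the result is a fractional value instead of a
# hard 0-or-1 edge.
depths = [
    [0.3, 0.3, 0.9],
    [0.3, 0.3, 0.9],
    [0.3, 0.9, 0.9],
]
visibility = pcf(depths, 1, 1, 0.5)  # 4 of 9 taps are lit
```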

#5059091 D3DXComputeTangentFrame output?

Posted by MJP on 04 May 2013 - 12:12 AM

^^^ what Nik02 said. If you want to see an example, I'm pretty sure that the old ParallaxOcclusionMapping sample from the DirectX SDK can show you how to load a mesh, clone it, and generate a tangent frame.

#5059090 Performance improvements from Dx9 to Dx10/11

Posted by MJP on 04 May 2013 - 12:10 AM

"Old cruft" would mostly refer to the fixed-function pipeline and everything that went with it.


As for "more functionality", yes this would mostly be the things enabled by the higher feature levels. So things like new shader types, new resource types (texture arrays, structured buffers, append buffers), access to MSAA sample data, new texture sampling functions (Gather, SampleCmp), integer instructions, new render target and compressed texture formats, etc.

#5059053 Performance improvements from Dx9 to Dx10/11

Posted by MJP on 03 May 2013 - 05:00 PM

That's a pretty weird question to ask at an interview, IMO. Off the top of my head...

  • Changed the driver model so that the driver has a user-mode DLL sitting between the D3D runtime DLL and the kernel-mode driver DLL
  • Changed APIs so that you can set multiple textures/buffers/render targets in a single call instead of one at a time
  • Removed a lot of old cruft that was being emulated on GPUs, or not supported at all
  • Introduced constant buffers, which can be much faster to update compared to the old constant registers used in D3D9
  • Moved device states into immutable state objects. This results in fewer API calls, and allows validation to be done once during initialization.
  • Input layouts allow the driver to create fetch shaders earlier, and re-use them more efficiently
  • D3D11 added multithreaded rendering, which can potentially improve total performance if you have a few cores to spare.

These are all of the changes I can think of that relate to CPU performance of the API itself, as opposed to performance gains you might get on the GPU due to having access to more functionality.

#5058202 How does VSYNC work?

Posted by MJP on 30 April 2013 - 05:18 PM

I would suggest reading up on raster display devices. A lot of the terminology comes from the days of CRT displays, but most of it still applies to LCD displays.


The display adapter (on your video card) is configured to send frame data at a set refresh rate. For normal Windows desktop usage this refresh rate is configured as part of your display settings; for fullscreen D3D applications it's specified when you create a swap chain for an adapter. Once the refresh rate is configured, the adapter sends frames to the display at the agreed-upon rate. The data is sent one scan line at a time, until the bottom line is sent and the image is complete. Once complete, the display is in its VBLANK period.

The scanline data that's sent to the display comes from an area of memory on the video card called the frame buffer. Typically you have 2 frame buffers for a double-buffering setup, where one buffer (the back buffer) is written to by the GPU while the other (the front buffer) is being read and sent to the display. At the end of each frame, once you've finished rendering the back buffer, you'll "flip" the buffers and the back buffer will become the front buffer (and vice versa). If this flip happens during the VBLANK, everything is fine: you'll see one image for an entire refresh period of the display, and then the next image for the next refresh period. However, if you flip outside of the VBLANK while scanlines are being sent to the display, you'll get a "tear": some of the lines from the previous frame will already have been sent to the display, and then suddenly the scanlines switch to the next frame.


Now what VSYNC does is it causes the GPU to wait until the VBLANK period before flipping. This ensures you have no tearing, since you always flip in the VBLANK. The downside is that you might have to wait a while for the next VBLANK, especially if you just missed it. So for instance if your refresh rate is 100Hz your VBLANK will be every 10ms. So if it takes you 10.1ms to finish rendering a frame, you'll have to wait until the next VBLANK which means it will have effectively taken you 20ms to display that frame. If VSYNC had been disabled, you would just get a tear at the top of the screen instead. This is why frame dips are more jarring when VSYNC is enabled.
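The arithmetic in that example boils down to rounding the frame time up to a whole number of refresh periods. Here's that calculation as a small Python sketch (function name and values are illustrative, matching the 100Hz example above):

```python
import math

def displayed_frame_time_ms(render_ms, refresh_hz):
    """With VSYNC on, a flip waits for the next VBLANK boundary, so the
    effective display time is the render time rounded up to a whole
    number of refresh periods."""
    period = 1000.0 / refresh_hz
    return math.ceil(render_ms / period) * period

# 100 Hz display -> 10 ms refresh period. A 10.1 ms frame just misses the
# VBLANK and has to wait for the next one, effectively taking 20 ms.
slow = displayed_frame_time_ms(10.1, 100)  # -> 20.0
fast = displayed_frame_time_ms(9.9, 100)   # -> 10.0
```

This rounding-up is exactly why a small dip below the refresh rate produces such a jarring halving of the frame rate when VSYNC is enabled.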

When you're not in fullscreen mode, things are different because your D3D/OpenGL application won't have exclusive access to the display. Instead it may go through a desktop manager, depending on the OS you're using. On Windows Vista/7/8 with desktop composition enabled, the GPU presents the back buffer to the DWM instead of the display, and the DWM then composites the image with the rest of the desktop. This is why you don't get tearing in windowed mode, even if VSYNC is disabled. Enabling VSYNC will still limit your app to the refresh rate, but the behavior will be different.

#5057675 Performance between half and float under SM3.0

Posted by MJP on 29 April 2013 - 01:24 AM

The only GPUs that ever supported half precision in shaders were Nvidia's FX series, 6000 series, and 7000 series. On the FX series, using half precision was actually critical for achieving good performance, since full precision came with a significant performance penalty. ATI hardware used an unusual 24-bit precision internally for everything on their early DX9 hardware, since the SM2.0 spec was somewhat loose in how it defined the precision and format of floating-point operations. Later ATI DX9 hardware used full 32-bit precision for everything, since SM3.0 required IEEE compliance (or at least something much closer to it).

For SM4.0, the half-precision instructions and registers were completely removed from the specification. Using the "half" type in HLSL will cause the compiler to emit full-precision instructions, and in practice no DX10 or DX11 GPUs support half-precision arithmetic internally. Oddly enough, lower-precision instructions have made a comeback in D3D11.1, primarily for mobile hardware. However, in 11.1 the syntax for using them is different: you have to use types like "min16float" and "min16uint".

#5056837 How much to tessellate?

Posted by MJP on 26 April 2013 - 12:02 AM

You might be able to just directly call D3DCompile from that DLL using p/invoke.

#5056708 Ugly lines on screen.

Posted by MJP on 25 April 2013 - 12:27 PM

Size of backbuffer is the same as window, and the model was created in Blender (smooth normals); this model looked great in OpenGL. I also tried VSync (off/on), but it didn't help.


Is it the same size as the window, or the same size as the client area of the window?

It's not a model issue, it's an artifact resulting from upscaling or downscaling an image with point filtering. This is what happens when you create your backbuffer with a different size than the window's client area, which is why I asked about it.

#5056482 How much to tessellate?

Posted by MJP on 24 April 2013 - 04:20 PM

At work we used to do something very similar to what's described in this presentation, back when we were planning on using a lot of tessellation. It seemed to work well enough, although it was never battle-tested in a production setting.