Member Since 03 Jul 2006

Posts I've Made

In Topic: Intrinsics to improve performance of interpolation / mix functions

17 September 2014 - 01:52 PM

This is the most efficient option I could find. It takes two instructions to compute the value to test, by rearranging the expression so you do the subtract first. Unless I've messed up somewhere, it should give you the same answer :)

    float test = dot(col2-col1, float3(1.0f / 3.0f, 1.0f / 3.0f, 1.0f / 3.0f));
    if (test > threshold) { ... }
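
For anyone who wants to convince themselves the rearrangement is safe, here is a small CPU-side sketch in C++. It assumes the original test compared the averages of the two colours (that form isn't shown in the post, so treat `testSeparate` as a guess); `Float3` and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for HLSL's float3.
struct Float3 { float x, y, z; };

// Mean of the three channels of one colour.
float average(Float3 c) { return (c.x + c.y + c.z) / 3.0f; }

// Assumed original form: average each colour, then subtract.
float testSeparate(Float3 col1, Float3 col2) {
    return average(col2) - average(col1);
}

// Rearranged form from the post: subtract first, then a single
// dot product with (1/3, 1/3, 1/3).
float testRearranged(Float3 col1, Float3 col2) {
    const Float3 d{col2.x - col1.x, col2.y - col1.y, col2.z - col1.z};
    const float k = 1.0f / 3.0f;
    return d.x * k + d.y * k + d.z * k;
}

// Quick self-check: both forms agree up to float rounding.
void checkForms(Float3 col1, Float3 col2) {
    assert(std::fabs(testSeparate(col1, col2) - testRearranged(col1, col2)) < 1e-5f);
}
```

The two forms are algebraically identical; only the rounding can differ by a tiny amount, which shouldn't matter for a threshold test.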

In Topic: HLSL compiler weird performance behavior

08 September 2014 - 06:18 PM

I had a quick play with the shader, and found compilation went much faster if I:


1. Manually inlined the function call (this had the biggest impact).

2. Used the [fastopt] attribute on the loop.

3. Instead of #2, disabled optimization completely in the shader compiler, which had a bigger impact.


Note that [fastopt] can make the compiler generate worse code, so I wouldn't recommend it outside of prototyping. The same goes for disabling optimization in the shader compiler. Having said that, the driver optimizes the shader too, so the runtime performance hit from either of those isn't usually very big.


As a side note, you can generally get away with 4x3 matrices for your bones, which cuts down on the size of the constant buffer and saves a few instructions in the shader.
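
To illustrate why the smaller matrix is enough, here is a rough C++ sketch. It assumes a column-vector convention with the translation in the last column, so the row that gets dropped is the constant (0, 0, 0, 1); your engine's layout may be transposed. The type and function names are hypothetical.

```cpp
#include <cassert>

struct Float3 { float x, y, z; };

// Full bone matrix: 16 floats (64 bytes).
struct Bone4x4 { float m[4][4]; };

// Compact bone matrix: three rows of (rotation/scale | translation),
// 12 floats (48 bytes).
struct Bone3x4 { float m[3][4]; };

static_assert(sizeof(Bone3x4) * 4 == sizeof(Bone4x4) * 3,
              "compact bones take three quarters of the space");

// Drop the last row. This is only valid for affine transforms,
// where that row is always (0, 0, 0, 1).
Bone3x4 compact(const Bone4x4& b) {
    assert(b.m[3][0] == 0.0f && b.m[3][1] == 0.0f &&
           b.m[3][2] == 0.0f && b.m[3][3] == 1.0f);
    Bone3x4 out{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 4; ++c)
            out.m[r][c] = b.m[r][c];
    return out;
}

// Transform a point (implicit w = 1): three dot products plus translation.
Float3 transformPoint(const Bone3x4& b, Float3 p) {
    return {
        b.m[0][0] * p.x + b.m[0][1] * p.y + b.m[0][2] * p.z + b.m[0][3],
        b.m[1][0] * p.x + b.m[1][1] * p.y + b.m[1][2] * p.z + b.m[1][3],
        b.m[2][0] * p.x + b.m[2][1] * p.y + b.m[2][2] * p.z + b.m[2][3],
    };
}
```

The saving is a quarter of the constant storage per bone, and the shader no longer has to compute a w component that is never used.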

In Topic: Intrinsics to improve performance of interpolation / mix functions

05 September 2014 - 06:09 PM

There's a standard trick to improve the performance of blurs: use the bilinear filtering hardware to halve the number of texture fetch instructions required.
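
As a rough sketch of how that trick is usually set up (my reading of it, not code from the thread): two neighbouring taps with weights w1 and w2 can be replaced by one bilinear fetch placed between them, with weight w1 + w2 and offset (o1*w1 + o2*w2) / (w1 + w2). The C++ below emulates a 1D linear filter on the CPU to show that the merged taps give the same result. `Tap`, `mergeTapPairs`, `sampleLinear` and `blur` are hypothetical names, and half-texel conventions are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One blur tap: texel offset from the centre, and its weight.
struct Tap { float offset; float weight; };

// Merge adjacent pairs of taps into single taps meant for a linear fetch.
// Assumes the taps in each pair are exactly one texel apart.
std::vector<Tap> mergeTapPairs(const std::vector<Tap>& taps) {
    std::vector<Tap> merged;
    for (std::size_t i = 0; i + 1 < taps.size(); i += 2) {
        const float w = taps[i].weight + taps[i + 1].weight;
        assert(w > 0.0f);  // a zero-weight pair could simply be dropped
        const float o = (taps[i].offset * taps[i].weight +
                         taps[i + 1].offset * taps[i + 1].weight) / w;
        merged.push_back({o, w});
    }
    if (taps.size() % 2 == 1) merged.push_back(taps.back());  // odd tap left as is
    return merged;
}

// CPU emulation of a 1D linear-filtered fetch with clamp addressing.
float sampleLinear(const std::vector<float>& texels, float pos) {
    assert(!texels.empty());
    const int last = static_cast<int>(texels.size()) - 1;
    const float base = std::floor(pos);
    const float frac = pos - base;
    const int i0 = std::min(std::max(static_cast<int>(base), 0), last);
    const int i1 = std::min(std::max(static_cast<int>(base) + 1, 0), last);
    return texels[i0] * (1.0f - frac) + texels[i1] * frac;
}

// Weighted sum of fetches around `centre`.
float blur(const std::vector<float>& texels, float centre,
           const std::vector<Tap>& taps) {
    float sum = 0.0f;
    for (const Tap& t : taps)
        sum += t.weight * sampleLinear(texels, centre + t.offset);
    return sum;
}
```

Calling `blur` with the original taps and with `mergeTapPairs(taps)` should agree up to rounding, with half as many fetches in the merged case. On real hardware the filter weights have limited precision, so expect small differences.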



In Topic: How to set Alpha value from pixel shader in SlimDX Direct3d9

01 September 2014 - 04:55 PM

It looks like the posted code isn't setting any of the alpha blend states, and I think the default is to not blend.
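
In native Direct3D 9 the states in question are D3DRS_ALPHABLENDENABLE, D3DRS_SRCBLEND and D3DRS_DESTBLEND. The C++ sketch below doesn't touch the API; it just emulates on the CPU what ends up in the render target with blending off versus a conventional D3DBLEND_SRCALPHA / D3DBLEND_INVSRCALPHA setup. That blend mode is my assumption about what the poster wants, and `Color`, `writeNoBlend` and `writeAlphaBlend` are made-up names.

```cpp
#include <cassert>

struct Color { float r, g, b, a; };

// Blending disabled: the shader output simply overwrites the destination,
// so the alpha you write has no visible effect on the colour.
Color writeNoBlend(Color src, Color /*dst*/) { return src; }

// Blending enabled with source factor = src alpha and
// destination factor = 1 - src alpha (assumed setup).
Color writeAlphaBlend(Color src, Color dst) {
    assert(src.a >= 0.0f && src.a <= 1.0f);
    const float inv = 1.0f - src.a;
    return {
        src.r * src.a + dst.r * inv,
        src.g * src.a + dst.g * inv,
        src.b * src.a + dst.b * inv,
        src.a * src.a + dst.a * inv,  // same factors apply to the alpha channel
    };
}
```

With a quarter-opaque red drawn over blue, the first function returns pure red and the second returns a mostly blue mix, which is the difference you'd see on screen.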

In Topic: copying back buffer not working

15 August 2014 - 04:57 PM

You can implement split screen without doing any copying of render targets.


The way you do that is to use viewports - the default viewport you get when you set a render target is a full size one, but you can change that to cover a different portion of the render target to do things like split screen.


The main advantage of that approach is that it will perform better than copying, especially if your copies go back and forth to system memory instead of staying on the GPU.
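
Here is a minimal C++ sketch of working out the viewport rectangles for a side-by-side split. The `Viewport` struct just mirrors the fields of D3DVIEWPORT9 (X, Y, Width, Height, MinZ, MaxZ) so the example compiles without the SDK; `splitViewport` is a hypothetical helper, and the side-by-side layout is only one possible choice.

```cpp
#include <cassert>

// Stand-in with the same fields as D3DVIEWPORT9.
struct Viewport {
    unsigned x, y, width, height;
    float minZ, maxZ;
};

// Viewport for one player when the render target is split into
// `playerCount` equal columns. Integer rounding is arranged so the
// columns tile the target exactly, with no gaps or overlap.
Viewport splitViewport(unsigned targetWidth, unsigned targetHeight,
                       unsigned player, unsigned playerCount) {
    assert(playerCount > 0 && player < playerCount);
    const unsigned left = player * targetWidth / playerCount;
    const unsigned right = (player + 1) * targetWidth / playerCount;
    return {left, 0u, right - left, targetHeight, 0.0f, 1.0f};
}
```

In a real renderer you would copy these values into a D3DVIEWPORT9 and call SetViewport() on the device before drawing each player's view, with a projection matrix whose aspect ratio matches the smaller rectangle.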


If you really need to copy bits of render target around, I'd suggest using StretchRect() with the help of GetSurfaceLevel(). That will avoid the copying to and from system memory.