
#4931275 Unable to create full-screen direct3d9Ex device

Posted by MJP on 14 April 2012 - 04:01 PM

You can turn on the debug runtimes to get more information about a failing D3D call: enable them in the DirectX control panel, and debug messages will be written to the native debug output stream. Since it's native debug output, you need native debugging enabled in your debugger, or a program like DebugView, to see the messages.

#4931274 [HLSL] Half vectors in shaders?

Posted by MJP on 14 April 2012 - 03:56 PM

I'm not entirely sure of how to get the data in the shader. If it were normal floats and vectors, I would have no problem, but it's not. Would it be alright to write a shader as if it's going to be accessing normal float data, or will I need to use half types, and somehow convert them to float?

You can use float types, and the GPU will automatically convert them to 32-bit float for the vertex shader.

So what you're saying is that for the above structure that I provided, I could write the following in HLSL?
struct VoxVertexIn
{
    float4 Position : POSITION;
    float4 Normal : NORMAL;
    float2 UV : TEXCOORD;
    float4 Color : COLOR;
};

Yes, that will work.

#4931248 [HLSL] Half vectors in shaders?

Posted by MJP on 14 April 2012 - 01:45 PM

I'm not entirely sure of how to get the data in the shader. If it were normal floats and vectors, I would have no problem, but it's not. Would it be alright to write a shader as if it's going to be accessing normal float data, or will I need to use half types, and somehow convert them to float?

You can use float types, and the GPU will automatically convert them to 32-bit float for the vertex shader.

#4931054 GLSL: how to determine uv sampling interval?

Posted by MJP on 13 April 2012 - 03:02 PM

dFdx and dFdy are what you're looking for. These will give you the rate of change of a variable in screen space. So for instance if your texture U coordinate is 0.2 at pixel (0,0) and it's 0.3 at (1, 0), then calling dFdx will give you a value of 0.1 since that's the difference in the X direction.

#4931050 Partial Alpha / Blend problems

Posted by MJP on 13 April 2012 - 02:57 PM

BC1 only provides you with 1 bit of alpha, which is why you get that weird "stenciled out" look with that format. BC2 and BC3 both allow for a full range of alpha values, so you should get much better results with either of them. Otherwise, if you don't want to use compression, R8G8B8A8_UNORM will give you sufficient precision for a .PNG file.

#4929677 A basic questions about shader resources (textures)

Posted by MJP on 09 April 2012 - 03:22 PM

You don't have to set unused slots back to NULL. It can make things easier when you're debugging in PIX, and it can also make it easier to avoid read/write conflicts that occur when you have a resource bound as both an input and an output. But in general it's not necessary, and I don't think there's much performance difference.

#4929238 warning X4714: ~'excessive temp registers'

Posted by MJP on 08 April 2012 - 12:47 AM

Yeah the compiler is extremely aggressive with optimizations, since all of the code run by a shader is compiled simultaneously (except in the case of dynamic shader linkage) and also because the HLSL language is pretty simple. This also means that function calls are pretty much always inlined, so using them typically has no effect on the resulting assembly. So you're free to organize your functions and #include's however you'd like, and you'll always end up with a tightly-optimized shader.

#4929100 warning X4714: ~'excessive temp registers'

Posted by MJP on 07 April 2012 - 11:43 AM

WTF. What makes that line so special, and how can I fix this?

Well when you comment out that line you're also going to cause the compiler to strip out all instructions needed to generate the data for that line. So in your case, all of the code in your other functions for getting the normal of the intersecting surface will get stripped out.

#4927110 Revival of Forward Rending?

Posted by MJP on 31 March 2012 - 11:52 PM

Sorry for the bump, but I put up a blog post with some performance numbers and a sample app. Feel free to use it for your own experiments.

#4926757 Constant buffer UpdateSubResource vs Map

Posted by MJP on 30 March 2012 - 12:38 PM

Nvidia recently recommended using Map + DISCARD for updating constant buffers, and said that their driver is capable of handling a very large number of small constant buffers. In my experience this seems to be good advice...I've used this approach for updating lots of constant buffers and it seems to work just fine. If you try to UpdateSubResource on a resource that is in use, then I believe the data will end up getting copied into the command buffer so that it can be later copied into the resource asynchronously. This is obviously not a good thing.

The presentation I mentioned is here, if you're interested (it's the first one, about managing buffers).

#4925416 Optimal cDeviceBufferSize parameter for D3DX10CreateSprite

Posted by MJP on 26 March 2012 - 12:08 PM

It's effectively the number of sprites that it will be able to batch into a single draw call. So if you only ever draw one sprite then it should be 1, since any more would be a waste of memory. If you have multiple sprites that use the same texture, then you should set it to the maximum number of sprites that will share a single texture during a frame (so that those sprites get batched together).

#4925415 Per-pixel sprite collision detection in DirectX 10

Posted by MJP on 26 March 2012 - 12:03 PM

Do you use the DEBUG flag when creating your device, and do you link to the debug version of D3DX10.lib? If you do that, you will get a more detailed explanation of the failure in the debug output window.

#4925229 Per-pixel sprite collision detection in DirectX 10

Posted by MJP on 25 March 2012 - 06:55 PM

In DX10 you can't read texture data back on the CPU if the texture is also usable by the GPU. This is because it's really slow to read data out of GPU-accessible memory. In DX9 you could usually do it if you used D3DPOOL_MANAGED, since that pool caused the runtime to keep a separate copy of resources in CPU memory.

What you really want to do is create your sprite texture with D3D10_USAGE_IMMUTABLE so that it lives only in GPU memory, and keep your collision mask in CPU memory only. If you need to read a texture on the CPU to generate your collision mask, or if you have your collision mask pre-generated and you just want to read it into CPU memory, then you should have a separate call to D3DX10CreateTextureFromFile() that specifies D3D10_USAGE_STAGING and D3D10_CPU_ACCESS_READ.

Also for your future reference, the rules for what the CPU and GPU can and can't access are listed here.

#4925228 Shaders and animated sprites

Posted by MJP on 25 March 2012 - 06:47 PM

The DX10 sprite class will actually handle batching and instancing, and also uses shaders (it's pretty much impossible not to use them in DX10 to draw anything). It can batch together multiple sprites as long as they use the same texture, which means different animation frames are ok. I'm not sure what you're doing that requires velocity in the shader, so you'd have to explain that further.

Semantics are used to match vertex shader inputs to the individual elements of a vertex inside of a vertex buffer. When you want to use a vertex buffer with a vertex shader, as part of doing that you need to create an input layout. To do this you supply an array of D3D10_INPUT_ELEMENT_DESC structures, with one for each element in your vertex buffer (with an element being a position, texture coordinate, color, normal, etc.). Part of that element description is a semantic string. The input assembler stage uses that semantic string to match vertex elements to your vertex shader input, which makes sure that you get the data that you want in the shader.

It's possible to generate the correct texture coordinates for a frame of animation in the vertex shader. You would need some info passed in indicating where the frame is on the texture (usually you use an xy offset plus a width and height) so that you could generate the proper texture coordinates. As an example, this is a stripped-down version of the vertex shader I use to render sprites:
cbuffer VSConstants : register(b0)
{
    float2 TextureSize;
    float2 ViewportSize;
    float4x4 Transform;
    float4 Color;
    float4 SourceRect;
};

struct VSInput
{
    float2 Position : POSITION;
    float2 TexCoord : TEXCOORD;
    float4x4 Transform : TRANSFORM;
    float4 Color : COLOR;
    float4 SourceRect : SOURCERECT;
};

struct VSOutput
{
    float4 Position : SV_Position;
    float2 TexCoord : TEXCOORD;
    float4 Color : COLOR;
};

VSOutput SpriteVS(in VSInput input)
{
    // Scale the quad so that it's texture-sized
    float4 positionSS = float4(input.Position * SourceRect.zw, 0.0f, 1.0f);

    // Apply transforms in screen space
    positionSS = mul(positionSS, Transform);

    // Scale by the viewport size, flip Y, then rescale to device coordinates
    float4 positionDS = positionSS;
    positionDS.xy /= ViewportSize;
    positionDS.xy = positionDS.xy * 2.0f - 1.0f;
    positionDS.y *= -1.0f;

    // Figure out the texture coordinates
    float2 outTexCoord = input.TexCoord;
    outTexCoord.xy *= SourceRect.zw / TextureSize;
    outTexCoord.xy += SourceRect.xy / TextureSize;

    VSOutput output;
    output.Position = positionDS;
    output.TexCoord = outTexCoord;
    output.Color = input.Color;

    return output;
}
So in my shader "SourceRect" has the offset in XY and the width/height in ZW, both in texel units. So if you had a 512x256 texture with two frames side-by-side, you would use a SourceRect of (0, 0, 256, 256) to draw the first frame and (256, 0, 256, 256) to draw the second frame. Or like I mentioned earlier you can also use a texture array if you want, in which case you would only need to pass an index to your pixel shader to know which frame to use. However this is a little more advanced.

This shader only works for drawing one sprite at a time. If you wanted to batch lots of sprites using the same texture, you could use instancing and pass the SourceRect + Transform + Color through a second vertex buffer and then have your shader access them as vertex inputs rather than through a constant buffer. Alternatively, you could have an array of such data in your constant buffer and use SV_InstanceID to get the index to use.

#4924951 Loading of 3D Textures from a 2D texture

Posted by MJP on 24 March 2012 - 12:57 PM

You're not going to be able to do it that way...the D3DX texture loader isn't going to know how to reinterpret your 2D texture as a 3D texture. You'll need to load it as a 2D texture first (with D3D11_USAGE_STAGING and D3D11_CPU_ACCESS_READ so that you can read it on the CPU), and then pass the data to CreateTexture3D through the pInitialData parameter. As long as your data is in x->y->z byte order it should work fine, provided you set SysMemPitch and SysMemSlicePitch correctly: SysMemPitch should be 32 * formatBytesPerPixel, and SysMemSlicePitch should be the pitch value you get when you call Map on your 2D staging texture.