Member Since 29 Mar 2007

#5166606 Memory alignment problem (CPU and GPU)

Posted by MJP on 13 July 2014 - 12:54 PM

You seem to have 2 problems:


1. You specify that your positions and normals are using a 16-bit float format, but your vertex struct contains 32-bit floats. You should use DXGI_FORMAT_R32G32B32A32_FLOAT, since that corresponds to the XMFLOAT4 type that you're using.

2. Your input layout has a "COLOR" element, but this element is not present in your VertexInfo struct.
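Both problems come down to the input layout describing something other than the struct you actually upload. Here's a minimal, portable sketch of that consistency check. It doesn't use the D3D11 headers: `LayoutElement`, `layoutMatchesStride`, and the stand-in `VertexInfo` below are hypothetical names for illustration only, and the sketch assumes the struct holds just a position and a normal, tightly packed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the vertex struct: two 4-component, 32-bit float
// attributes (the same memory layout as two XMFLOAT4 members).
struct VertexInfo {
    float position[4];
    float normal[4];
};

// Bytes per element for the two formats discussed above.
constexpr std::size_t kBytesHalf4  = 4 * sizeof(std::uint16_t); // R16G16B16A16_FLOAT
constexpr std::size_t kBytesFloat4 = 4 * sizeof(float);         // R32G32B32A32_FLOAT

static_assert(sizeof(VertexInfo) == 2 * kBytesFloat4,
              "VertexInfo should be exactly two float4 attributes");

// Simplified input layout entry (illustrative, not D3D11_INPUT_ELEMENT_DESC).
struct LayoutElement {
    const char* semantic;
    std::size_t offset;
    std::size_t sizeInBytes;
};

// Returns true when the elements are tightly packed in order and together
// cover exactly one vertex (i.e. their total size equals the stride).
bool layoutMatchesStride(const LayoutElement* elems, std::size_t count,
                         std::size_t stride) {
    assert(elems != nullptr || count == 0);
    std::size_t expectedOffset = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (elems[i].offset != expectedOffset) {
            return false;
        }
        expectedOffset += elems[i].sizeInBytes;
    }
    return expectedOffset == stride;
}
```

A layout with two 16-byte elements matches `sizeof(VertexInfo)`. One that uses the 8-byte half-float size, or that appends an extra "COLOR" element, does not.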

#5166494 Recompile a Shader

Posted by MJP on 12 July 2014 - 06:01 PM

The old D3DX helper library could assemble D3D9 shaders with D3DXAssembleShader. D3D10 and higher don't support assembling shaders, so you have to compile from HLSL.

#5166310 hyphotetical raw gpu programming

Posted by MJP on 11 July 2014 - 04:26 PM

There aren't any tools for directly generating and running the raw ISA of a GPU. GPUs are intended to be used through drivers and 3D graphics or GPGPU APIs, which abstract away a lot of the specifics of the hardware. AMD publicly documents the shader ISA, register set, and command buffer format of their GPUs. With this information you technically have enough to build your own driver layer for setting up a GPU's registers and memory, issuing commands, and running your raw shader code. However this would be incredibly difficult in practice, and would require a lot of general familiarity with GPUs, your operating system, and the specific GPU that you're targeting. And of course by the time you've finished there might be new GPUs on the market with different ISAs and registers.

#5166308 Mipmaps not used with multiplied texture coordinates

Posted by MJP on 11 July 2014 - 04:19 PM

So assuming all of your shader code is correct, there are really 3 possibilities for why it's not working correctly:


1. The value in the constant buffer is somehow wrong or not getting bound to your shader correctly due to a bug in your code

2. The HLSL shader compiler is producing incorrect or unintended code that doesn't use the right constant buffer value

3. There's a bug in the driver causing incorrect behavior


#2 and #3 are pretty easy to verify. For #2 you can just compile your shader with fxc, and look at the generated assembly code to verify that it's doing what you expect. If you're not familiar with shader assembly, you can post the result here and I can help you decipher it. For #3, you can validate the driver behavior by running your program using the reference device. Beware that it's very slow, but will always produce correct results according to the D3D spec.

#5166090 DXT for terrain normals?

Posted by MJP on 10 July 2014 - 04:49 PM

The old trick for DXT5 (AKA BC3) is to stuff the X and Y into the A and G channels of the texture, and then reconstruct Z. If your hardware supports it, BC5 (AKA 3Dc) is a much better option (it's basically 2 DXT5 alpha channels put together). BC7 can also work for normals, but that requires even newer hardware and APIs.
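To make the "reconstruct Z" step concrete, here's a small CPU-side sketch of the decode. It assumes the channels are stored as unsigned values in [0, 1] that get remapped to [-1, 1], and that the normal is unit length with a non-negative Z (reasonable for tangent-space or upward-facing terrain normals). `Texel`, `Normal`, and `decodeDxt5Normal` are made-up names for illustration. In a real shader you'd do the same math on the sampled value.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// A sampled texel with channels in [0, 1] (hypothetical helper type).
struct Texel {
    float r, g, b, a;
};

struct Normal {
    float x, y, z;
};

// Decodes a normal stored with X in alpha and Y in green.
Normal decodeDxt5Normal(const Texel& t) {
    assert(t.a >= 0.0f && t.a <= 1.0f);
    assert(t.g >= 0.0f && t.g <= 1.0f);

    // Remap from [0, 1] to [-1, 1] (assumes unsigned storage).
    const float x = t.a * 2.0f - 1.0f;
    const float y = t.g * 2.0f - 1.0f;

    // For a unit-length normal, z = sqrt(1 - x^2 - y^2). Clamp before the
    // square root so compression error can't push the argument negative.
    const float zSquared = std::max(0.0f, 1.0f - x * x - y * y);
    return Normal{x, y, std::sqrt(zSquared)};
}
```

The clamp matters in practice: block compression can nudge X and Y so that x² + y² ends up slightly above 1, and without it you'd get a NaN.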

#5166089 Scaling on available memory

Posted by MJP on 10 July 2014 - 04:44 PM

I've never personally shipped a PC game, so I don't really have first-hand experience with this. For our current title we have a streaming system where the majority of our textures get streamed in based on proximity and visibility, and then get placed into a pool of fixed-size memory. With our setup it's easy to know if we can fit a texture in the pool before streaming it in, so we can drop the higher-res mips if they won't fit. However this isn't really something you want to do in a shipping game, since it means players might unpredictably get quality loss on key textures depending on how they play through the game and what hardware they have. We really just do it so that we don't crash during development, and so that we can fix the content for cases where we're over-budget. Obviously this is a lot easier to do when you're targeting a console with a fixed memory spec.
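As a rough illustration of the "drop the higher-res mips if they won't fit" check, here's a minimal sketch. It is not our actual streaming code: `mipsToDrop` is a hypothetical helper, and it assumes you already know the byte size of each mip level (largest first) and how many bytes are free in the pool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns how many of the top (highest-resolution) mips must be dropped so
// that the rest of the chain fits in freeBytes. mipSizes[0] is the size of
// mip 0 in bytes, and sizes are expected to be non-increasing. A result
// equal to mipSizes.size() means not even the smallest mip fits.
std::size_t mipsToDrop(const std::vector<std::uint64_t>& mipSizes,
                       std::uint64_t freeBytes) {
    // Precondition: largest mip first.
    assert(std::is_sorted(mipSizes.rbegin(), mipSizes.rend()));

    std::uint64_t total = 0;
    for (std::uint64_t size : mipSizes) {
        total += size;
    }

    // Peel off the largest remaining mip until the chain fits.
    std::size_t dropped = 0;
    while (dropped < mipSizes.size() && total > freeBytes) {
        total -= mipSizes[dropped];
        ++dropped;
    }
    return dropped;
}
```

For a chain of 1024, 256, 64, and 16 bytes, a pool with 400 bytes free would need mip 0 dropped, leaving 336 bytes to stream in.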

Based on what I've seen from playing games and reading reviews, I would suspect that most games don't bother trying to adaptively scale back textures and instead just rely on the user adjusting settings. In D3D you can over-commit memory, in which case the OS just starts paging your GPU memory in and out. So your performance starts to drop off a cliff, but you don't crash. Hence you get benchmarks where at certain settings a 2GB video card will get significantly better performance than a 1GB video card with the same GPU core.

#5165941 [HLSL] Copy content of Consume Buffer to Structured Buffer

Posted by MJP on 09 July 2014 - 10:50 PM

A consume buffer is just a special case of a structured buffer. If you want to read values from the buffer instead of consuming, just create a shader resource view for your buffer and bind it to your context. You can then just access that shader resource view as a StructuredBuffer in your shader.

#5165871 Precompiled shaders

Posted by MJP on 09 July 2014 - 12:25 PM

"Which makes me wonder how Microsoft have got it to work at all. The hardware is no different from the chipsets Microsoft are running on. I sometimes wonder if HLSL compiles to pseudocode which gets 'recompiled' when loaded. Don't know, would be interesting to know."


Like Alessio mentioned, the HLSL compiler produces hardware-agnostic shader assembly as output. It's basically an intermediate bytecode format, and it's JIT compiled by the driver at runtime into the native microcode ISA used by the GPU. It's rather similar to the process used by Java and .NET to generate runtime code. Compared to OpenGL it has a few advantages, namely that all of the language parsing and semantic analysis is done through a unified front-end rather than having different implementations per-driver. It also can do a majority of optimizations (inlining, constant folding, dead-code stripping, etc.), which is nice since you can do that offline instead of having to do it at runtime. The downside is that the compiler can only target the virtual ISA, and the JIT compiler that produces the actual microcode won't have full knowledge of the original code structure when performing optimizations.

FYI Nvidia also has PTX which serves a similar role for CUDA, and AMD has IL.

#5165642 DX11 Multithreaded Resource Creation Question

Posted by MJP on 08 July 2014 - 02:50 PM

In theory everything should work fine, since the runtime itself will synchronize to prevent the driver from having concurrent resource creates if it doesn't support it. But I've never tested on a driver that didn't support concurrent resource creation, so I can't verify anything.

#5165432 How can i have a softer Skin using BRDF

Posted by MJP on 07 July 2014 - 08:16 PM

Yeah we always use the same map, we don't animate it.

#5165119 How can i have a softer Skin using BRDF

Posted by MJP on 06 July 2014 - 03:41 PM

In my experience you really need to use curvature maps for the pre-integrated shading. The trick with using derivatives is clever and might work for certain simple scenarios, but the fact that it has discontinuities across triangle edges is pretty much a deal-breaker for faces.


#5165117 Getting started with D3D in C#: InvalidCallException

Posted by MJP on 06 July 2014 - 03:36 PM

In D3D9 you can have a "Lost Device" scenario, and you have to recover from it by calling Reset on your device. It's a real pain to deal with in a robust way, and is the source of a lot of errors. Fortunately they fixed it for D3D10/D3D11.

I should definitely warn you that the "Managed DirectX" library you're using is very old, and hasn't been updated in many years. Microsoft essentially abandoned it, and left the task of a managed DirectX wrapper up to the community. Earlier on SlimDX was the most popular Managed DX wrapper, but these days SharpDX is much more active and up-to-date with the latest DirectX versions. I would recommend using SharpDX instead if you're looking to actually do any DX app development, and I would also recommend using D3D11 unless you really need to target Windows XP.


Posted by MJP on 29 June 2014 - 04:06 PM

You can always create your vertex buffer as a structured buffer or byte address buffer, and then use SV_VertexID to manually unpack your data in the shader. However it's almost certainly not going to save you any performance.

#5163128 DXT5 artifacts

Posted by MJP on 26 June 2014 - 06:37 PM

With BC5 the X and Y components will be in the red and green channels of the texture, so you would need to change your shader code to use .xy instead of .wy.

There are lots of ways you can verify the format of the texture that's created: you could step into the loader code and see what it does, you could call GetDesc on the ID3D11Texture2D to see what format it's using, or you could use a tool like VS Graphics Debugger or Renderdoc to inspect the texture.

DirectXTex has functions for compressing textures, which you could incorporate into a custom tool. It also comes with the texconv sample, which is a command-line tool that you can use for compressing files and generating mips.

#5163127 Getting around non-connected vertex gaps in hardware tessellation displacemen...

Posted by MJP on 26 June 2014 - 06:33 PM

It's a common problem, and it's pretty hard to work around. You should read through this presentation for some ideas.