DX11 Can pre-compiled shaders be provided in the Release version of a game?

Recommended Posts

From what the MSDN states, there are two ways of compiling HLSL shaders: either at runtime or "offline" -- using a tool like fxc.exe, for instance

My question is, are there any risks in using pre-compiled shaders in the final game? I mean, is there any situation in which the pre-compiled shaders might not work?

Or ideally shaders should always be compiled when lauching the game?

Share this post

Link to post
Share on other sites

Ideally shaders would always be precompiled.

6 minutes ago, thefoxbard said:

I mean, is there any situation in which the pre-compiled shaders might not work?

Shaders are pre-compiled for a particular target / shader model -- e.g. SM5 shaders for FEATURE_LEVEL_11.

If you try to load these on FEATURE_LEVEL_10, they won't work... So, you have to precompile your shaders once for each target that you wish to support.

e.g. if you want to support D3D10, D3D10.1 and D3D11 hardware, then compile your shaders for SM4, SM4.1 and SM5, and then load the correct set of precompiled shaders at runtime.

Share this post

Link to post
Share on other sites
22 minutes ago, thefoxbard said:

For some reason I thought games compiled shaders at the user's machine before execution.

They do in OpenGL, and you can in D3D if you like. If you had a lot of procedurally generated content then you might prefer it, perhaps...

Share this post

Link to post
Share on other sites

It's instructive in cases such as this to examine how the compilation process actually works.

Under D3D compiling a shader is a two-stage process.  Stage 1 compiles the text-based HLSL code to hardware-independent binary blobs and this is the stage that is carried out by fxc.exe or D3DCompile (which are actually the same thing but I'll get to that in a moment).  Stage 2 always happens in your program and this stage takes the hardware-independent binary blobs and converts them to actual shader objects (i.e. ID3D11VertexShader, etc); this stage is exposed by the ID3D11Device::CreateVertexShader/etc API calls.

The point I made above is that fxc.exe and D3DCompile are actually the same thing; fxc.exe actually calls D3DCompile to compile it's shaders, and this is something you can confirm by using a tool such as e.g. Dependency Walker.  You can also infer that D3DCompile and D3DCompile2 are hardware-independent by the fact that they don't take an ID3D11Device as a parameter (in other words Direct3D doesn't even need to be initialized in order to compile a shader; this is a pure software stage).

Microsoft's own advice on this matter is given at the page titled HLSL, FXC, and D3DCompile and I'll quote:


As we have recommended for many years, developers should compile their HLSL shaders at build-time rather than rely on runtime compilation. There is little benefit to doing runtime compilation for most scenarios, as vendor-specific micro-optimizations are done by the driver at runtime when converting the HLSL binary shader blobs to vendor-specific instructions as part of the shader object creation step. Developers don’t generally want the HLSL compiler results to change ‘in the field’ over time, so it makes more sense to do compilation at build-time. That is the primary usage scenario expected with the Windows 8.0 SDK and Visual Studio 11, and is the only supported scenario for Windows Store apps (a.k.a Metro style apps) in Windows 8.0.

This page then goes on to discuss scenarios in which you may, despite this advice, still wish to compile at runtime, such as for development purposes; runtime compilation is particularly useful for shader development if you're able to quickly reload and recompile your shaders while the program is running, and see the effects of code changes immediately.

Finally, and this may not be immediately obvious, but you can actually compile your shaders for a lower target than the D3D device you end up creating and ship a single set of precompiled shaders that way.  For example, if you set a minimum of D3D10 class hardware, you can compile your shaders for SM4 and set up your feature levels appropriate, and they'll work on D3D_FEATURE_LEVEL_10_0, D3D_FEATURE_LEVEL_10_1, D3D_FEATURE_LEVEL_11_0 or higher.

Share this post

Link to post
Share on other sites

This also means that you actually can't compile to any HW-dependent format with D3D11 on PC at all, it will all happen in the driver at run-time, as stated above. It can cost measurable amounts of time, so in order to reduce spikes during gameplay on PC (when accessing the shader for the first time), "pre-warm" your shader cache at load-time. On consoles, on the other hand, you can't compile in runtime (at all!) and must load everything as final microcode for the target HW. So it makes sense to build offline under all cases + compile online only for rapid development on PC.

Share this post

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Announcements

  • Forum Statistics

    • Total Topics
    • Total Posts
  • Similar Content

    • By joeblack
      im reading about specular aliasing because of mip maps, as far as i understood it, you need to compute fetched normal lenght and detect now its changed from unit length. I’m currently using BC5 normal maps, so i reconstruct z in shader and therefore my normals are normalized. Can i still somehow use antialiasing or its not needed? Thanks.
    • By 51mon
      I want to change the sampling behaviour to SampleLevel(coord, ddx(coord.y).xx, ddy(coord.y).xx). I was just wondering if it's possible without explicit shader code, e.g. with some flags or so?
    • By GalacticCrew
      I want to improve the performance of my game (engine) and some of your helped me to make a GPU Profiler. After creating the GPU Profiler, I started to measure the time my GPU needs per frame. I refined my GPU time measurements to find my bottleneck.
      Searching the bottleneck
      Rendering a small scene in an Idle state takes around 15.38 ms per frame. 13.54 ms (88.04%) are spent while rendering the scene, 1.57 ms (10.22%) are spent during the SwapChain.Present call (no VSync!) and the rest is spent on other tasks like rendering the UI. I further investigated the scene rendering, since it takes über 88% of my GPU frame rendering time.
      When rendering my scene, most of the time (80.97%) is spent rendering my models. The rest is spent to render the background/skybox, updating animation data, updating pixel shader constant buffer, etc. It wasn't really suprising that most of the time is spent for my models, so I further refined my measurements to find the actual bottleneck.
      In my example scene, I have five animated NPCs. When rendering these NPCs, most actions are almost for free. Setting the proper shaders in the input layout (0.11%), updating vertex shader constant buffers (0.32%), setting textures (0.24%) and setting vertex and index buffers (0.28%). However, the rest of the GPU time (99.05% !!) is spent in two function calls: DrawIndexed and DrawIndexedInstance.
      I searched this forum and the web for other articles and threads about these functions, but I haven't found a lot of useful information. I use SharpDX and .NET Framework 4.5 to develop my game (engine). The developer of SharpDX said, that "The method DrawIndexed in SharpDX is a direct call to DirectX" (Source). DirectX 11 is widely used and SharpDX is "only" a wrapper for DirectX functions, I assume the problem is in my code.
      How I render my scene
      When rendering my scene, I render one model after another. Each model has one or more parts and one or more positions. For example, a human model has parts like head, hands, legs, torso, etc. and may be placed in different locations (on the couch, on a street, ...). For static elements like furniture, houses, etc. I use instancing, because the positions never change at run-time. Dynamic models like humans and monster don't use instancing, because positions change over time.
      When rendering a model, I use this work-flow:
      Set vertex and pixel shaders, if they need to be updated (e.g. PBR shaders, simple shader, depth info shaders, ...) Set animation data as constant buffer in the vertex shader, if the model is animated Set generic vertex shader constant buffer (world matrix, etc.) Render all parts of the model. For each part: Set diffuse, normal, specular and emissive texture shader views Set vertex buffer Set index buffer Call DrawIndexedInstanced for instanced models and DrawIndexed models What's the problem
      After my GPU profiling, I know that over 99% of the rendering time for a single model is spent in the DrawIndexedInstanced and DrawIndexed function calls. But why do they take so long? Do I have to try to optimize my vertex or pixel shaders? I do not use other types of shaders at the moment. "Le Comte du Merde-fou" suggested in this post to merge regions of vertices to larger vertex buffers to reduce the number of Draw calls. While this makes sense to me, it does not explain why rendering my five (!) animated models takes that much GPU time. To make sure I don't analyse something I wrong, I made sure to not use the D3D11_CREATE_DEVICE_DEBUG flag and to run as Release version in Visual Studio as suggested by Hodgman in this forum thread.
      My engine does its job. Multi-texturing, animation, soft shadowing, instancing, etc. are all implemented, but I need to reduce the GPU load for performance reasons. Each frame takes less than 3ms CPU time by the way. So the problem is on the GPU side, I believe.
    • By noodleBowl
      I was wondering if someone could explain this to me
      I'm working on using the windows WIC apis to load in textures for DirectX 11. I see that sometimes the WIC Pixel Formats do not directly match a DXGI Format that is used in DirectX. I see that in cases like this the original WIC Pixel Format is converted into a WIC Pixel Format that does directly match a DXGI Format. And doing this conversion is easy, but I do not understand the reason behind 2 of the WIC Pixel Formats that are converted based on Microsoft's guide
      I was wondering if someone could tell me why Microsoft's guide on this topic says that GUID_WICPixelFormat40bppCMYKAlpha should be converted into GUID_WICPixelFormat64bppRGBA and why GUID_WICPixelFormat80bppCMYKAlpha should be converted into GUID_WICPixelFormat64bppRGBA
      In one case I would think that: 
      GUID_WICPixelFormat40bppCMYKAlpha would convert to GUID_WICPixelFormat32bppRGBA and that GUID_WICPixelFormat80bppCMYKAlpha would convert to GUID_WICPixelFormat64bppRGBA, because the black channel (k) values would get readded / "swallowed" into into the CMY channels
      In the second case I would think that:
      GUID_WICPixelFormat40bppCMYKAlpha would convert to GUID_WICPixelFormat64bppRGBA and that GUID_WICPixelFormat80bppCMYKAlpha would convert to GUID_WICPixelFormat128bppRGBA, because the black channel (k) bits would get redistributed amongst the remaining 4 channels (CYMA) and those "new bits" added to those channels would fit in the GUID_WICPixelFormat64bppRGBA and GUID_WICPixelFormat128bppRGBA formats. But also seeing as there is no GUID_WICPixelFormat128bppRGBA format this case is kind of null and void
      I basically do not understand why Microsoft says GUID_WICPixelFormat40bppCMYKAlpha and GUID_WICPixelFormat80bppCMYKAlpha should convert to GUID_WICPixelFormat64bppRGBA in the end
    • By DejayHextrix
      Hi, New here. 
      I need some help. My fiance and I like to play this mobile game online that goes by real time. Her and I are always working but when we have free time we like to play this game. We don't always got time throughout the day to Queue Buildings, troops, Upgrades....etc.... 
      I was told to look into DLL Injection and OpenGL/DirectX Hooking. Is this true? Is this what I need to learn? 
      How do I read the Android files, or modify the files, or get the in-game tags/variables for the game I want? 
      Any assistance on this would be most appreciated. I been everywhere and seems no one knows or is to lazy to help me out. It would be nice to have assistance for once. I don't know what I need to learn. 
      So links of topics I need to learn within the comment section would be SOOOOO.....Helpful. Anything to just get me started. 
      Dejay Hextrix 
  • Popular Now