Meltac

Speed up shader compilation (HLSL)

34 posts in this topic

Hi all!

Given the fact that I have no option to pre-compile my shaders in the game I'm modding because it's always compiling the HLSL source files on every start-up (and again whenever you load a savegame) - what options would I have to speed up the game's just-in-time compilation?

I mean, my shaders are getting fairly complex, and by now they take quite some time to compile on every game start-up, which kind of sucks. I know I could try to clean up my files, remove comments, empty lines, unused functions and variables, and even try to simplify functions and control flow in order to make compilation simpler and more straightforward (e.g. prevent some branching). But I'm afraid that wouldn't help much, simply due to the complexity of what I need my shaders to do, so the improvement wouldn't justify the large effort.

So, are there other ways to make compilation faster? For example, any hacked / tweaked DX9 binaries or "injected" optimized compilers that, once installed, are executed automatically instead of the originals when the game asks DirectX to do the compilation? Or some way to nest shader assembly code inside an HLSL file (pixel or vertex shader source), which I could generate offline with the fxc compiler before the game compiles the rest of the shader sources? Or any other approaches which I might not have thought of yet?

Any thoughts would be appreciated.


You won't speed anything up by removing comments, dead code, useless vars and such, since the lexical/syntactic analysis isn't slow. What's slow is register allocation and I'm afraid there's usually not much you can do. We're getting into 15-30 minute compile times with our complex DX11 shaders (and we have hundreds) and what sometimes helps (with compile time):
- manually unroll loops (works better (in terms of compilation time) than using [unroll], [fastopt] or whatever compiler hints)
- especially true for nested loops!
- the deeper the called function, the worse
- look for redundant texture sampling which could be pulled out of loops or functions - at runtime you'd get a cache hit anyway, but it takes longer to compile

What doesn't help (neither compilation speed nor performance):
- trying to manually optimise ALU operations

I guess most of this will be true for DX9, too.
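
To make the points about manual unrolling and pulling redundant samples out of loops concrete, here is a minimal HLSL sketch. It is not taken from anyone's actual shaders: the sampler names (s_scene, s_noise), the function names and the 4-tap count are made up for illustration. The first function is a small loop with a loop-invariant fetch; the second does the same work with the fetch hoisted and the loop unrolled by hand.

```hlsl
// Hypothetical samplers, for illustration only.
sampler2D s_scene;
sampler2D s_noise;

// Before: the noise fetch uses the same coordinates on every iteration,
// and the compiler has to deal with the loop itself.
float4 AccumulateLoop(float2 uv, float2 stride)
{
    float4 sum = 0;
    for (int i = 0; i < 4; i++)
    {
        float4 n = tex2D(s_noise, uv);              // loop-invariant fetch
        sum += tex2D(s_scene, uv + stride * i) * n;
    }
    return sum;
}

// After: same result, with the invariant fetch done once
// and the four taps written out by hand.
float4 AccumulateUnrolled(float2 uv, float2 stride)
{
    float4 n = tex2D(s_noise, uv);
    float4 sum = tex2D(s_scene, uv);
    sum += tex2D(s_scene, uv + stride);
    sum += tex2D(s_scene, uv + stride * 2);
    sum += tex2D(s_scene, uv + stride * 3);
    return sum * n;
}
```

Whether this actually shortens compilation depends on the compiler version and the rest of the shader, so it is worth timing both variants.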


Thanks, this is already helpful. I always had the impression that a lot of texture sampling causes long compilation times, although I don't know why that is.

What about arithmetic stuff? I have the impression that sometimes, when I simply add another formula with one or two additional variables, the compilation slows down significantly. Are there, besides texture lookups, some operations that are rather expensive and could be replaced by other expressions doing the same job?

And what about using pre-compiled parts in source files? Is there a way to do this (I remember old Pascal times where I had to place assembly code within Pascal code in order to speed up graphical processing)?


Loop unrolling is usually the big compile-time killer. You can usually avoid it using the [loop] attribute. Aside from that, multithreading helps if you've got multiple cores to play with; it's pretty trivial to compile one shader per thread.

There's no way to inline assembly in HLSL shaders. You can manually assemble a shader from chunks of pre-compiled assembly if you'd like, but that's about it.
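
For reference, a loop marked with the [loop] attribute looks roughly like the sketch below. It assumes a shader model 3.0 target and a compiler that accepts HLSL attributes; s_scene, SumTaps and the tap count are hypothetical names and values. tex2Dlod is used inside the loop because gradient-based tex2D inside a real (non-unrolled) loop can make the compiler fall back to unrolling, or fail.

```hlsl
// Hypothetical sampler, for illustration only.
sampler2D s_scene;

float4 SumTaps(float2 uv, float2 stride)
{
    float4 sum = 0;

    // [loop] asks the compiler to emit a real loop instead of unrolling it.
    [loop]
    for (int i = 0; i < 8; i++)
    {
        // Explicit mip level 0 (the w component), so no gradients are needed.
        sum += tex2Dlod(s_scene, float4(uv + stride * i, 0, 0));
    }
    return sum;
}
```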


Thank you. Unfortunately the compiler being used doesn't support HLSL attributes like [loop], so I fear I'll have to unroll my loops manually in order to prevent the compiler from doing this.


As it turned out, most of the compilation time was caused by the combination of nested function calls and texture sampling calls. The former I've been able to reduce massively, but the latter gives me kind of a headache, because I need a large number of texture samples in order to achieve the image quality I'd like to have.

So is there any way to speed up texture sampling / lookups specifically? And why does this type of operation take such amounts of time to compile?


Worst case I encountered so far is over 3 minutes, whereas the original (un-modded) shaders take less than 5 seconds. Longer compilation times than that (due to yet more complex shader code) cause the game to crash upon start-up. And that duration is caused by one single function within one single shader file.


[quote name='Meltac' timestamp='1335957915' post='4936728']
Given the fact that I have no option to pre-compile my shaders in the game I'm modding because it's always compiling the HLSL source files on every start-up (and again whenever you load a savegame) - what options would I have to speed up the game's just-in-time compilation?
[/quote]
Just out of curiosity, why can't you ship precompiled shaders?
As has been said, nested loops with texture sampling can be damn slow, as the compiler is trying to calculate correct ddx/ddy. If you can calculate them yourself, or you can afford to sample a single mipmap, the shader will compile more quickly.


[quote name='xoofx' timestamp='1336863824' post='4939653']
Just out of curiosity, why can't you ship precompiled shaders?
[/quote]

Simply because the modded game (STALKER) doesn't support it. The game expects all shaders to be located in a specific sub-folder as plain HLSL source code. Nothing else will work. But please don't ask me why the devs decided not to allow compiled shaders...

[quote name='xoofx' timestamp='1336863824' post='4939653']
As has been said, nested loops with texture sampling can be damn slow, as the compiler is trying to calculate correct ddx/ddy. If you can calculate them yourself, or you can afford to sample a single mipmap, the shader will compile more quickly.
[/quote]

That sounds interesting and might be helpful. I guess I'd need to use a function other than tex2D, so which one would be required here? And how would I sample a single mipmap?


This blog post by MJP might come in handy

[url="http://mynameismjp.wordpress.com/2012/04/13/a-quick-note-on-shader-compilers/"]http://mynameismjp.wordpress.com/2012/04/13/a-quick-note-on-shader-compilers/[/url]

He managed to reduce compile time for a compute shader from 10 minutes to 45 seconds!


[quote name='hupsilardee' timestamp='1337159454' post='4940629']
This blog post by MJP might come in handy

[url="http://mynameismjp.wordpress.com/2012/04/13/a-quick-note-on-shader-compilers/"]http://mynameismjp.w...ader-compilers/[/url]

He managed to reduce compile time for a compute shader from 10 minutes to 45 seconds!
[/quote]

Interesting, thanks. However since I don't have an option to change the compiler in use I'll have to tweak my code in order to compile faster with the built-in compiler.


Big thanks, the new Windows 8 SDK fxc.exe indeed compiles MUCH faster (130 seconds before vs 15 seconds now!!!). I just hope the compiled fxo will continue working with existing drivers and Windows 7 / DirectX 11 SDKs and runtimes...

What sucks is that they're dropping D3DX from SDK 8 :-( That means a lot of rewriting :-(


[quote name='pcmaster' timestamp='1337175348' post='4940663']
Big thanks, the new Windows 8 SDK fxc.exe indeed compiles MUCH faster (130 seconds before vs 15 seconds now!!!). I just hope the compiled fxo will continue working with existing drivers and Windows 7 / DirectX 11 SDKs and runtimes...
[/quote]

It works fine, they didn't change the shader binary format at all.


Just out of curiosity, by "compiler", do you mean fxc.exe only, or also the compiler methods nested in D3DX9_XX.dll or maybe other DLLs as well?

But may we get back on topic, please:

[quote name='Meltac' timestamp='1337157662' post='4940625']
As has been said, nested loops with texture sampling can be damn slow, as the compiler is trying to calculate correct ddx/ddy. If you can calculate them yourself, or you can afford to sample a single mipmap, the shader will compile more quickly.

That sounds interesting and might be helpful. I guess I'd need to use a function other than tex2D, so which one would be required here? And how would I sample a single mipmap?
[/quote]

So, any suggestions here? Thanks.


In later versions of the SDK the shader compiler is entirely hosted in D3DCompiler_xx.dll. That DLL is then used by fxc.exe and the D3DX functions, or it can be used directly.

You can specify the mipmap level explicitly with tex2Dlod. You can also use tex2Dgrad if you want to specify the UV gradients instead of a mip level.
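
As a rough illustration of both options, here is a minimal sketch. It assumes a shader model 3.0 target; s_image, the function names and the 5-tap loop are hypothetical, not from the game's shaders. The first function samples a fixed mip level with tex2Dlod; the second computes the UV gradients once, outside the loop, and passes them to tex2Dgrad.

```hlsl
// Hypothetical sampler, for illustration only.
sampler2D s_image;

// Option 1: fixed mip level. tex2Dlod takes a float4:
// xy = texture coordinates, w = mip level (0 = full resolution).
float4 BlurFixedMip(float2 uv, float2 texel)
{
    float4 sum = 0;
    for (int i = -2; i <= 2; i++)
    {
        sum += tex2Dlod(s_image, float4(uv + texel * i, 0, 0));
    }
    return sum / 5;
}

// Option 2: explicit gradients, computed once from the unmodified UVs
// so the compiler does not need to derive them for each tap.
float4 BlurWithGradients(float2 uv, float2 texel)
{
    float2 dx = ddx(uv);
    float2 dy = ddy(uv);

    float4 sum = 0;
    for (int i = -2; i <= 2; i++)
    {
        sum += tex2Dgrad(s_image, uv + texel * i, dx, dy);
    }
    return sum / 5;
}
```

Note that the fixed-mip variant gives up automatic mip selection, so distant or minified surfaces may shimmer; the gradient variant keeps mip selection at the cost of two extra derivative instructions.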


I have to thank MJP again for trying this out. We were quite frustrated by the compilation times of our shaders and would check Microsoft's website for a new SDK version from time to time; however, we wouldn't move to the Win 8 SDK yet, as we'd miss D3DX a lot (and don't care about Win 8 at all, yet :D). The possibility to use the new, fixed fxc alone while compiling and linking against the June 2010 SDK is just awesome and saves us a lot of time and nerves now :-)

I wonder what went bad in their prior version that it's so insanely slow...


[quote name='MJP' timestamp='1337286358' post='4941031']
In later versions of the SDK the shader compiler is entirely hosted in D3DCompiler_xx.dll. That DLL is then used by fxc.exe and the D3DX functions, or it can be used directly.
[/quote]

So does that mean that I could simply replace the existing D3DCompiler_xx.dll in the system32 and/or SysWOW64 folder with the one from the new Win8 SDK in order to speed up shader compilation in any application that doesn't use its own DirectX binaries?


I tried replacing the used D3DCompiler_43.dll with the Win 8 SDK's d3dcompiler_44.dll (renamed it of course), but it didn't speed up anything at all.

Are there maybe any other files that I need to copy over in order to get the new compiler running without fxc.exe? And does the new compiler even speed up DX9 shaders?

EDIT:
Another failed test: I thought I should be able to use at least the attributes such as [loop] with the new compiler, but to my surprise it still doesn't recognize them, complaining about the leading bracket as it always did. How could that be? Is there some compiler directive or switch that must be activated to enable HLSL attributes?

And yes, I confirmed that I replaced the right compiler file, which is definitely used by the game.


They are probably using neither d3dcompiler directly nor a recent version of it (but the d3dx9 functions instead), which is why loop attributes are not recognized. Loop attributes won't help either, though. Did you check whether your shader compiles faster if you comment out all your texture fetches?


[quote name='xoofx' timestamp='1337550278' post='4941735']
They are probably using neither d3dcompiler directly nor a recent version of it (but the d3dx9 functions instead), which is why loop attributes are not recognized. Loop attributes won't help either, though. Did you check whether your shader compiles faster if you comment out all your texture fetches?
[/quote]

Thanks. They ARE using one specific version of d3dcompiler, because the game crashes upon start-up if I delete d3dcompiler_43.dll from the SysWOW64 directory. So my guess was that if I replaced that file with a newer version (e.g. the one from the Win8 SDK) I would benefit from the newer compiler features such as attributes. Doesn't seem to be the case, though.

Anyway, the attributes would have been nice to have, but what I'm really after is the speed increase that the new compiler is said to provide. This is especially because, as you say, texture fetches slow down compilation significantly in my case, but are somewhat difficult to avoid in my scenario. So it would be great to have that super-fast Win8 compiler running, instead of spending vast amounts of time on tiny optimizations that might not even speed up compilation enough to allow me to ship my shaders.


This is a long shot, but what about doing a "hybrid" precompiled version?
Here's an example from someone that's been doing that for 3DS Max for a long time: http://www.poopinmymouth.com/3d/sdk/agusturinn-shader.zip
It basically means you replace the actual vertex- and pixelshaders with compiled asm code, and then feed the variables back in as PixelShaderConstants and the like. (3DS Max requires this or you can't interface with it)

I'm looking into this as well, as I've got a 10-minute compile in 3DS Max, while the (incomplete) hybrid precompiled version loads instantly.

It's a bit annoying to generate; I'm trying to contact the author of that example to see if he has a better way than doing it manually (which is a bunch of mindless copy-paste work).