#1 Members - Reputation: 184
Posted 02 May 2012 - 05:25 AM
The game I'm modding always compiles the HLSL source files on every start-up (and again whenever you load a savegame), so I have no option to pre-compile my shaders. What options do I have to speed up the game's just-in-time compilation?
My shaders are getting fairly complex, and by now they take quite some time to compile on every game start-up, which kind of sucks. I know I could clean up my files: remove comments, empty lines, unused functions and variables, and maybe even simplify functions and control flow to make compilation more straightforward (e.g. avoid some branching). But I'm afraid that wouldn't help much, simply because of the complexity of what I need my shaders to do, so the improvement wouldn't justify the large effort.
So, are there other ways to make compilation faster? For example, any hacked / tweaked DX9 binaries or "injected" optimized compilers that, once installed, are executed automatically instead of the originals when the game asks DirectX to do the compilation? Or some way to nest shader assembly code inside an HLSL file (pixel or vertex shader source), which I could generate offline with the fxc compiler before letting the game compile the rest of the shader sources? Or any other approaches which I might not have thought of yet?
Any thoughts would be appreciated.
#2 Members - Reputation: 288
Posted 02 May 2012 - 07:53 AM
- manually unroll loops (in terms of compilation time this works better than [unroll], [fastopt] or any other compiler hint)
- this is especially true for nested loops!
- the deeper the function call nesting, the worse
- look for redundant texture samples that could be hoisted out of loops or functions; at run time they would just hit the texture cache, but they make the shader take longer to compile
What doesn't help (neither compilation speed nor performance):
- trying to manually optimise ALU operations
I guess most of this will be true for DX9, too.
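As a rough illustration of the manual-unrolling suggestion above, here is a minimal DX9-style HLSL sketch. All names in it (sceneSampler, texelSize, BlurLoop, BlurUnrolled) are made up for the example and are not from the game's shaders; whether the hand-unrolled form actually compiles faster in a given case would need to be measured.

```hlsl
// Hypothetical names, for illustration only.
sampler2D sceneSampler;
float2 texelSize; // assumed to hold (1 / width, 1 / height)

// Loop version: the compiler has to analyse and unroll this itself.
float4 BlurLoop(float2 uv)
{
    float4 sum = 0;
    for (int i = -1; i <= 1; i++)
        sum += tex2D(sceneSampler, uv + float2(i, 0) * texelSize);
    return sum / 3.0;
}

// Manually unrolled version: the same three taps written out by hand.
float4 BlurUnrolled(float2 uv)
{
    float4 sum = tex2D(sceneSampler, uv - float2(texelSize.x, 0));
    sum += tex2D(sceneSampler, uv);
    sum += tex2D(sceneSampler, uv + float2(texelSize.x, 0));
    return sum / 3.0;
}
```

Both functions return the same result; only the amount of work left to the compiler differs.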
Edited by pcmaster, 02 May 2012 - 07:54 AM.
#3 Members - Reputation: 184
Posted 02 May 2012 - 08:49 AM
What about arithmetic stuff? I have the impression that sometimes, when I simply add another formula with one or two additional variables, the compilation slows down significantly. Besides texture lookups, are there operations that are rather expensive and could be replaced by other expressions doing the same job?
And what about using pre-compiled parts in source files? Is there a way to do this (I remember old Pascal times where I had to place assembly code within Pascal code in order to speed up graphical processing)?
#4 Moderators - Reputation: 5424
Posted 02 May 2012 - 12:17 PM
There's no way to inline assembly in HLSL shaders. You can manually assemble a shader from chunks of pre-compiled assembly if you'd like, but that's about it.
#6 Members - Reputation: 184
Posted 06 May 2012 - 12:28 PM
So is there any way to speed up texture sampling / lookups specifically? And why does this type of operation take so much time to compile?
#9 Members - Reputation: 184
Posted 09 May 2012 - 02:01 AM
#10 Members - Reputation: 518
Posted 12 May 2012 - 05:03 PM
Just out of curiosity, why can't you ship precompiled shaders?
As has been said, nested loops with texture sampling can be damn slow to compile, because the compiler tries to calculate correct ddx/ddy. If you can calculate them yourself, or you can afford to sample a single mipmap, the shader will compile more quickly.
#11 Members - Reputation: 184
Posted 16 May 2012 - 02:41 AM
Just by curiosity, why can't you ship precompiled shaders?
Simply because the modded game (STALKER) doesn't support it. The game expects all shaders to be located in a specific sub-folder as pure HLSL source code. Nothing else will work. But please don't ask me why the devs decided not to allow compiled shaders...
As it has been said nested loops with texture sampling can be damn slow, as the compiler is trying to calculate correct ddx/ddy. If you can calculate them yourself or you can afford to sample a single mipmap the shader will compile more quickly.
That sounds interesting and might be helpful. I guess I'd need to use a method other than tex2D, so which one would be required here? And how would I sample a single mipmap?
#12 Members - Reputation: 442
Posted 16 May 2012 - 03:10 AM
This blog post by MJP might come in handy:
http://mynameismjp.wordpress.com/2012/04/13/a-quick-note-on-shader-compilers/
He managed to reduce compile time for a compute shader from 10 minutes to 45 seconds!
#13 Members - Reputation: 184
Posted 16 May 2012 - 04:50 AM
This blog post by MJP might come in handy
http://mynameismjp.w...ader-compilers/
He managed to reduce compile time for a compute shader from 10 minutes to 45 seconds!
Interesting, thanks. However, since I don't have the option to change the compiler in use, I'll have to tweak my code so that it compiles faster with the built-in compiler.
#14 Members - Reputation: 288
Posted 16 May 2012 - 07:35 AM
What sucks is that they're dropping D3DX from SDK 8 :-( That means a lot of rewrite :-(
#15 Moderators - Reputation: 5424
Posted 16 May 2012 - 12:38 PM
Big thanks, the new Windows 8 SDK fxc.exe indeed compiles MUCH faster (130 seconds vs 15 seconds!!!). I just hope the compiled fxo will continue working with existing drivers and Windows 7 / DirectX 11 SDKs and runtimes...
It works fine, they didn't change the shader binary format at all.
#16 Members - Reputation: 184
Posted 17 May 2012 - 12:29 PM
But may we get back on topic, please:
As it has been said nested loops with texture sampling can be damn slow, as the compiler is trying to calculate correct ddx/ddy. If you can calculate them yourself or you can afford to sample a single mipmap the shader will compile more quickly.
That sounds interesting and might be helpful. I guess I'd need to use a method other than tex2D, so which one would be required here? And how would I sample a single mipmap?
So, any suggestions here? Thanks.
#17 Moderators - Reputation: 5424
Posted 17 May 2012 - 02:25 PM
You can specify the mipmap level explicitly with tex2Dlod. You can also use tex2Dgrad if you want to specify the UV gradients instead of a mip level.
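A minimal sketch of both intrinsics inside a sampling loop, assuming a DX9-style shader. The names heightSampler, stepUV, SumHeightsLod and SumHeightsGrad are hypothetical, and the loop count of 8 is arbitrary. As far as I know, tex2Dlod in a pixel shader needs the ps_3_0 profile, so check what the game compiles against.

```hlsl
// Hypothetical names, for illustration only.
sampler2D heightSampler;

// Variant 1: explicit mip level. The mip level goes in the w component
// of the float4 coordinate (0 = most detailed level), so the compiler
// does not need to derive gradients for the samples in the loop.
float SumHeightsLod(float2 uv, float2 stepUV)
{
    float sum = 0;
    for (int i = 0; i < 8; i++)
        sum += tex2Dlod(heightSampler, float4(uv + i * stepUV, 0, 0)).r;
    return sum;
}

// Variant 2: explicit gradients. They are computed once, outside the
// loop, from the unmodified coordinate and reused for every sample.
float SumHeightsGrad(float2 uv, float2 stepUV)
{
    float2 dx = ddx(uv);
    float2 dy = ddy(uv);
    float sum = 0;
    for (int i = 0; i < 8; i++)
        sum += tex2Dgrad(heightSampler, uv + i * stepUV, dx, dy).r;
    return sum;
}
```

The tex2Dlod variant always reads one fixed mip level, which is only acceptable if you can afford to lose mip selection; the tex2Dgrad variant keeps mip selection but bases it on the gradients of the original coordinate.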
#19 Members - Reputation: 288
Posted 18 May 2012 - 06:58 AM
I wonder what went wrong in their prior version that it's so insanely slow...
#20 Members - Reputation: 184
Posted 18 May 2012 - 07:06 AM
In later versions of the SDK the shader compiler is entirely hosted in D3DCompile_xx.dll. That DLL is then used by fxc.exe and the D3DX functions, or it can be used directly.
So does that mean that I could simply replace the existing D3DCompile_xx.dll in the system32 and/or SysWOW64 folder with the one from the new Win8 SDK in order to speed up shader compilation in any application that doesn't use its own DirectX binaries?






