Speed up shader compilation (HLSL)

33 comments, last by Meltac 11 years, 10 months ago
Hi all!

Given the fact that I have no option to pre-compile my shaders in the game I'm modding because it's always compiling the HLSL source files on every start-up (and again whenever you load a savegame) - what options would I have to speed up the game's just-in-time compilation?

I mean, my shaders are getting fairly complex, and by now they take quite some time to compile on every game startup, which kind of sucks. I know I could clean up my files by removing comments, empty lines, and unused functions and variables, and I could even try to simplify functions and control flow to make compilation more straightforward (e.g. by avoiding some branching). But I'm afraid that wouldn't help much, simply because of the complexity of what I need my shaders to do, so the large effort wouldn't be justified by the resulting improvement.

So, are there other ways to make compilation faster? For example, any hacked / tweaked DX9 binaries or "injected" optimized compilers that, once installed, are executed instead of the originals when the game asks the DirectX engine to do the compilation? Or some way to nest shader assembly code inside an HLSL file (pixel or vertex shader source), which I could generate offline with the fxc compiler before letting the game compile the rest of the shader sources? Or any other approaches which I might not have thought of yet?

Any thoughts would be appreciated.
You won't speed anything up by removing comments, dead code, useless vars and such, since the lexical/syntactic analysis isn't slow. What's slow is register allocation and I'm afraid there's usually not much you can do. We're getting into 15-30 minute compile times with our complex DX11 shaders (and we have hundreds) and what sometimes helps (with compile time):
- manually unroll loops (this works better, in terms of compilation time, than using [unroll], [fastopt] or whatever compiler hints)
- this is especially true for nested loops!
- the deeper the called function, the worse
- look for redundant texture sampling that could be hoisted out of loops or functions - at runtime the repeated fetches hit the texture cache, but each one still makes the shader take longer to compile

What doesn't help (neither compilation speed nor performance):
- trying to manually optimise ALU operations

I guess most of this will be true for DX9, too.
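To make the manual-unrolling advice above concrete, here is a minimal sketch of the same four-tap filter written as a loop and then unrolled by hand. The sampler `s_image`, the `kOffsets` table and both function names are hypothetical and only serve as illustration; whether the unrolled form actually compiles faster depends on the compiler version the game uses.

```hlsl
// Hypothetical sampler, assumed to be bound by the host application.
sampler2D s_image;

// Hypothetical tap offsets, in texel units.
static const float2 kOffsets[4] =
{
    float2(-1.0,  0.0), float2(1.0, 0.0),
    float2( 0.0, -1.0), float2(0.0, 1.0)
};

// Loop form: the compiler has to analyse and unroll the loop itself.
float4 blur_loop(float2 uv, float2 texel)
{
    float4 sum = 0;
    for (int i = 0; i < 4; i++)
        sum += tex2D(s_image, uv + kOffsets[i] * texel);
    return sum * 0.25;
}

// Manually unrolled form: same four taps, no loop left to analyse.
float4 blur_unrolled(float2 uv, float2 texel)
{
    float4 sum = tex2D(s_image, uv + float2(-texel.x, 0.0));
    sum += tex2D(s_image, uv + float2( texel.x, 0.0));
    sum += tex2D(s_image, uv + float2(0.0, -texel.y));
    sum += tex2D(s_image, uv + float2(0.0,  texel.y));
    return sum * 0.25;
}
```

Both functions return the same value; only the amount of work left to the compiler differs.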
Thanks, this is already helpful. I always had the impression that heavy texture sampling causes long compilation times, although I don't know why that is.

What about arithmetic stuff? I have the impression that sometimes, when I simply add another formula with one or two additional variables, the compilation slows down significantly. Are there, besides texture lookups, operations that are rather expensive and could be replaced by other expressions doing the same job?

And what about using pre-compiled parts in source files? Is there a way to do this (I remember old Pascal times where I had to place assembly code within Pascal code in order to speed up graphical processing)?
Loop unrolling is usually the big compile-time killer. You can usually avoid it by using the [loop] attribute. Aside from that, multithreading helps if you've got multiple cores to play with; it's pretty trivial to compile one shader per thread.
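As a rough illustration of the [loop] attribute, the sketch below asks the compiler to keep a loop as real flow control instead of unrolling it. The function name and the arithmetic are made up for the example, and it assumes a compiler and shader target that accept HLSL attributes (shader model 3.0 or later), which is not a given for every game's built-in compiler.

```hlsl
// Hypothetical helper: sums 16 octaves of a cheap periodic pattern.
// [loop] requests a real loop rather than 16 unrolled copies of the body.
float layered_pattern(float2 p)
{
    float sum = 0;
    float amp = 0.5;

    [loop]
    for (int i = 0; i < 16; i++)
    {
        sum += amp * sin(p.x) * cos(p.y);
        p   *= 2.0;   // double the frequency for the next octave
        amp *= 0.5;   // halve the amplitude for the next octave
    }
    return sum;
}
```

The body deliberately avoids texture fetches and array indexing, since those bring extra restrictions inside non-unrolled loops on older targets.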

There's no way to inline assembly in HLSL shaders. You can manually assemble a shader from chunks of pre-compiled assembly if you'd like, but that's about it.
Thank you. Unfortunately the compiler being used doesn't support HLSL attributes like [loop], so I fear I'll have to unroll my loops manually in order to prevent the compiler from doing this.
As it turned out, most of the compilation time was caused by the combination of nested function calls and texture sampling calls. The former I've been able to reduce massively, but the latter gives me kind of a headache, because I need a large number of texture samples to achieve the image quality I'd like to have.

So is there any way to speed up texture sampling / lookups specifically? And why does this type of operation take such amounts of time to compile?
Anybody?
I'm afraid there's no way :-) How long are the compilation times we're talking about here?
Worst case I encountered so far is over 3 minutes, whereas the original (un-modded) shaders take less than 5 seconds. Longer compilation times than that (due to yet more complex shader code) cause the game to crash upon start-up. And that duration is caused by one single function within one single shader file.

"Given the fact that I have no option to pre-compile my shaders in the game I'm modding because it's always compiling the HLSL source files on every start-up (and again whenever you load a savegame) - what options would I have to speed up the game's just-in-time compilation?"

Just out of curiosity, why can't you ship precompiled shaders?
As has been said, nested loops with texture sampling can be damn slow, because the compiler is trying to calculate correct ddx/ddy. If you can calculate them yourself, or you can afford to sample a single mipmap level, the shader will compile more quickly.
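The two options mentioned here could look roughly like the sketch below: one variant samples a fixed mip level with tex2Dlod, the other computes the derivatives once outside the loop and passes them to tex2Dgrad. The sampler `s_image`, the constant `kTaps` and the function names are hypothetical, and the code assumes a shader model 3.0 target, which the game's compiler may or may not use.

```hlsl
// Hypothetical sampler, assumed to be bound by the host application.
sampler2D s_image;

// Hypothetical tap count for a ring of samples around uv.
static const int kTaps = 8;

// Variant A: fixed mip level. tex2Dlod takes the LOD in the w component,
// so no derivatives have to be worked out inside the loop.
float4 sample_ring_lod(float2 uv, float2 radius)
{
    float4 sum = 0;
    for (int i = 0; i < kTaps; i++)
    {
        float s, c;
        sincos(i * (6.2831853 / kTaps), s, c);
        sum += tex2Dlod(s_image, float4(uv + float2(c, s) * radius, 0, 0));
    }
    return sum / kTaps;
}

// Variant B: derivatives of uv are computed once, outside the loop,
// and reused for every tap via tex2Dgrad.
float4 sample_ring_grad(float2 uv, float2 radius)
{
    float2 dx = ddx(uv);
    float2 dy = ddy(uv);

    float4 sum = 0;
    for (int i = 0; i < kTaps; i++)
    {
        float s, c;
        sincos(i * (6.2831853 / kTaps), s, c);
        sum += tex2Dgrad(s_image, uv + float2(c, s) * radius, dx, dy);
    }
    return sum / kTaps;
}
```

Variant A always reads the top mip level, so it can alias on minified textures; variant B keeps mip selection but uses the same derivatives for every tap, which is only an approximation when the offsets are large.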
