So in some places I would like to avoid switching shaders, and I see that a static bool (constant for a shader execution, but different on a per-draw-call basis) costs 1 instruction according to the docs. Is it cheaper than switching shaders, and is it cheaper than dynamic branching?
Everything to do with performance characteristics is implementation defined, so you'll have to profile to get answers ;) but...
There are two main implementation options the driver could use:
1) It internally performs a shader switch for you. Basically, it takes your supplied shader code, finds all the permutations based on the static branches, and internally creates one compiled shader program for each permutation. Before each draw call, it checks the values of the 16 booleans to pick the appropriate shader code.
In this case, it's the same as if you implemented your own shader permutation system. Switching shaders is basically free, as long as the previous draw-call covered a few hundred pixels.
2) It leaves the branch in there, performing it per-pixel.
In this case, you're probably going to burn a bunch of cycles per pixel in exchange for the convenience of not having to switch shaders. It will likely be faster than a dynamic branch (e.g. branching on the results of some float computations) by a good amount -- e.g. if a dynamic branch instruction takes a dozen cycles to complete, a static branch instruction might take half a dozen cycles...
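To make option 1 a bit more concrete, here is a rough C++ sketch of what a permutation lookup could look like, whether it lives in the driver or in your own engine. Everything in it is made up for illustration: CompiledShader, makePermutationKey and PermutationCache are hypothetical names, not part of D3D or any driver, and the "compile" step is just a placeholder. The idea is only that the 16 booleans pack into a bitmask, and that bitmask selects a precompiled variant before each draw call.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a compiled shader program handle.
struct CompiledShader {
    std::string debugName;
};

// Pack the 16 static-branch booleans into a single bitmask key.
// Bit i of the key is set when flags[i] is true.
std::uint16_t makePermutationKey(const std::array<bool, 16>& flags) {
    std::uint16_t key = 0;
    for (std::size_t i = 0; i < flags.size(); ++i) {
        if (flags[i]) {
            key = static_cast<std::uint16_t>(key | (1u << i));
        }
    }
    return key;
}

// Hypothetical cache: one compiled variant per distinct key.
class PermutationCache {
public:
    // Returns the variant for this key, creating it on first use.
    // A real system would compile the shader with the branches resolved;
    // here we only record a name so the sketch stays self-contained.
    const CompiledShader& get(std::uint16_t key) {
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            ++compileCount_;
            CompiledShader shader{"variant_" + std::to_string(key)};
            it = cache_.emplace(key, shader).first;
        }
        return it->second;
    }

    // Number of variants created so far.
    int compileCount() const { return compileCount_; }

private:
    std::unordered_map<std::uint16_t, CompiledShader> cache_;
    int compileCount_ = 0;
};
```

Before each draw call you would build the key from the current boolean values and bind whatever get() returns. Repeated draws with the same flags hit the cache, so the per-draw cost is a hash lookup rather than a compile.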
I just checked, and according to the documentation, saturate as a modifier is only available from shader model 4, while I am using shader model 3: http://msdn.microsoft.com/en-us/library/windows/desktop/hh447231(v=vs.85).aspx
The saturate function in HLSL has been around since shader model 1.
That's really weird, because if I look at the asm output that I'm getting from my compiled SM3 code, it does include instructions like mul_sat (which aren't listed on the MSDN instruction reference for SM3...).
The MSDN also shows that the _sat modifier did exist in SM1...
[edit] The ps_2/ps_3 modifiers are documented here (and for vs_3 here). Mystery solved. That page that says that the modifier is only available in SM4+ is just wrong :/
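For anyone unsure what the modifier actually does: saturate clamps a value to the [0, 1] range, and an instruction like mul_sat applies that clamp to the result of the multiply. Below is a small CPU-side C++ sketch of the arithmetic. The function names are my own, and it ignores NaN handling, which may differ on real hardware.

```cpp
#include <algorithm>

// CPU-side equivalent of HLSL saturate(x) for ordinary (non-NaN) inputs:
// clamp the value to the [0, 1] range.
inline float saturate(float x) {
    return std::min(std::max(x, 0.0f), 1.0f);
}

// What a saturated multiply such as mul_sat computes: the product,
// clamped to [0, 1]. mulSat is a hypothetical name for this sketch.
inline float mulSat(float a, float b) {
    return saturate(a * b);
}
```

In the shader itself you just write saturate(a * b), and the compiler is free to fold the clamp into the instruction as a modifier instead of emitting separate min/max instructions.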