Oogst

Boolean operations in shader assembly


What is the best way to do boolean operations in shader assembly for shader model 3.0 in DirectX? So far the only way I have seen is cleverly using floats to represent them. Is there something easier, more efficient, or more direct?
 
For example, if I want to do something like this:
 
if (a < b && (c < d || e < f))
{
    ...
}

I could break this down to roughly this (slightly simplified from assembly):
 
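if_lt a, b
{
    sub t, c, d       // t = c - d
    cmp t, t, 0, 1    // t = (t >= 0) ? 0 : 1, so t = 1 when c < d
    sub s, e, f       // s = e - f
    cmp s, s, 0, 1    // s = 1 when e < f
    add t, t, s       // t >= 1 when at least one comparison holds (OR)
    if_gt t, 0.5
    {
        ...
    }
}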

Is there some better way to do this, or is this just how I am supposed to implement this?
 
(By the way, the reason I have recently started learning shader assembly is that I have a shader for which both Cg and HLSL require too many temporaries to run, while I know it can be done with fewer.)
Edited by Oogst


So far the only way I have seen is cleverly using floats to represent them. Is there something easier, more efficient, or more direct?

If they're dynamic branches (decisions that have to be made per-pixel/vertex), then yeah, you have to use float because there are no integer/bool registers in SM3.
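To make the float-as-boolean idea concrete, here is a rough CPU-side model in Python. It is only a sketch: the helper names (`cmp_ge0`, `lt`, `f_and`, `f_or`, `f_not`, `condition`) are made up for this example, and the only thing borrowed from SM3 is the selection rule of `cmp` (take the first value when the source is >= 0, otherwise the second). True is encoded as 1.0 and false as 0.0, so AND becomes a multiply, OR becomes a clamped add, and NOT becomes 1 - x.

```python
def cmp_ge0(src, if_ge, if_lt):
    """Model of the SM3 'cmp' rule: pick if_ge when src >= 0, else if_lt."""
    return if_ge if src >= 0.0 else if_lt


def lt(x, y):
    """1.0 when x < y, else 0.0 (a 'sub' followed by a 'cmp')."""
    return cmp_ge0(x - y, 0.0, 1.0)


def f_and(p, q):
    """AND of two 0/1 floats: the product is 1.0 only when both are 1.0."""
    return p * q


def f_or(p, q):
    """OR of two 0/1 floats: add, then clamp so the result stays 0 or 1."""
    return min(p + q, 1.0)


def f_not(p):
    """NOT of a 0/1 float."""
    return 1.0 - p


def condition(a, b, c, d, e, f):
    """Float version of: a < b && (c < d || e < f)."""
    return f_and(lt(a, b), f_or(lt(c, d), lt(e, f))) > 0.5
```

With this encoding the whole condition collapses into one float that only needs to be tested once at the end.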

 

The best way to deal with branching is to avoid doing branching.

 

In your example, you end up using two different branches (one nested in the other). It would likely be much better to just use a single branch, due to how expensive they are on SM3 hardware...

 


By the way, the reason I have recently started learning shader assembly is that I have a shader for which both Cg and HLSL require too many temporaries to run, while I know it can be done with fewer.

Does the compiler fail due to this, or the runtime?

 

Usually you can massage your HLSL code to produce better asm, rather than writing the asm yourself (e.g. by using the HLSL attributes such as branch, flatten, fastopt, forcecase, call, unroll, loop, isolate, or by writing more asm-like code).

e.g. to tell the compiler to use a series of cmp instructions and arithmetic for that branch, you could try something like:

[branch]
if( 0 < step(a,b) * (step(c,d) + step(e,f)) )
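One detail worth checking before relying on the step() form: HLSL's step(y, x) returns 1 when x >= y, so it tests "less than or equal" rather than strictly "less than". The Python sketch below models that on the CPU. The function names `step_condition` and `original_condition` are invented for this illustration, and `step` is only a model of the intrinsic, not the real thing.

```python
def step(y, x):
    """Model of HLSL step(y, x): 1.0 when x >= y, else 0.0."""
    return 1.0 if x >= y else 0.0


def step_condition(a, b, c, d, e, f):
    """Model of: 0 < step(a,b) * (step(c,d) + step(e,f))."""
    return 0.0 < step(a, b) * (step(c, d) + step(e, f))


def original_condition(a, b, c, d, e, f):
    """The condition from the first post: a < b && (c < d || e < f)."""
    return a < b and (c < d or e < f)


# The two agree when no pair of operands is equal...
print(step_condition(0, 1, 0, 1, 5, 4), original_condition(0, 1, 0, 1, 5, 4))
# ...but differ when a == b, because step() treats equality as true.
print(step_condition(1, 1, 0, 1, 5, 4), original_condition(1, 1, 0, 1, 5, 4))
```

If the strict comparison matters, swapping the arguments and inverting (1 - step(b, a)) gives a < b in this model.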
Edited by Hodgman

Yeah I meant actual dynamic variables in the shader functions (i.e. non-uniform ones).
You can have 16 'constant' (uniform) bools and 16 ints, but these types don't really exist at runtime: there are no instructions to operate on them. If you want to work with them, you'll be copying the results into temporary float registers.

As Washu mentions above, the 16 bool constants are generally used as a 16-bit mask that controls static branching.
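As a loose illustration of the mask idea, the Python sketch below packs per-draw-call feature flags into a 16-bit integer and tests individual bits. Everything here is hypothetical: the helper names, the example flags, and the choice that list index 0 maps to bit 0 are assumptions made for the sketch, not something the D3D API prescribes.

```python
NUM_BOOL_CONSTANTS = 16  # SM3 exposes 16 bool constants, as noted above


def pack_flags(flags):
    """Pack up to 16 booleans into an integer mask (index 0 becomes bit 0)."""
    if len(flags) > NUM_BOOL_CONSTANTS:
        raise ValueError("only 16 bool constants are available")
    mask = 0
    for index, flag in enumerate(flags):
        if flag:
            mask |= 1 << index
    return mask


def is_set(mask, index):
    """True when the flag at 'index' is enabled for this draw call."""
    return ((mask >> index) & 1) == 1


# Hypothetical flags for one draw call: [use_fog, use_shadows, use_specular]
mask = pack_flags([True, False, True])
print(bin(mask), is_set(mask, 0), is_set(mask, 1), is_set(mask, 2))
```

Because the mask is fixed for the whole draw call, every pixel takes the same path, which is what separates static branching from the per-pixel case.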


Ok. If you don't mind, I would like to reuse this topic for some questions.

1. In some places I would like to avoid switching shaders. I see that a static bool (constant during shader execution, but different on a per-draw-call basis) costs 1 instruction, according to the docs. Is that cheaper than switching shaders, and is it cheaper than dynamic branching?

2. I have this situation with parallel-split shadow maps. Which of these is cheaper:

this:

if(posVS.z > SplitDist.x && posVS.z <= SplitDist.y) // if in split range
{ ...lots_of_calculus() }

 

or this:

clip(posVS.z - SplitDist.x);
clip(SplitDist.y - posVS.z);
...lots_of_calculus()
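Leaving cost aside, the two versions do not behave identically, which the CPU-side Python sketch below tries to show. It is a model under stated assumptions: `clip` mirrors the HLSL rule of discarding when the value is less than zero, the function names and the "shaded" / "unshaded" / "discarded" labels are invented for this example, and the branch version is assumed to still write the pixel when the condition is false.

```python
def clip(x):
    """Model of HLSL clip(x): returns True when the pixel would be discarded."""
    return x < 0.0


def branch_version(z, split_near, split_far):
    """if (z > near && z <= far) { calculation }: the pixel is always kept."""
    if z > split_near and z <= split_far:
        return "shaded"
    return "unshaded"  # assumed: pixel is still written, without the calculation


def clip_version(z, split_near, split_far):
    """clip(z - near); clip(far - z); calculation: out-of-range pixels vanish."""
    if clip(z - split_near) or clip(split_far - z):
        return "discarded"
    return "shaded"


for z in (5.0, 20.0, 0.0):
    print(z, branch_version(z, 0.0, 10.0), clip_version(z, 0.0, 10.0))
```

Two differences show up in this model: out-of-range pixels are discarded rather than left unshaded, and a pixel exactly at SplitDist.x passes the clip test but fails the strict `>` comparison.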

Thanks folks, I'll just work with the floats then!
 

1. In some places I would like to avoid switching shaders. I see that a static bool (constant during shader execution, but different on a per-draw-call basis) costs 1 instruction, according to the docs. Is that cheaper than switching shaders, and is it cheaper than dynamic branching?

This cannot be answered in the general case, because the shader operates per pixel (if it is a pixel shader), and the shader switching is per object. It is impossible to compare the two in anything but a specific situation, because the scene might be a few really large objects (tons of pixels, few shader switches) or a ton of small objects (few pixels, lots of shader switches).

Also, how long shader switching takes depends on the CPU, GPU, DirectX version and drivers...
Edited by Oogst

While we are on the topic of shader assembly: is there a shader instruction equivalent to the Cg/HLSL function saturate(x)? This function limits a value to the range [0, 1]. Since it is an intrinsic function in Cg and HLSL, I expected to see it in shader assembly as well, but I couldn't find it...


Saturate is supported as a modifier for certain instructions. So if you do something like x = saturate(y * z), you'll end up with a mul_sat instruction in your assembly.

This actually maps to how GPUs implement saturate in their native microcode.
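For anyone who wants to see the effect of the modifier spelled out, here is a small CPU-side Python model. It is only a sketch: `saturate`, `mul` and `mul_sat` are plain Python stand-ins named after the HLSL intrinsic and the asm instructions, and special values such as NaN are not considered.

```python
def saturate(x):
    """Clamp x to the range [0, 1], like the Cg/HLSL intrinsic."""
    return min(max(x, 0.0), 1.0)


def mul(y, z):
    """Plain multiply, standing in for the 'mul' instruction."""
    return y * z


def mul_sat(y, z):
    """Multiply with the saturate modifier: the clamp is part of the write."""
    return saturate(mul(y, z))


print(mul_sat(2.0, 0.75))   # product 1.5 is clamped down to the upper bound
print(mul_sat(-1.0, 0.5))   # product -0.5 is clamped up to the lower bound
print(mul_sat(0.5, 0.5))    # product 0.25 is already in range
```

The point of the model is that the clamp does not need a separate instruction; it rides along with the result of whichever instruction carries the modifier.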

Edited by MJP
