Fast exp2() function in shader

2 comments, last by MJP 10 years, 10 months ago

I'm writing a fairly extensive pixel shader, using D3D11 and targeting shader model 4. As it stands I mix floats and ints a fair amount. I'm assuming, though I haven't been able to find any definitive information, that casting between int and float in a pixel shader incurs some performance overhead. I could simplify a lot of the code by changing it all to floats, but in a few places I need to do a fast 2^x multiply or divide, which with ints can be performed with a bit shift.

Is there any way to do a fast 2^x multiply with floats? Alternatively, I was under the impression that with older HLSL targets ints were just emulated using the mantissa portion of a float. Perhaps that's still the case, and casting between float and int is simply a signal to the compiler that doesn't actually require an op?

I'd like to write both versions and give it a try, but it's fairly extensive and at the moment I don't have the time. Given the circumstances, what would you choose?

exp2
http://msdn.microsoft.com/en-us/library/windows/desktop/bb509596%28v=vs.85%29.aspx

The only way to find out the answer to your question is to test performance. You can also get an idea by looking at the token assembly of your shader.

Also, integer operations were emulated with floats in DirectX 9, but DirectX 10 requires full integer support. I don't see how they could simulate that with a float, and if they did you would notice a huge performance impact.
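
For anyone wondering what the float-side equivalent of the shift looks like, here is a minimal CPU-side sketch in C (not shader code). The helper names `mul_pow2_int` and `mul_pow2_float` are made up for illustration. In HLSL the float path would be written as `x * exp2(n)` or `ldexp(x, n)`; C's `ldexpf` stands in for those here.

```c
#include <math.h>
#include <stdint.h>

/* Integer path: multiply by 2^n with a left shift.
   Assumes n is small enough that the result does not overflow. */
static uint32_t mul_pow2_int(uint32_t v, unsigned n)
{
    return v << n;
}

/* Float path: multiply by 2^n. A negative n divides instead.
   The result is exact as long as it does not overflow or underflow. */
static float mul_pow2_float(float x, int n)
{
    return ldexpf(x, n);
}
```

Calling `mul_pow2_float(5.0f, 3)` gives the same value as `mul_pow2_int(5u, 3)`, and `mul_pow2_float(40.0f, -3)` covers the divide case without any int/float conversion of the data itself.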

Seconded. Just use exp2 for now and worry about performance later. Unless you've got profiling information to back up your "I need a faster exp2" desire, you're indulging in premature optimization.

Most modern hardware (even quite low-end stuff) will crunch through ALU operations and come back looking for more; you may find it very difficult to slow things down. ALU can even be free if the compiler determines that it can overlap ALU with texture lookups (and sometimes even just a simple code re-ordering can be enough). I doubt if any home-grown replacement can outperform the built-in instruction.

Almost all GPUs have a native exp2 instruction in their ALUs, so you're not going to make a faster version on your own. Converting to integer often does have a performance cost, and on most GPUs integer instructions run at 1/2 or 1/4 rate, which means it's unlikely you'll get better performance with bit shifts. You'll have to check the available docs on the various architectures to find out the specifics.
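
To make the "home-grown version" concrete, here is a rough CPU-side C sketch of the usual hand-rolled trick: adding n directly to the IEEE-754 exponent field. The name `scale_pow2_bits` is hypothetical, and the `memcpy` calls play the role that `asuint()`/`asfloat()` would play in HLSL. It is only an illustration of the idea under the stated assumptions, not a recommendation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical hand-rolled 2^n multiply: add n to the exponent bits.
   Only valid when x is a normal, finite, nonzero float and the result
   also stays in the normal range. Zero, denormals, infinities, NaNs
   and exponent overflow/underflow are NOT handled. */
static float scale_pow2_bits(float x, int n)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret float as integer bits */
    bits += (uint32_t)n << 23;        /* exponent field starts at bit 23 */
    memcpy(&x, &bits, sizeof x);      /* reinterpret the bits as a float */
    return x;
}
```

Note that this still needs an integer shift and add, which is exactly the kind of instruction described above as often running at reduced rate, and it silently gives wrong answers for inputs like 0.0f. Both points are in line with the advice to stick with the built-in.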
