fast float number multiplication by 2 and 0.5

71 comments, last by DigitalDelusion 19 years, 9 months ago
Does anybody know a fast way (asm code) to multiply a 32-bit float by 2 or 0.5, or by any integer power of 2? Is there a special asm instruction that does this, and does the compiler use it in release mode? Would shifting the mantissa work? Or is it not worth bothering with?
I always thought shifting it was the fastest way, but I also thought that compilers can automatically generate a binary shift when they encounter those numbers (hard coded) in code. So if you write:

number *= 2;

it generates:
number <<= 1;

and the equivalent asm/fpu stuff ( I don't remember the fpu command for binary shift)

Why don't alcoholics make good calculus teachers? Because they don't know their limits! Oh come on, Newton wasn't THAT smart...
@shadow12345 : Shifts only work on (unsigned?) integers and not on floats as he wanted.

@szinkopa : I doubt there is an instruction for this. You might get away with some "hacking" on FP exponent (+1 for *2 and -1 for /2) but I highly doubt it's worth it. And as always: don't optimize until profiler tells you to. :)
You should never let your fears become the boundaries of your dreams.
To multiply by 2 you could always just do something like x += x so the FPU doesn't have to use its multiply instruction, but chances are the multiply is already fast on-chip for cases like this, or the compiler will make that transformation for you, or both.
Ra
Quote:Original post by _DarkWIng_

@shadow12345 : Shifts only work on (unsigned?) integers and not on floats as he wanted.

@szinkopa : I doubt there is an instruction for this. You might get away with some "hacking" on FP exponent (+1 for *2 and -1 for /2) but I highly doubt it's worth it. And as always: don't optimize until profiler tells you to. :)


That's not true. You just have to do some checking and error correction after the bitshift. Of course this kills your speed gains, but a creative programmer can find ways around the issue.

I agree, though: if it's not a problem, don't optimize it.
<a href="http://ruggles.hopto.org">My Personal Website</a>
Quote:Original post by ChaosX2
Thats not true. You just have to do some checking and error correcting after doing the bitshift. Of course this kills your speed gains, but a creative programmer can find ways to work around this issue.

Can you give me an example of how you would do *2 with shifts on floating-point numbers? I've never seen that. Or were you talking about signed integers? If the latter, then I know it can be done; I'm just saying I doubt it's worth the trouble.
You should never let your fears become the boundaries of your dreams.
OK thanks.

Yes, multiplying by 2 is easy and fast: x += x. The reason I posted is that I have a lot of .../2 or ...*0.5, and I thought there might be a way to do it faster with an inline function call like z = Half(x+y); or so. But it's not so crucial.
Once again the IEEE 754 standard might help. If your float is already in memory, you can improve speed: integer addition usually has lower latency than floating-point multiplication.

A simple integer addition on the exponent field is all it takes to multiply a float in memory by 2 (or 2^n), or divide it by 2 (or 2^n):

float x;
//...
// Dividing by 2: just reduce the exponent by one.
// (The parentheses matter: << binds looser than -, so 1L<<23 must be grouped.)
*(int*)&x = (*(int*)&x) - (1L << 23);

EDIT:
- Note that shifting the mantissa would not work.
- Such a trick could be used to replace a 3DNow mul by a faster MMX add.
"Coding math tricks in asm is more fun than Java"
Quote:Original post by Charles B
float x;
//...
// Dividing by 2: just reduce the exponent by one.
*(int*)&x = (*(int*)&x) - (1L << 23);

Impressive!
You should never let your fears become the boundaries of your dreams.
One of the thousand math hacks I know ;) But really, all you need here is to understand the IEEE 754 format. There is enough on the web to find numerous ideas, such as a quick fabs, etc... Although beware of such tricks: CPUs do not handle store-to-load forwarding well when the store (fstp) and the load (mov) are not of the same type. Prefer normal floating-point code in general; today's FPUs are very fast.

So such advice is mostly relevant if you process float data already stored in memory, an array of vertices for instance.
"Coding math tricks in asm is more fun than Java"
