Archived

This topic is now archived and is closed to further replies.

using bit shifting to divide.

This topic is 5890 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I nead a really quick function to multiply and divide unsigned integers with 48. for multiplying with 48 i use. x= n<<5 + n<<4 I know it is easy to make a division for all numbers that are powers of 2, but i don''t know how to deal with a number like 48. I use bitshift because they used to be so much faster compared to divisions. Is this still the case with the current processors and advanced compilers?

Share this post


Link to post
Share on other sites
Shifts will either be faster or the same speed as multiplication and division, but never slower; so it''s a good idea to use shifts where possible.

Unfortunately, you can''t split division up in this way.

If you''re using fixed point math, then you could multiply the value by fixed (1/48). With 24 fractional bits, it''s accurate to the 5th decimal place.

As (bad) luck would have it, 1/48 in fixed is 10101010... The shift-equivalent of the multiplication in 8/24 fixed is "x + x << 2 + x << 4 + x << 6 + x << 8 + x << 10 + x << 12 + x << 14 + x << 16 + x << 18", which is unlikely to be faster than using the multiplication.

''Nuff said. I''ll enjoy watching you live, demon.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
quote:

Shifts will either be faster or the same speed as multiplication and division, but never slower; so it''s a good idea to use shifts where possible.



What if you had a FPU that did a million FLOPs in one clock cycle?

Share this post


Link to post
Share on other sites
If you are dividing by a constant a decent compiler will recognise this and turn it into a multiply if it works out faster to do it this way, and on all processors I know it is a lot faster. For fixed point it will also add the correct shift.

E.g. for

a2 = a0 / 48;

MIPS does something like (not sure of opcode mnemonics):

LOA a1, 0x05555555
MUL a0, a1
MOV a2, hi

Where the 0x05555555 is 2^32 / 48 and the final mov from hi copies the result from the high 32 bits of the 64 bit result (implicity shifting by 32 bits). This is as fast as multiplying and on all MIPS processors I''ve used far faster than integer/fixed divide.


Share this post


Link to post
Share on other sites
One problem with a discussion such as this is that it is basically outdated. It is based upon processors of ten years ago and not on a modern processor. Take the following for example:

  
for (i=0,j=0; (i < 1000000000) & (j < 1000000000); i++, j++);
//versus

for (i=0; i < 1000000000; i++);


Which of those two lines is fastest? Ten years ago answering that would be a no brainer. If you try it with a Pentium III or IV you might be surprised by the answer. If you read the Intel optimization manual you won''t see jack about shift versus multiply. The reason is that other things matter more. You will see extensive discussions about MMX and SIMD instructions, instruction pairing, branch prediction, cache utilization, etc. That is because those are the things that matter in a modern processor.

I don''t say this just from a perspective of theory. Rather I ran some tests on a 1.3ghz Pentium 4. All I can say from the results is that other things matter more. Some tests showed that shift and multiply take the same length of time, some showed shift faster and some showed multiply faster. I even had a test that showed that four shifts and three adds was faster than either a multiply or a single shift. Just to add to the entertainment value I found that doing nothing took longer than doing something, i.e. the loop took longer to simply copy the input to the output than to shift or multiply the input to produce the output. So what can I definitively say from about 40 scenerioes I tested? I can absolutely, positively, without any doubt whatsoever say that it depends. Specifically it depends on the loop and the rest of the code in the loop more than it depends upon which one you use.

Share this post


Link to post
Share on other sites