Archived

This topic is now archived and is closed to further replies.

division through a constant

This topic is 5047 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Is it possible to optimize a division through a non-power of two, for example 48, with bitshifts? If so, will a compiler like Mingw do this automaticly with constants? [edited by - Boops on February 20, 2004 4:33:18 PM]

Share this post


Link to post
Share on other sites
I''m using pure integer math, can''t multiply by 1/48

Well it''s not really the biggest problem, but anything can help since it''s inside a big double loop and it''s the only division, would be cool if I could get rid of it.

Share this post


Link to post
Share on other sites
Bitshifts won''t get you around the fundamental problem of needing to divide by some factor of 3.

Don''t worry about it too much. Especially if you haven''t profiled yet.


"Sneftel is correct, if rather vulgar." --Flarelocke

Share this post


Link to post
Share on other sites
just convert to base 3, then you can bitshift. If your conversion function is really fast, it''s probably faster to convert and shift than to take the hit of a DIV. Especially if you optimize it with lookup tables and ASM.

Share this post


Link to post
Share on other sites
quote:
Original post by sjelkjd
just convert to base 3, then you can bitshift. If your conversion function is really fast, it''s probably faster to convert and shift than to take the hit of a DIV. Especially if you optimize it with lookup tables and ASM.
Doesn''t converting bases involve modulus and division both?

Share this post


Link to post
Share on other sites
Converting an integer divide into a series of adds and bitshifts is possible, but is usually fairly pointless (if not worse than pointless). In order to go about it systematically you need to consider the analagous situation for converting multiplications to bitshifts. For the example situation we have X / 48. With the rules of mathematics it''s basically the same as X * (1 / 48). So then we go through this tedious reduction:

X * (1 / 48) =
X * ((1 / 64) + (1 / 192) =
X * ((1 / 64) + (1 / 256) + (1 / 768)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 3072)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 12288)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 49152)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 196608)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 786432)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 3145728)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 12582912)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 50331648)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 201326592)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 268435456) + (1 / 805306368)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 268435456) + (1 / 1073741824) + (1 / 3221225472)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 268435456) + (1 / 1073741824) + (1 / 3221225472)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 268435456) + (1 / 1073741824) + (1 / 4294967296) + (1 / 12884901888)) =
(X / 64) + (X / 256) + (X / 1024) + (X / 4096) + (X / 16384) + (X / 65536) + (X / 262144) + (X / 1048576) + (X / 4194304) + (X / 16777216) + (X / 67108864) + (X / 268435456) + (X / 1073741824) + (X / 4294967296) + (X / 12884901888)) =
(X >> 6) + (X >> 8) + (X >> 10) + (X >> 12) + (X >> 14) + (X >> 16) + (X >> 18) + (X >> 20) + (X >> 22) + (X >> 24) + (X >> 26) + (X >> 28) + (X >> 30) + (X >> 32) + (X / 12884901888)) =
(X >> 6) + (X >> 8) + (X >> 10) + (X >> 12) + (X >> 14) + (X >> 16) + (X >> 18) + (X >> 20) + (X >> 22) + (X >> 24) + (X >> 26) + (X >> 28) + (X >> 30)

(The last terms get discarded on 32 bit systems because division by such small fractions will always equal 0) So trying to replace a division by 48 with bitshifts and adds will create 13 shifts and 12 additions. Double that for 64 bit systems. This seems very unlikely to be done by the processor more quickly than a division instruction and very likely to cause code bloat.

Share this post


Link to post
Share on other sites
quote:
Original post by Extrarius
quote:
Original post by sjelkjd
just convert to base 3, then you can bitshift. If your conversion function is really fast, it''s probably faster to convert and shift than to take the hit of a DIV. Especially if you optimize it with lookup tables and ASM.
Doesn''t converting bases involve modulus and division both?


SHHHHH!!!!!!!!!!!!

Share this post


Link to post
Share on other sites
quote:
Original post by SiCrane
So then we go through this tedious reduction...



ROFLMAO! Did you use Maple/Mathematica/Derive? Or did you
do it by hand? Typing it all out...



Kami no Itte ga ore ni zettai naru!

Share this post


Link to post
Share on other sites
That was by hand. Though the infinite series of powers of two that makes up 1/3 is very well behaved, so it was a lot easier to do than it looks.

Share this post


Link to post
Share on other sites
What is the range of values for X? If they're not too large, you can use SiCrane's great code, just use the first few shifts and discard the rest. His algorithm begins:

(X >> 6) + (X >> 8) + (X >> 10) + (X >> 12)

If X is always less than 4096, X >> 12 will always be zero, as will all the terms to the right of it in the original algorithm. Therefore you can just use:

(X >> 6) + (X >> 8) + (X >> 10)

And get perfectly accurate results every time. And that may be faster than doing the division. Profile it.

EDIT: Nevermind, SiCrane is wrong. He forgot he's dealing with integer math. For example, take 48*7 = 336. Passed through his algorithm:

(336/64) + (336/256) + (336/1024) ...

Taken as integers, you get 5+1+0 = 6. Taken as floating-point numbers, you get 5.25 + 1.3125 + 0.328125 = 6.89, and doing the other divisions/additions will bring it to exactly 7.

~CGameProgrammer( );

Screenshots of your games or desktop captures -- Post screenshots of your projects. There's already 134 screenshot posts.

[edited by - CGameProgrammer on February 21, 2004 8:01:25 AM]

Share this post


Link to post
Share on other sites
No, I remembered that this was integer math. It''s just that it''s easier to explain the derivation of the broken formula than it is to explain the derivation of the actual correct bitshift division method. And the broken formula is sufficient to explain why it''s a bad idea.

If you want an actual correct division by 48 method using shifts, here it is (for 32 bit unsigned ints):

(((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - 0) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1) >> 4)

Share this post


Link to post
Share on other sites
> I''m using pure integer math, can''t multiply by 1/48
Oh, but you can! Use fixed-point arithmetic. Derivation:
quote:
Problem: calculate a/b. If we can''t shift or divide, we have to multiply a by (1/b). First thought: let r = c/b, then q = (r * a) / c (c > a,b). Unfortunately, q <= a/b, due to truncation during the divisions. We can avoid this by ''padding'' the numerator: r = ((c+b-1)/b). Finally, choose c = 0x100000000, and we can divide by c for free - the "mul r" instruction places the upper 32 bits of the product into edx, which is then equal to a/b.

(from a previous thread)

Share this post


Link to post
Share on other sites