#### Archived

This topic is now archived and is closed to further replies.

# division through a constant

This topic is 5047 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Is it possible to optimize a division through a non-power of two, for example 48, with bitshifts? If so, will a compiler like Mingw do this automaticly with constants? [edited by - Boops on February 20, 2004 4:33:18 PM]

##### Share on other sites
Have you profiled to see if its even a problem? Unless you''re programming something embedded, it shouldn''t be.

##### Share on other sites
Generally you want multiplication, not division. So you
always divide by 48? Turn it into multiply by 1/48.

Kami no Itte ga ore ni zettai naru!

##### Share on other sites
I''m using pure integer math, can''t multiply by 1/48

Well it''s not really the biggest problem, but anything can help since it''s inside a big double loop and it''s the only division, would be cool if I could get rid of it.

##### Share on other sites
Bitshifts won''t get you around the fundamental problem of needing to divide by some factor of 3.

Don''t worry about it too much. Especially if you haven''t profiled yet.

"Sneftel is correct, if rather vulgar." --Flarelocke

##### Share on other sites
just convert to base 3, then you can bitshift. If your conversion function is really fast, it''s probably faster to convert and shift than to take the hit of a DIV. Especially if you optimize it with lookup tables and ASM.

##### Share on other sites
quote:
Original post by sjelkjd
just convert to base 3, then you can bitshift. If your conversion function is really fast, it''s probably faster to convert and shift than to take the hit of a DIV. Especially if you optimize it with lookup tables and ASM.
Doesn''t converting bases involve modulus and division both?

##### Share on other sites
Converting an integer divide into a series of adds and bitshifts is possible, but is usually fairly pointless (if not worse than pointless). In order to go about it systematically you need to consider the analagous situation for converting multiplications to bitshifts. For the example situation we have X / 48. With the rules of mathematics it''s basically the same as X * (1 / 48). So then we go through this tedious reduction:

X * (1 / 48) =
X * ((1 / 64) + (1 / 192) =
X * ((1 / 64) + (1 / 256) + (1 / 768)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 3072)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 12288)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 49152)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 196608)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 786432)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 3145728)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 12582912)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 50331648)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 201326592)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 268435456) + (1 / 805306368)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 268435456) + (1 / 1073741824) + (1 / 3221225472)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 268435456) + (1 / 1073741824) + (1 / 3221225472)) =
X * ((1 / 64) + (1 / 256) + (1 / 1024) + (1 / 4096) + (1 / 16384) + (1 / 65536) + (1 / 262144) + (1 / 1048576) + (1 / 4194304) + (1 / 16777216) + (1 / 67108864) + (1 / 268435456) + (1 / 1073741824) + (1 / 4294967296) + (1 / 12884901888)) =
(X / 64) + (X / 256) + (X / 1024) + (X / 4096) + (X / 16384) + (X / 65536) + (X / 262144) + (X / 1048576) + (X / 4194304) + (X / 16777216) + (X / 67108864) + (X / 268435456) + (X / 1073741824) + (X / 4294967296) + (X / 12884901888)) =
(X >> 6) + (X >> 8) + (X >> 10) + (X >> 12) + (X >> 14) + (X >> 16) + (X >> 18) + (X >> 20) + (X >> 22) + (X >> 24) + (X >> 26) + (X >> 28) + (X >> 30) + (X >> 32) + (X / 12884901888)) =
(X >> 6) + (X >> 8) + (X >> 10) + (X >> 12) + (X >> 14) + (X >> 16) + (X >> 18) + (X >> 20) + (X >> 22) + (X >> 24) + (X >> 26) + (X >> 28) + (X >> 30)

(The last terms get discarded on 32 bit systems because division by such small fractions will always equal 0) So trying to replace a division by 48 with bitshifts and adds will create 13 shifts and 12 additions. Double that for 64 bit systems. This seems very unlikely to be done by the processor more quickly than a division instruction and very likely to cause code bloat.

##### Share on other sites
quote:
Original post by Extrarius
quote:
Original post by sjelkjd
just convert to base 3, then you can bitshift. If your conversion function is really fast, it''s probably faster to convert and shift than to take the hit of a DIV. Especially if you optimize it with lookup tables and ASM.
Doesn''t converting bases involve modulus and division both?

SHHHHH!!!!!!!!!!!!

##### Share on other sites
quote:
Original post by SiCrane
So then we go through this tedious reduction...

ROFLMAO! Did you use Maple/Mathematica/Derive? Or did you
do it by hand? Typing it all out...

Kami no Itte ga ore ni zettai naru!

##### Share on other sites
That was by hand. Though the infinite series of powers of two that makes up 1/3 is very well behaved, so it was a lot easier to do than it looks.

##### Share on other sites
Woot, thanks for pointing that out

I guess I''ll stick with x/48

##### Share on other sites
What is the range of values for X? If they're not too large, you can use SiCrane's great code, just use the first few shifts and discard the rest. His algorithm begins:

(X >> 6) + (X >> 8) + (X >> 10) + (X >> 12)

If X is always less than 4096, X >> 12 will always be zero, as will all the terms to the right of it in the original algorithm. Therefore you can just use:

(X >> 6) + (X >> 8) + (X >> 10)

And get perfectly accurate results every time. And that may be faster than doing the division. Profile it.

EDIT: Nevermind, SiCrane is wrong. He forgot he's dealing with integer math. For example, take 48*7 = 336. Passed through his algorithm:

(336/64) + (336/256) + (336/1024) ...

Taken as integers, you get 5+1+0 = 6. Taken as floating-point numbers, you get 5.25 + 1.3125 + 0.328125 = 6.89, and doing the other divisions/additions will bring it to exactly 7.

~CGameProgrammer( );

[edited by - CGameProgrammer on February 21, 2004 8:01:25 AM]

##### Share on other sites
No, I remembered that this was integer math. It''s just that it''s easier to explain the derivation of the broken formula than it is to explain the derivation of the actual correct bitshift division method. And the broken formula is sufficient to explain why it''s a bad idea.

If you want an actual correct division by 48 method using shifts, here it is (for 32 bit unsigned ints):

(((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - ((n - 0) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1)) >> 1) >> 4)