Smart way to count positive entries in bitset

Started by
12 comments, last by Ripiz 11 years, 6 months ago
@Alvaro & @wqking:

That rocks!!
Thanks a lot :]
Advertisement

This code (or similar) is somewhere in that bithacks page wqking linked:
unsigned popcount(unsigned x) {
x -= (x >> 1) & 0x55555555u;
x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
x = (x + (x >> 4)) & 0x0f0f0f0fu;
return (x * 0x01010101u) >> 24; // assumes unsigned is 32 bits
}


That function scares me... :)

[quote name='alvaro' timestamp='1349445503' post='4987115']
This code (or similar) is somewhere in that bithacks page wqking linked:
unsigned popcount(unsigned x) {
x -= (x >> 1) & 0x55555555u;
x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
x = (x + (x >> 4)) & 0x0f0f0f0fu;
return (x * 0x01010101u) >> 24; // assumes unsigned is 32 bits
}


That function scares me... smile.png
[/quote]

Yeah, it is kind of scary... Let me see if I can break it down for you:

The general plan is to treat the unsigned integer initially as 32 integers of size 1 bit each. We'll pair up these 1-bit numbers and add them. The result is 16 integers of size 2 bits. We'll then pair those up, add them together and get 8 integers of size 4 bits. Then pair them up, add the pairs together and we'll have 4 integers of size 8 bits. At this point we use one final trick to compute the sum of all 4, and that's our answer.

For the first step, we need to map every 2 bits by the following table:
00 -> 00
01 -> 01
10 -> 01
11 -> 10

Notice that in the first two cases we aren't changing anything, and in the other two, we are just subtracting one. So we can do that by just computing `x-=x>>1' (if x were a 2-bit integer). Since we are doing this 16 times in parallel, we need to select the appropriate bits after the shift:
x -= (x >> 1) & 0x55555555u; // 0x55555555u is 01010101010101010101010101010101 in binary

So now x can be interpreted as 16 2-bit integers, whose sum we are trying to compute. The next two steps are very similar, and easier to understand than the previous one:
x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
x = (x + (x >> 4)) & 0x0f0f0f0fu;


By now x can be seen as 4 8-bit integers. In order to compute their sum, let's see what happens when we multiply ABCD times 1111 (in base 256):

ABCD
x 1111
------
ABCD
ABCD
ABCD
ABCD
-------
abcdefg

Notice that there is no carry in this sum, because all the digits A, B, C and D are at most 8. Since we are doing this whole thing in 32-bit arithmetic, abc is lost to overflow, and the result of the multiplication is really just defg. As you see d=A+B+C+D, and we can extract it with a shift 24 bits to the right.

return (x * 0x01010101u) >> 24; // assumes unsigned is 32 bits

Are you less scared now? ;)

[quote name='Ripiz' timestamp='1349433633' post='4987071']
Written in Notepad, it may or may not compile.
Also it's not tested, it may or may not be faster/slower, that's up to you to test.


It's also incorrect, you would need to test against:

0x1, 0x2, 0x4, 0x8, 0x10, 0x20, 0x40, 0x80.
[/quote]

Oh my... What have I done. Sometimes I wish I could downvote myself.

This topic is closed to new replies.

Advertisement