C++ Interview Questions / leet bit-twiddling skillz

Started by
2 comments, last by fastcall22 12 years, 6 months ago
Original thread
I feel this discussion is worthy of a separate thread, as it does not particularly pertain to the discussion in the original thread.


No one broke out the leet bit-twiddling skillz? biggrin.gif
[font="Courier New"][size="1"]count = (((((v - ((v >> 1) & 0x55555555)) & 0x33333333) + (((v - ((v >> 1) & 0x55555555)) >> 2) & 0x33333333)) + ((((v - ((v >> 1) & 0x55555555)) & 0x33333333) + (((v - ((v >> 1) & 0x55555555)) >> 2) & 0x33333333)) >> 4) & 0xF0F0F0F) * 0x1010101) >> 24;[/font]


Okay, I need to know.

Here's my naive approach, in 20 operations (*):

unsigned count_bits( unsigned x ) {
x = ((x & 0x55555555) + ((x >> 1) & 0x55555555));
x = ((x & 0x33333333) + ((x >> 2) & 0x33333333));
x = ((x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F));
x = ((x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF));
x = ((x & 0x0000FFFF) + ((x >> 16) & 0x0000FFFF));
return x;
}

Since a bit is, in itself, the number of bits in that bit, we can use 16 two-bit accumulators to count the number of bits are in each 2-bit group. This is done by selecting every other bit and adding it with it's neighbor. Then each pair of 2-bit accumulators are accumulated into 4-bit accumulators, and so on, until we get our result in the 16-bit accumulator.


Hodgman's sample code (taken from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetNaive), however, reduces to 12 operations:

unsigned count_bits_reduced( unsigned x ) {
unsigned z = (x - ((x >> 1) & 0x55555555));
unsigned y = ((z & 0x33333333) + ((z >> 2) & 0x33333333));

return ((y + (y >> 4) & 0xF0F0F0F) * 0x1010101) >> 24;
}

My questions are:
1) What advantages are gained by negating the value in: [font="Courier New"](x - ((x >> 1) &0x55555555)[/font]?
2) Why the multiplication of [font="Courier New"]0x1010101[/font], and the r-shift of 24?
3) Does this trickery scale to 64-bit integers? Or does it only apply to 32-bit integers?

[size="1"](*) Most likely not a very good base to make assumptions about speed, size, or performance
Advertisement
N.B. I shamelessly stole the algorithm from this famous page. I take no credit for it's magic.
[The 12 operation method] is a hybrid between the purely parallel method above and the earlier methods using multiplies (in the section on counting bits with 64-bit instructions), though it doesn't use 64-bit instructions. The counts of bits set in the bytes is done in parallel, and the sum total of the bits set in the bytes is computed by multiplying by 0x1010101 and shifting right 24 bits.[/quote]

1) What advantages are gained by negating the value in: [font="Courier New"](x - ((x >> 1) &0x55555555)[/font]?

That's a clever way of computing the bit count of every 2-bit box. It's probably not better in practice than x = ((x & 0x55555555) + ((x >> 1) & 0x55555555));

2) Why the multiplication of [font="Courier New"]0x1010101[/font], and the r-shift of 24?[/quote]
If you think of your 32-bit number as a 4-digit number in base 256, you are computing
a b c d
* 1 1 1 1
-------
a b c d
a b c d
a b c d
a b c d
-------------
e f g h i j k

e, f and g are lost in overflow, so by shifting 24 places to the right, you recover h, and h = a + b + c + d. Notice that a, b, c and d are at most 8, so you won't have carry problems.

3) Does this trickery scale to 64-bit integers?[/quote]
Yes, and if you understand the 32-bit code, you should be able to write the 64-bit version yourself.
Hrrmm drr hrrm, forgot to reply. :P


That's a clever way of computing the bit count of every 2-bit box. It's probably not better in practice than x = ((x & 0x55555555) + ((x >> 1) & 0x55555555));

Makes sense.


If you think of your 32-bit number as a 4-digit number in base 256, you are computing
[...]
e, f and g are lost in overflow, so by shifting 24 places to the right, you recover h, and h = a + b + c + d.[/quote]
Yeah, it finally clicked for me while I was showering that night. (Oh man, realizing a solution to a computer science problem is just an incredible feeling!)


Yes, and if you understand the 32-bit code, you should be able to write the 64-bit version yourself.
[/quote]
The main source of confusion was from the multiplication of 0x1010101, which I kept seeing as a decimal magic number. "What significance does multiplying a number by 16843009 have? It just seems so arbitrary!"

Thanks again for clearing this up. :D

This topic is closed to new replies.

Advertisement