Bit Flags vs. Boolean


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
7 replies to this topic

#1 Captacha   Members   -  Reputation: 141


Posted 10 October 2012 - 07:45 PM

I was trying to explain to a semi-programmer friend of mine today the purpose of bit flags vs. booleans, and I couldn't think of any reason besides memory usage, which seems rather weak seeing as it would save you a few hundred bytes of memory at most. So what's a good argument for bit flags I can tell him about? To be honest, I'm wondering myself right now too.

#2 Telastyn   Crossbones+   -  Reputation: 3730


Posted 10 October 2012 - 07:50 PM

It tends to be a bit more memory than that in most use cases. You often don't just have one set of bit flags, you have one set per user/entity/product/etc. It adds up.

It also speeds certain operations since an equality check is 1 op rather than N. And/Or/Xor/Negate are also 1 op instead of N.

Reading and writing from disk/database/network is faster since there's less space and/or compression going on.

I'm sure there are a few others I'm missing.

#3 joew   Crossbones+   -  Reputation: 3679


Posted 10 October 2012 - 08:00 PM

One example of a performance benefit would be doing something like render queue sorting based on a bitfield. It would be much better to sort based on a single field rather than a bunch of single flags.

#4 Bregma   Crossbones+   -  Reputation: 5469


Posted 10 October 2012 - 08:45 PM

Bitfields were available in C because C was used to develop the Unix operating system, and operating systems deal with hardware. Hardware control registers often have bit-addressable functions.

C also did not have a boolean type. The PDP architecture it was first developed on had a (bit-addressable) flag register in which the Z flag gets set every time an operation yields a zero result (and cleared otherwise), so it was easier to have control-flow operations act on zero or non-zero integral values. BTW, that's also why null-terminated character strings were used instead of the more-common (at the time) Pascal-style counted strings.
Stephen M. Webb
Professional Free Software Developer

#5 SiCrane   Moderators   -  Reputation: 9670


Posted 10 October 2012 - 09:13 PM

One particular application you might want to look at is Bitboards.

#6 joew   Crossbones+   -  Reputation: 3679


Posted 10 October 2012 - 09:20 PM

One particular application you might want to look at is Bitboards.

This is probably the best example and is also what I was hinting at by using the bitboard concept to sort render ops via state, etc.

#7 Hodgman   Moderators   -  Reputation: 31852


Posted 10 October 2012 - 09:58 PM

I couldn't think of any besides Memory Usage which seems rather weak seeing as it would save you a few 100 bytes of memory at most.

Saving memory can be important beyond just reducing RAM usage. Moving memory from RAM to the CPU is extremely slow, so shrinking data formats can be a good optimisation.

Let's say we've got a game object like:
struct Object { float x, y, z; bool active; };//sizeof(Object)==16
Due to alignment issues, the compiler adds an extra 3 bytes to this structure to round it up from 13 to 16 bytes. So the bool is actually costing us 4 bytes*numObjects.
Now let's say we want to run this algorithm:
int countActive=0;
for( int i=0; i!=numObjects; ++i )
  countActive += objects[i].active ? 1 : 0;
My CPU has 64-byte cache lines (the amount of RAM that's transferred to the CPU in one operation). If we've got 1000 objects, then this algorithm has to fetch 250 cache lines (1000*16/64==250). Seeing that the CPU asm code is really simple, this operation will most likely be bottlenecked by the time it takes to perform those memory fetches -- let's fix this:
struct Pos { float x, y, z; };//sizeof(Pos)==12
struct Objects { Pos xyz[1000]; bool active[1000]; };
...
int countActive=0;
for( int i=0; i!=1000; ++i )
  countActive += objects.active[i] ? 1 : 0;
Now we're only iterating through the bools, without the positions being interleaved. Furthermore the 3 padding bytes are gone. Now we've only got to fetch 16 cache lines (1000*1/64==15.625), which is way less memory bandwidth than before.
But we can do better:
struct Objects { Pos xyz[1000]; u32 active[(1000+31)/32]; };
...
int countActive=0;
for( int i=0; i!=ArraySize(objects.active); ++i )
  countActive += CountBitsSet(objects.active[i]);
Now we only allocate 32 integers to store our bits, which is 128 bytes, which is a mere 2 cache lines. Now there's a much bigger chance that the actual loop's CPU asm code will be the bottleneck, instead of bottlenecking on RAM access speeds. We've reduced the original memory bandwidth requirement of the algorithm by over 100x!

Edited by Hodgman, 10 October 2012 - 10:01 PM.


#8 taz0010   Members   -  Reputation: 277


Posted 13 October 2012 - 09:26 PM

One thing to keep in mind is that accessing bit fields will require the compiler to produce additional bit masking instructions, which will *increase* the memory footprint of your instruction code. If you try to use bit fields as a general rule instead of a situational optimisation, you'll end up with slower code.

Also, the alignment issues are subtle. Most types want to align on 4-byte boundaries, so AFAIK if you have a structure containing, say, an integer, and fewer than 5 flags, then converting those flags to single bits won't save any space at all.



