• Create Account

Working with a C++ data type on a bit by bit level

Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

29 replies to this topic

#1Dan Berliner  Members

Posted 06 April 2011 - 01:55 PM

I am using C++, I have no decided on a graphics library so I am open to anything that works on Windows and *nix if it makes my life easier. This is more of a theory question though so I don't think it will be relevant which one I am using.

While it is not a game, I am writing a program that allows the user to create a one color overlay over an image with a paintbrush. My first major issue is storing all that data, the program will mainly be used on projectors running at sizes as large as 1920x1080, meaning that it will use a minimum of 1.98mb of data at any given time. My issue is this: there is no one bit data type, the smallest C++ data type is 32 bits. While 1.98mb of data in memory isn't an issue at any level, having 63.4mb of data would be just inefficient. As an electrical engineer I am familiar with binary and know that each int can hold 32 pixels of data, my problem is working with an int in terms of bits.

My question is how would I discern which bits are what? I would need (for instance) the number 2800169371 to correspond to its binary value, 10100110111001110011000110011011, where 1 is on and 0 is off. I would have to go though each bit and see if it on or off.

#2Antheus  Members

Posted 06 April 2011 - 02:04 PM

there is no one bit data type, the smallest C++ data type is 32 bits.

int is 32 bits.
char is 8 bits.
short is 16.

my problem is working with an int in terms of bits.

See std::vector<bool>.

It should be specialized to use one bit per value, while working like a regular array.

#3Dan Berliner  Members

Posted 06 April 2011 - 02:12 PM

int is 32 bits.
char is 8 bits.
short is 16.

My mistake there

See std::vector<bool>.

It should be specialized to use one bit per value, while working like a regular array.

I recall reading somewhere that bool was longer than 1 bit, after googling it again I see that I was incorrect. Thanks.

For people who may come here looking for the original requested solution (perhaps for a different issue), the following function will tell you if a bit is on or off in an int data type (you can modify it to use any fundamental data type:
bool isBitOn(int place, int &subject)
{
return ((subject & 1<<place)==0 ? false : true);
}


#4Nanoha  Members

Posted 06 April 2011 - 02:13 PM

Its usually easier to use a seperate variable for each component and combining them, than remembering huge numbers for each colour.

unsigned int red = 128;
unsigned int green = 0;
unsigned int blue = 255;
unsigned int alpha = 255;

You can put them all together into an int with some bitwise operators

unsigned int colour = 0;
colour |= (red << 24);
colour |= (green << 16);
colour |= (blue << 8);
colour |= |= alpha;

You can do the opposite to get each individua l component back

red = 0xff000000 & colour; // only want the red bits, so mask the others out
red = red >> 24;

green = 0x00ff0000 & colour;
green = green >> 16;

I'm slightly rusty so it may have mistakes. Your just packing each different component into one int, most significant bytes being red in what I've done. Of course it all depends on what bytes are considered to be what, I've gone for RGBA but you see lots of combination (BRGA etc). No need to deal with individual bits (unless you have to). You can do individual bits in a similar way. Take a look at bitwise operators http://www.cprogramming.com/tutorial/bitwise_operators.html .

Interested in Fractals? Check out my App, Fractal Scout, free on the Google Play store.

#5Dan Berliner  Members

Posted 06 April 2011 - 02:24 PM

Its usually easier to use a seperate variable for each component and combining them, than remembering huge numbers for each colour.

unsigned int red = 128;
unsigned int green = 0;
unsigned int blue = 255;
unsigned int alpha = 255;

You can put them all together into an int with some bitwise operators

unsigned int colour = 0;
colour |= (red << 24);
colour |= (green << 16);
colour |= (blue << 8);
colour |= |= alpha;

You can do the opposite to get each individua l component back

red = 0xff000000 & colour; // only want the red bits, so mask the others out
red = red >> 24;

green = 0x00ff0000 & colour;
green = green >> 16;

I'm slightly rusty so it may have mistakes. Your just packing each different component into one int, most significant bytes being red in what I've done. Of course it all depends on what bytes are considered to be what, I've gone for RGBA but you see lots of combination (BRGA etc). No need to deal with individual bits (unless you have to). You can do individual bits in a similar way. Take a look at bitwise operators http://www.cprogramm..._operators.html .

The program don't allow for more than black and transparent so a 1 bit value per pixel is all that is needed. Thanks for the explanation though.

#6Álvaro  Members

Posted 06 April 2011 - 02:56 PM

I recall reading somewhere that bool was longer than 1 bit, after googling it again I see that I was incorrect. Thanks.

bool does indeed take more than one bit. At the very least it takes one byte, and it may take more. However, std::vector<bool> is implemented in some clever way and will store bits compactly, just as you want.

#7Dan Berliner  Members

Posted 06 April 2011 - 03:28 PM

I recall reading somewhere that bool was longer than 1 bit, after googling it again I see that I was incorrect. Thanks.

bool does indeed take more than one bit. At the very least it takes one byte, and it may take more. However, std::vector<bool> is implemented in some clever way and will store bits compactly, just as you want.

Great, so now I cant read. After looking another time I see it is 1 byte, not one bit.

I will go with std::vector<bool>, I currently have a proof of concept written up with simple bools in an array and it is seeing some performance issues on the slow computer running it.

#8gretty  Members

Posted 06 April 2011 - 03:53 PM

What about the std::bitset data type. I have never used it but I thought this was specifically to work with bits.

So you can do std::bitset <32> myInt;

This is what c++ says about bitset

A bitset is a special container class that is designed to store bits (elements with only two possible values: 0 or 1, true or false, ...).

http://www.cplusplus.com/reference/stl/bitset/

#9Antheus  Members

Posted 06 April 2011 - 05:05 PM

What about the std::bitset data type. I have never used it but I thought this was specifically to work with bits.

So you can do std::bitset <32> myInt;

This is what c++ says about bitset

A bitset is a special container class that is designed to store bits (elements with only two possible values: 0 or 1, true or false, ...).

http://www.cplusplus...nce/stl/bitset/

Size is defined at compile time.

bitset is:
template < size_t N > struct bitset {
char data[N / 8 + 1];
};

Interestingly enough, this would be something where JIT-based languages could run circles around C++. Unfortunately, Java refuses to add value objects. But it would allow a given instance to generate fully specialized and static code based on run-time parameters.

#10Dan Berliner  Members

Posted 06 April 2011 - 06:59 PM

What about multi dimensional vectors? Would the same advantage apply to a multi dimensional vector? The math gets a little more annoying if it is one dimensional going over a two dimensional screen.

#11Álvaro  Members

Posted 06 April 2011 - 08:25 PM

What about multi dimensional vectors? Would the same advantage apply to a multi dimensional vector? The math gets a little more annoying if it is one dimensional going over a two dimensional screen.

The simplest way to deal with multi-dimensional arrays is to use an underlying one-dimensional array and translate your multiple indices into a single index.

std::vector<bool> image(false, 1920*1080);
image[x+y*1080] = true;

#12Trienco  Members

Posted 06 April 2011 - 10:50 PM

int is 32 bits.
char is 8 bits.
short is 16.

Actually char is 8bit in 99% of all environments, and from there it's char < short <= int (not perfectly sure about about where it's < and <=). You will most likely find short to be 16bit most of the time and long to be 32bit, while int is the "natural" size for your platform (which happens to be 64bit if you compile for 64bit and 16bit for some old 16bit machines). Never make assumptions about the size of types, that's why there's a stdint header with all the int8_t typedefs (and a ton of #ifdefs).

unsigned int colour = 0;
colour |= (red << 24);
colour |= (green << 16);
colour |= (blue << 8);
colour |= |= alpha;

You realize that you can let the compiler do that kind of thing? (Including a somewhat silly use of union for interchangeable usage.

union
{
uint32_t color;
struct
{
uint32_t red : 8;
uint32_t green : 8;
uint32_t blue : 8;
uint32_t alpha : 8;
}
};


Though I honestly see little point in doing that over using using one uint8_t per channel in the first place.

While I would stick with vector<bool>, it wouldn't be that hard to use an array and get the bits yourself. Depending on the used type (unsigned int might be the most efficient), all you have to do is use your bit index and get the word and bit of that word ( i/sizeof(unsigned) and i%sizeof(unsigned) ). Somebody will point out that "it's teh fasta using bit fiddling", ie i>>4 and i&0xfffffff, but I haven't seen a compiler for a while that wasn't smart enough to turn / and % into bit operations for power of two operands.
f@dzhttp://festini.device-zero.de

#13SiCrane  Moderators

Posted 06 April 2011 - 11:46 PM

Actually char is 8bit in 99% of all environments, and from there it's char < short <= int (not perfectly sure about about where it's < and <=).

In the current version of the C++ standard, it's char <= short <= int <= long with 8 <= char, 16 <= short, 32 <= long. It's perfectly possible for an implementation to have 32-bit everything (and at least one annoying embedded platform is that way).

#14taz0010  Members

Posted 07 April 2011 - 03:55 AM

I recall reading somewhere that bool was longer than 1 bit, after googling it again I see that I was incorrect. Thanks.

sizeof(bool) is generally equal to sizeof(int), due to the fact that most platforms are unable to access individual bits and have alignment restrictions on other datatypes. However std::vector<bool> is specifically designed to pack the elements as tightly as possible, and incurs an instruction overhead in doing so. If you need to set a lot of bits at once and are worried about performance, you might want to avoid using the subscript operator.

#15rip-off  Moderators

Posted 07 April 2011 - 04:02 AM

sizeof(bool) is generally equal to sizeof(int), due to the fact that most platforms are unable to access individual bits and have alignment restrictions on other datatypes.

I think you are mistaking the padded size of the variable with the sizeof(variable).

sizeof(bool) is normally equal to sizeof(char), which is defined as 1 and is most commonly 8 bits. However, the compiler padding will often mean that an individual boolean value is allocated a word of space by itself. However, contiguous booleans will be tightly packed by most compilers.

#16Dan Berliner  Members

Posted 07 April 2011 - 05:31 PM

What about multi dimensional vectors? Would the same advantage apply to a multi dimensional vector? The math gets a little more annoying if it is one dimensional going over a two dimensional screen.

The simplest way to deal with multi-dimensional arrays is to use an underlying one-dimensional array and translate your multiple indices into a single index.

std::vector<bool> image(false, 1920*1080);
image[x+y*1080] = true;

I've done that, but just to iterate though an 800*600 (480,000) element vector takes my computer about 570-590ms. I have a pretty high end machine so that's unacceptably slow. It's just a simple iteration loop where it adds one to a int variable; the actual program will have to do significantly more than that per loop. When I was using an array of bool for the same sized screen it took about 47ms while drawing. Unless there is some way to optimize this considerably I dont see vectors as being a possible solution even if they can do the one bit thing.

#17SiCrane  Moderators

Posted 07 April 2011 - 06:26 PM

Which compiler did you use for that benchmark? Some compilers create extra code even in release mode when using standard library templates. For example some versions of MSVC will enable _SECURE_SCL in release, which can significantly slow things down.

#18Dan Berliner  Members

Posted 07 April 2011 - 06:56 PM

Which compiler did you use for that benchmark? Some compilers create extra code even in release mode when using standard library templates. For example some versions of MSVC will enable _SECURE_SCL in release, which can significantly slow things down.

I'm using MSVC, I'll look into optimizing on that compiler.

#19doesnotcompute  Members

Posted 07 April 2011 - 07:27 PM

Would something like this work?



template <int SIZE_IN_BITS>
class BitArray
{
char bytes[SIZE_IN_BITS/8];

public:

BitArray()
{
memset(bytes,0,sizeof(bytes));
}

void SetBit(int bit)
{
int index = bit / 8;
int offset = bit % 8;

char mask = 1 << offset;

}

void ClearBit(int bit)
{
int index = bit / 8;
int offset = bit % 8;

char mask = 0xFF & ~(1 << offset);

}

bool GetBit(int bit)
{
int index = bit / 8;
int offset = bit % 8;

char mask = 1 << offset;

}
};



EDIT: You probably want to make the internal array heap allocated if it's going to be large but the idea is the same...

#20Trienco  Members

Posted 07 April 2011 - 10:50 PM

I've done that, but just to iterate though an 800*600 (480,000) element vector takes my computer about 570-590ms.

Not saying it's the one and only reason, but if you want it to be FAST, you should go back to using int. Something that neatly fits the machines register size and doesn't require bitmasking and shifting for every access (though I'd expect no shifting and simply comparing with 0 after masking).

Keep in mind that the image you want to place your overlay on is apparently the same size and is also stored uncompressed as pixel data in memory, meaning it's also at least 24bit per pixel (unless it's scaled and your overlay has its own independent resolution). If storing is your issue, maybe you could use a simple compression like run length encoding (depending on how complex your overlays will get).
f@dzhttp://festini.device-zero.de

Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.