Komal Shashank

How would I convert a number into binary bits, truncate or pad them to a given size, and then insert them into a bit container?


Recommended Posts

Hi... I have tried to do this again and again, but I'm failing to arrive at a solution, so I am posting here. Please bear with me.

I want to take a number (declared preferably as int, char, or std::uint8_t), convert it into its binary representation, then truncate or pad it to a given, variable number of bits, and then insert it into a bit container (preferably std::vector<bool>, because the container size has to vary with the number of bits).

For example, say I have int a = 2, b = 3, and I have to write these as three bits and six bits respectively into the container. So I have to put 010 and 000011 into the bit container. How would I go from 2 to 010, or from 3 to 000011, using normal STL methods? I tried every possible thing that came to my mind, but I got nothing. Please help. Thank you.

You can use the >>, <<, &, |, and ~ operators to do bitwise manipulation on integers.

Beware: The >> operator can* do an arithmetic shift if the type is signed (i.e. when the integer is signed, the sign bit is replicated into the vacated high bits when you shift right).
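For example, a quick sketch of testing, setting, and clearing individual bits with those operators (the names are just for illustration, not tailored to your problem):

#include <cassert>

int main() {
    unsigned value = 5;                  // binary 101
    bool bit2 = (value >> 2) & 1u;       // shift bit 2 down to bit 0, then mask it off
    unsigned set = value | (1u << 3);    // OR a mask in to set bit 3 -> binary 1101 (13)
    unsigned cleared = value & ~1u;      // AND an inverted mask to clear bit 0 -> binary 100 (4)
    assert(bit2 && set == 13 && cleared == 4);
}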

Since this sounds a lot like homework I'm not sure if I can give any more hints than this.


*edit: Apparently whether >> on a signed type does an arithmetic or a logical shift is implementation defined... making this even more of a pain in the ass than usual. Edited by Nypyren


This is not homework. I'm trying to generate canonical Huffman codes as defined in http://en.wikipedia.org/wiki/Canonical_Huffman_code.

 

I did this logic on paper so I'll write it here. Let's say I have Huffman symbols and bit lengths (in order),

A	3
K	3
H	4
S	4
L	5
M	5
N	5
O	5

Since the pseudocode given in the link does some addition operations, I am using an integer and starting with code = 0. So the final canonical codes will be,

A	0
K	1
H	4
S	5
L	12
M	13
N	14
O	15

Now I want to convert these numbers into binary with their corresponding bit lengths as above, so that I can use them to encode my file. So these will become,

A	000
K	001
H	0100
S	0101
L	01100
M	01101
N	01110
O	01111

As you can see, the numbers from H to O do not need the leading 0 at the MSB to be represented as plain integers, but it is required for correctly encoding the file, and this is where I'm stuck. I hope I made myself clear. If anyone can think of a solution (or an alternative), I would really appreciate it. Thank you very much.
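For reference, the assignment step I'm using looks roughly like this (Entry and assign_canonical are just names I made up for illustration; the entries are assumed to be already sorted by bit length):

#include <cstdint>
#include <vector>

struct Entry { char symbol; int length; std::uint32_t code; };

// Each code is the previous code plus one, shifted left by the growth
// in bit length; this reproduces the 0, 1, 4, 5, 12... table above.
void assign_canonical(std::vector<Entry>& entries) {
    std::uint32_t code = 0;
    entries[0].code = code; // assumes at least one entry
    for (std::size_t i = 1; i < entries.size(); ++i) {
        code = (code + 1) << (entries[i].length - entries[i - 1].length);
        entries[i].code = code;
    }
}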

Edited by WDRKKS


If you want to work with what C++ has to offer, then std::bitset is probably the best way to do it. It has a constructor that takes an unsigned long and extracts the individual bits for you, and once you've created the bitset object you can just use operator[] to access the bits you want.
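A minimal sketch, assuming 32 bits is a safe upper bound on your code lengths (code_bits is just an illustrative name):

#include <bitset>
#include <vector>

std::vector<bool> code_bits(unsigned long code, int length) {
    std::bitset<32> bs(code);     // 32 is an assumed upper bound, not a requirement
    std::vector<bool> out;
    for (int i = length - 1; i >= 0; --i)
        out.push_back(bs[i]);     // emit MSB first so leading zeros are kept
    return out;
}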


You can use the >>, <<, &, |, and ~ operators to do bitwise manipulation on integers.


This.
If you don't want to use std::bitset, then you'll need to manually buffer your bits into bytes; there is no way to read or write anything smaller than a byte from or to a stream:

#include <fstream>

struct foobar {
    unsigned buffer = 0;
    int offset = sizeof(unsigned)*8 - 1;

    std::ofstream out;

    void add_bit(bool b);
};

void foobar::add_bit(bool b) {
    if (b)
        this->buffer |= 1u << this->offset;

    if ( this->offset == 0 ) {
        // the word is full: write it out as raw bytes and start over
        out.write(reinterpret_cast<const char*>(&buffer), sizeof(buffer));

        this->offset = sizeof(this->buffer)*8 - 1;
        this->buffer = 0;
    } else
        this->offset--;
}
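Used along these lines (opening the stream this way is just one option, and flushing a final partial word is left to you):

foobar fb;
fb.out.open("encoded.bin", std::ios::binary);
fb.add_bit(false); // bits accumulate; each full word is flushed to the stream automatically
fb.add_bit(true);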


I tried using std::bitset. However, when iterating through the list of symbols, I am not able to change the size of the bitset on each iteration to match the symbol's corresponding bit length, since a bitset's size must be a compile-time constant.

for(every symbol, bit length)
{
	std::bitset<bit length> Bits(code);
}

Error: expression must have constant value.

Is there any alternative to this?

 

EDIT: fastcall22, that is not my problem. I can write bits to the file fine. I want to put these bits of variable sizes into a list (or map) of vector<bool> for each symbol so that I can use that to encode the corresponding symbol to the file.

Edited by WDRKKS


convert it into its binary representation

 

Also, this sounds a bit weird, because computers store numbers in binary. Your int/char/whatever ALREADY is in its binary representation (because that's what the computer uses).

 

So, if you want the first three bits of an int, you just read the first 3 bits. No need to 'convert to binary' or anything.

 

As in:

int number = 5;
int numBits = 3; // how many bits you want to read from number

std::vector<bool> bits;

// Bits are stored least significant first; iterate in the other direction for the opposite order.
for (int i = 0; i < numBits; ++i)
{
    // shift i bits to the right, then read the lowest bit
    // (e.g. shift 2 to the right and the lowest bit is the 3rd one)
    bool bit = static_cast<bool>((number >> i) & 1);
    bits.push_back(bit); // push it into the container
}
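For the Huffman codes above you would want the opposite order, most significant bit first, so that the leading zeros come out as well. Continuing the snippet above, roughly:

// number = 4, numBits = 4 produces 0,1,0,0 -- leading zero included
for (int i = numBits - 1; i >= 0; --i)
{
    bits.push_back((number >> i) & 1);
}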

Edited by Waterlimon


Also, this sounds a bit weird, because computers store numbers in binary. Your int/char/whatever ALREADY is in its binary representation (because that's what the computer uses).

 

Well, I thought so too... But when I write the number itself to the file, it writes the ASCII representation instead of the raw value. I assumed it might do the same thing even when I store it in a bit container, so I thought I might need to do a conversion or something. For writing to the file, I figured out the way of packing bits into bytes to write the exact number; however, this part I wasn't sure of. Thanks for the code, though. I'll check it out and see how it goes.

Edited by WDRKKS


Well, I thought so too... But when I write the number itself to the file, it writes the ASCII representation instead of the raw value. I assumed it might do the same thing even when I store it in a bit container


There's a difference between stream << uint32 and stream.write(reinterpret_cast<const char*>(&uint32),sizeof(uint32)), regardless of whether or not you open a stream in binary or text mode.

In text mode, newlines are translated to a unified format: any combination of 0x0D and 0x0A (Windows, Mac, and Unix line endings) is translated to just 0x0A (IIRC).
In binary mode, there are no newline translations: "\r\n" remains as 0x0D 0x0A.

If you actually want to read and write binary data, then use stream.write/read:
#include <fstream>
using namespace std;

int main() {
    unsigned short val = 1234; // (0x04D2)

    ofstream fs("out", ios::binary);

    fs << val << '\n';
    fs.write(reinterpret_cast<const char*>(&val), sizeof(val));
}

/* out contains (on a little-endian machine):
HEX                     ASCII
31 32 33 34 0A D2 04    1234.Ò.
*/
Edited by fastcall22


Well, I thought so too... But when I write the number itself to the file, it writes the ASCII representation instead of the raw value.


Sounds like you're writing a uint8_t, not realizing that in C/C++ that's typically a typedef for unsigned char, and ostream treats a char as a, well, char (and hence interprets the number as a character in the current locale, usually resulting in ASCII output).

You can use the .write() method to output a specific value as raw bytes.
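For example, a sketch of the difference (the exact glyph the first line produces depends on the locale):

#include <cstdint>
#include <fstream>

int main() {
    std::uint8_t value = 65;
    std::ofstream out("out.bin", std::ios::binary);

    out << value;                   // uint8_t is a character type: writes 'A' (0x41)
    out << static_cast<int>(value); // writes the text "65" (0x36 0x35)
    out.write(reinterpret_cast<const char*>(&value), sizeof(value)); // raw byte 0x41
}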

Now I want to convert these numbers into binary with their corresponding bit lengths as above, so that I can use them to encode my file. So these will become,


C++ doesn't have any built-in mechanism for dealing with that. You can store your bit pattern in a uint8_t, store the number of bits in each pattern separately (e.g. as a pair<uint8_t, uint8_t> or a simple struct), and then use that for all the manual shifting you'll need to do.
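Something along these lines, as a sketch (Code is an invented name; the values come from the canonical table earlier in the thread):

#include <cstdint>
#include <map>

// One canonical code: the bit pattern right-aligned in `bits`, plus how
// many of those bits are significant (leading zeros included).
struct Code {
    std::uint8_t bits;
    std::uint8_t length;
};

std::map<char, Code> table{
    {'A', {0, 3}}, {'K', {1, 3}}, {'H', {4, 4}}, {'S', {5, 4}},
};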

There are libraries like Boost's dynamic_bitset that might come in handy, too.

But for the most part, this kind of low-level bit manipulation is (surprisingly) not something that C or C++ really excel at or make particularly easy.

As a note, keep in mind that you're often better off working with larger integer types (like a regular int or long) than with char when doing lots of this kind of work, as it makes better use of the CPU's registers and instructions. Only work with 8-bit types when you really need to conserve memory; work with native machine integers everywhere you can.
