• Advertisement
Sign in to follow this  

Huffman Coding: Decoding via Bitshifts...?

This topic is 4678 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hi, I have a question regarding the decoding of a huffman coded file and the bitshifts required... If I read in the file one character at a time I was thinking it would be possible to shift bits into a temporary character and then look for the correct code but I am unsure as to shift one bit at a time from one character to another...?

Share this post


Link to post
Share on other sites
Advertisement
you can always do something like this:


temp_byte = buffer [bit_index >> 3];
temp_bit = (temp_byte >> (bit_index & 0x7)) & 1;



then you just keep incrementing bit_index. Of course, reloading the temp_byte each time might slow down your algorithm, but doing an "if (!(bit_index & 0x7))" is probably even slower.

Share this post


Link to post
Share on other sites
Can you explain that a little more please? I am still very new to all this stuff and that code just kind of went over my head a couple miles up.

thanks,
john

Share this post


Link to post
Share on other sites
sure! (BTW, I'm assuming you're using C/C++)

step 1: read all of the compressed data into a buffer (where buffer is an array of unsigned char)

to find any desired bit (i.e. bit_index) you do the following:

step 2: isolate the byte in which that bit is located, ergo:
"temp_byte = buffer [bit_index >> 3];"
note that >>3 means divide by 8, (look up bit-shifting) so you could say:
"temp_byte = buffer [bit_index / 8];"
this is because there are 8 bits in a byte, so dividing by 8 and rounding down gives you the byte that houses the bit at "bit_index"

step 3: grab the sellected bit from that byte, ergo:
"temp_bit = (temp_byte >> (bit_index & 0x7)) & 1;"
there are 3 parts here:
a) bit_index & 0x7 gets the remainder of bit_index / 8. You could also use modulo (in C/C++ it is the "%" character) for any generic number, but since 8 is a power of two, we can use "& (binary)111" instead of "% 8", as it's faster
b) temp_byte >> X means shift the bits right by X (where x is our bit remainder from step 3a). if X=0, there is no shifting, if X=0 it shifts all the bits by one, so the 2-bit moves into the 1-bit place, the 4-bit into the 2-bit place, etc.
c) Y & 1 simply grabs the least significant bit of the number Y. in the previous step, we made sure that the bit of interest was acually in that location

so to get the 1st bit in the whole buffer, set bit_index to 0, and call the code. to find the 1000th bit, set bit_index to 999, and call the code. You should also do error checking to make sure you never read more than was put into the buffer.

hope it helps

Share this post


Link to post
Share on other sites
You need to access bits one at a time, so you read in the file to a temporary char, or int, or whatever, char is easiest. Then you acess each bit at a time:



for(int i=0;i<howeverBigTheFileIs;i++)
{
char temp_char;
temp_char = readInAcharFromFile(whateverFile);

for(int j=0;j<sizeof(char)*8;j++)
{
doWhatYouNeedToDoWithTheMostRecentBit(temp_char & (0x01 << j));
}
}








And doing lonestar's way by reading from a char buffer, I would use "buffer[bit_index / 8]" not "buffer[bit_index >> 3]" since the compiler will perform the optimization to do the divide faster with bit shifts, and it looks alot clearer when you see that you are dividing by 8 to go from bit_index to byte_index.

[Edited by - SaintGreg on March 29, 2005 4:29:59 PM]

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement