deflate

Started by
6 comments, last by GameDev.net 18 years, 10 months ago
I'm implementing deflate according to this: http://www.faqs.org/rfcs/rfc1951.html, but I'm having a little problem: I have a correct deflate stream, and the first byte appears to be 237. 237 in binary is 11101101, and so the first 3 bits are 111. The first bit is BFINAL, and the second 2 bits are BTYPE. If BTYPE is 11, that means error! So according to this, something's not right, but it should be a correct deflate stream. So actually I'm wondering if maybe the first 3 bits are actually supposed to be the last 3 bits of this byte because they use the opposite bit order. Can someone confirm that I have to take the last 3 bits instead of the first 3?
Advertisement
Quote:we number the bits of a byte so that bit 0 is the least-significant bit


That implies that the first 3 bits would be here: xxxxx321
Chess is played by three people. Two people play the game; the third provides moral support for the pawns. The object of the game is to kill your opponent by flinging captured pieces at his head. Since the only piece that can be killed is a pawn, the two armies agree to meet in a pawn-infested area (or even a pawn shop) and kill as many pawns as possible in the crossfire. If the game goes on for an hour, one player may legally attempt to gouge out the other player's eyes with his King.
Yes, you take the bits from least significant (last bit in byte) to most significant (first bit in byte). I have a working inflate implementation I can post if you like (I think it's postable, don't think it's that big) so you can check things as you go if you get stuck.

EDIT: To be clear, a deflate stream is interpreted as a byte stream, with each byte read from least significant (last) bit to most significant (first) bit. Huffman codes are read msb to lsb (i.e. the first bit you read from the stream is the most significant bit of the code) and everything else (block type, code lengths, extra bits, anything else I've forgotten) is read lsb to msb (i.e. the first bit you read from the stream is the least significant bit of the value). This can be quite confusing, so here's an example. This byte stream contains BFINAL (true), BTYPE (fixed codes), length code 269, 2 extra bits to go with length code 269 (10), distance code 3. The byte stream (bytes in order, each byte displayed with msb first) looks like (x indicates a further bit in the stream, not covered in this example):

11000011 10001010 xxxxxxx1

Read as decimal bytes this would be 195, 130, some unknown but odd value.

This is interpreted as:

*******1 ******** xxxxxxx*
BFINAL = 1

*****01* ******** xxxxxxx*
BTYPE = 01

11000*** ******10 xxxxxxx*
Huffman length code = 0001101

******** ****10** xxxxxxx*
Extra bits = 10

******** 1000**** xxxxxxx1
Huffman distance code = 00011

Enigma

[Edited by - Enigma on May 16, 2005 7:40:04 AM]
Thanks for the help.

I'm having another confusion now:

is the distance of a <length, distance> pair always 5 bits, or is it another huffman code word, or something else? It never really sais that in the specification, not even in chapter 3.2.5. The only thing the specification sais is that it's always between 0 and 29, and comes behind a length code.

No matter which of the 2 posibilities I mentioned I tried, I sometimes get invalid distances, but it might of course also be due to something else in my code.
It's a Huffman code, but for the fixed code they all have a code length of 5.

Enigma
Quote:Original post by Enigma
It's a Huffman code, but for the fixed code they all have a code length of 5.

Enigma


Ah yes, I found it in the spec now:

Quote:
Distance codes 0-31 are represented by (fixed-length) 5-bit codes, with possible additional bits as shown in the table shown in Paragraph 3.2.5, above. Note that distance codes 30-31 will never actually occur in the compressed data.


You really have to read everything over and over 5 times before you get everything from such a spec :)

Do dynamic codes often occur in png files? I found no png with a dynamic code yet, but I tested only a few files.
Is it possible for values 286 and 287 to somehow occur in a png file? In the specification it sais they participate in the code tree, but are never used. However, while trying to decode a png file, I'm getting a value 287 for some reason, and I'm wondering if it's possible for such a value to occur and if it should be just skipped?

If not that, probably something is wrong with my code.
Hello everybody,

I am working with zip file and I would like to understand something. After the local header of the first file compressed in the zip I should find the header concerning the deflate information.

My problem concerns the deflate. I do not understand how I can calculate the length and the distance depending on the fact the the huffman is dynamic or constant. Somebody could give me an explanation of all the bits (before the compressed data) for the three BTYPE cases.

I read the rfc1951 but it is not very clear.

Thank you very much.

This topic is closed to new replies.

Advertisement