Reading PNG format (deflate compression)

Started by
14 comments, last by Lehm2000 9 years, 5 months ago

I'm writing my own image file loader as a personal exercise. I managed bitmap and targa just fine. I've moved onto png. I'm having a bit of trouble with the deflate compression. I've read through the spec( http://www.ietf.org/rfc/rfc1951.txt ). I understand the concepts involved but having a bit of trouble implementing them. I'll go through the steps I'm using to decode a test image and if someone could point out where I'm going wrong that would be great.

I created a test image (4x4px, solid (255,192,64). The compressed result in the IDAT chunk is ( xÚbüßàÀ..L.H.7....ÿÿ..Y..Çr.è3 )

Starting the parsing...

Bytes 0 and 1 are xÚ ( 0x78 and 0xDA ) which is the zlib info and exactly what they should be.

We move onto the actual image data.

Bytes 2 and 3 are bü ( 0x62 and 0xFC )

Deflate encodes using bits so the binary for those bytes are 0110 0010 1111 1100

The first bit ( 8th character) is 0 so that means this is not the final block. The next two bits ( 6th and 7th characters) are 01. That means that it uses fixed huffman code tree.

Read the first code. Start with 7bits as that is the smallest code in the fixed tree. 0001100 . Huffman codes are stored in reverse order so that becomes 0011000. That code is not in the fixed tree so we add one more bit. 00110001. That one is in the tree. The lit value is 1. My understanding is that anything less than 256 is the actual value. So the first byte of data for my image should be 1. Except there is no 1 value in the test image (should be 255). So either I'm getting off track somewhere or I misinterupting the output.

Any help is appreciated.

Advertisement

if someone could point out where I'm going wrong that would be great.

If you just want an image loader, why are you writing decoders? There are plenty of libraries out there that already do image decoding, and most of them are open-source... If you want to write your own, then your best bet is to look at their source code. You will also find many examples for DEFLATE.

I also tried to write my own decoder once, for JPG - I got as far as decoding the first 8x8 block of pixels in a non-progressive JPG file, but then Microsoft came up with WIC, and I just gave up on writing my own... WIC is much easier to use.

If you want to decode PNGs, you should be looking at the PNG spec, not the DEFLATE spec: http://www.libpng.org/pub/png/spec/1.2/PNG-Compression.html . There's also a link to example code at the bottom of the page. According to that page, PNG uses ZLIB, which adds some flags (I can't remember the size of the flags) at the start of the stream, followed by the actual DEFLATE data... You should skip the flags structure.

Also, are you sure the smallest code size in the fixed Huffman table is not 8 bits? (I don't know - just asking)


why are you writing decoders?

As a personal exercise he says.

Though even though implementing "everything" yourself might be fun, I think the writers of the PNG spec assume one would simply use zlib to do the compression/decompression when implementing a png image encoder/decoder.

The reason I'm doing this is to understand image file formats better. I figure there's no better way to learn about them than to decode them. If this was for an actual project I would just implement an existing library.

I've done this and it's a lot of fun :) Learning to implement specs like this can be a challenge - especially your first few.

Why don't you take your simple png file, and run it through something without restrictions such as stb_image and just trace what happens? Adjust your code according to what you discover :) Also don't forget that there is filtering on the png image which may give you different zlib output data - although it shouldn't touch the first pixel.

I think you would learn more about png image format (as compared to deflate compression) if you just use zlib and concentrate on the other bits of the format at first. Later you can always add on your own deflate decoder, after being able to decode all png chunks including all registered ancillary chunks.

Lehm2000 has already begun to understand deflate. If they want to explore it, and this is all for learning, just let them. It is what is currently interesting to them, and it's a good experience.

[edit] That isn't to anyone in particular - there are just multiple posts saying in one way or another to avoid this path, instead of helping with the actual question about deflate. However, this poster has demonstrated a certain amount of understanding with deflate, and they just need a little nudge to continue on.

By the way Lehm2000, the first byte of data on each line will be the filter type if you didn't figure that out by now. So, the "1" indicates a subtraction filter. I suspect your deflate decoding is okay so far.

I would guess that the way you're interpreting/reversing the huffman data is wrong. Are the huffman *bits* or *bytes* stored in reverse? I don't like the "added a bit to get something that is in the tree" nonsense.

I would guess that the way you're interpreting/reversing the huffman data is wrong. Are the huffman *bits* or *bytes* stored in reverse? I don't like the "added a bit to get something that is in the tree" nonsense.

The way Lehm2000 is reading the huffman codes is a pretty standard way to do it. So far it seems okay - the first byte marks the filter type for the current row of PNG data. In other words, the error does not appear to be the deflate implementation (at least not so far). The error appears to be the assumption that it should be 255, and not 1, because they forgot the need to reverse the PNG filter afterwards.

This topic is closed to new replies.

Advertisement