Jump to content

  • Log In with Google      Sign In   
  • Create Account

Reading PNG format (deflate compression)

  • You cannot reply to this topic
12 replies to this topic

#1 Lehm2000   Members   -  Reputation: 112

Like
0Likes
Like

Posted Yesterday, 06:55 PM

I'm writing my own image file loader as a personal exercise.  I managed bitmap and targa just fine.  I've moved onto png.  I'm having a bit of trouble with the deflate compression.  I've read through the spec( http://www.ietf.org/rfc/rfc1951.txt ).  I understand the concepts involved but having a bit of trouble implementing them.  I'll go through the steps I'm using to decode a test image and if someone could point out where I'm going wrong that would be great.

 

I created a test image (4x4px, solid (255,192,64).  The compressed result in the IDAT chunk is ( xÚbüßàÀ..L.H.7....ÿÿ..Y..Çr.è3 )

 

Starting the parsing...

 

Bytes 0 and 1 are xÚ ( 0x78 and 0xDA ) which is the zlib info and exactly what they should be.

 

We move onto the actual image data.

 

Bytes 2 and 3 are bü ( 0x62 and 0xFC )

 

Deflate encodes using bits so the binary for those bytes are 0110 0010 1111 1100

 

The first bit ( 8th character) is 0 so that means this is not the final block.  The next two bits ( 6th and 7th characters) are 01.  That means that it uses fixed huffman code tree.

 

Read the first code.  Start with 7bits as that is the smallest code in the fixed tree. 0001100  .  Huffman codes are stored in reverse order so that becomes 0011000.  That code is not in the fixed tree so we add one more bit. 00110001.  That one is in the tree.  The lit value is 1.  My understanding is that anything less than 256 is the actual value.  So the first byte of data for my image should be 1.  Except there is no 1 value in the test image (should be 255).  So either I'm getting off track somewhere or I misinterupting the output. 

 

Any help is appreciated.

 

 

 

 



Sponsor:

#2 tonemgub   Members   -  Reputation: 1127

Like
0Likes
Like

Posted Today, 02:02 AM


if someone could point out where I'm going wrong that would be great.

If you just want an image loader, why are you writing decoders? There are plenty of libraries out there that already do image decoding, and most of them are open-source... If you want to write your own, then your best bet is to look at their source code. You will also find many examples for DEFLATE.

I also tried to write my own decoder once, for JPG - I got as far as decoding the first 8x8 block of pixels in a non-progressive JPG file, but then Microsoft came up with WIC, and I just gave up on writing my own... WIC is much easier to use.

 

If you want to decode PNGs, you should be looking at the PNG spec, not the DEFLATE spec: http://www.libpng.org/pub/png/spec/1.2/PNG-Compression.html . There's also a link to example code at the bottom of the page. According to that page, PNG uses ZLIB, which adds some flags (I can't remember the size of the flags) at the start of the stream, followed by the actual DEFLATE data... You should skip the flags structure.

 

 

Also, are you sure the smallest code size in the fixed Huffman table is not 8 bits? (I don't know - just asking)


Edited by tonemgub, Today, 02:09 AM.


#3 Olof Hedman   Crossbones+   -  Reputation: 2823

Like
0Likes
Like

Posted Today, 02:26 AM


why are you writing decoders?

 

As a personal exercise he says.

 

Though even though implementing "everything" yourself might be fun, I think the writers of the PNG spec assume one would simply use zlib to do the compression/decompression  when implementing a png image encoder/decoder.


Edited by Olof Hedman, Today, 02:28 AM.


#4 Lehm2000   Members   -  Reputation: 112

Like
2Likes
Like

Posted Today, 05:03 AM

The reason I'm doing this is to understand image file formats better.  I figure there's no better way to learn about them than to decode them.  If this was for an actual project I would just implement an existing library.



#5 achild   Crossbones+   -  Reputation: 1810

Like
2Likes
Like

Posted Today, 06:07 AM

I've done this and it's a lot of fun :) Learning to implement specs like this can be a challenge - especially your first few.

 

Why don't you take your simple png file, and run it through something without restrictions such as stb_image and just trace what happens? Adjust your code according to what you discover :) Also don't forget that there is filtering on the png image which may give you different zlib output data - although it shouldn't touch the first pixel.



#6 wintertime   Members   -  Reputation: 1708

Like
1Likes
Like

Posted Today, 06:38 AM

I think you would learn more about png image format (as compared to deflate compression) if you just use zlib and concentrate on the other bits of the format at first. Later you can always add on your own deflate decoder, after being able to decode all png chunks including all registered ancillary chunks.



#7 achild   Crossbones+   -  Reputation: 1810

Like
2Likes
Like

Posted Today, 08:27 AM

Lehm2000 has already begun to understand deflate. If they want to explore it, and this is all for learning, just let them. It is what is currently interesting to them, and it's a good experience.

 

[edit] That isn't to anyone in particular - there are just multiple posts saying in one way or another to avoid this path, instead of helping with the actual question about deflate. However, this poster has demonstrated a certain amount of understanding with deflate, and they just need a little nudge to continue on.


Edited by achild, Today, 08:31 AM.


#8 achild   Crossbones+   -  Reputation: 1810

Like
1Likes
Like

Posted Today, 08:35 AM

By the way Lehm2000, the first byte of data on each line will be the filter type if you didn't figure that out by now. So, the "1" indicates a subtraction filter. I suspect your deflate decoding is okay so far.



#9 radioteeth   Prime Members   -  Reputation: 1049

Like
0Likes
Like

Posted Today, 11:41 AM

I would guess that the way you're interpreting/reversing the huffman data is wrong. Are the huffman *bits* or *bytes* stored in reverse? I don't like the "added a bit to get something that is in the tree" nonsense.



#10 achild   Crossbones+   -  Reputation: 1810

Like
0Likes
Like

Posted Today, 11:47 AM

I would guess that the way you're interpreting/reversing the huffman data is wrong. Are the huffman *bits* or *bytes* stored in reverse? I don't like the "added a bit to get something that is in the tree" nonsense.

 

The way Lehm2000 is reading the huffman codes is a pretty standard way to do it. So far it seems okay - the first byte marks the filter type for the current row of PNG data. In other words, the error does not appear to be the deflate implementation (at least not so far). The error appears to be the assumption that it should be 255, and not 1, because they forgot the need to reverse the PNG filter afterwards.


Edited by achild, Today, 11:51 AM.


#11 Nypyren   Crossbones+   -  Reputation: 4283

Like
1Likes
Like

Posted Today, 01:35 PM

Yes, like others say, "inside" the ZLIB stream is a PNG-filtered representation. The PNG filtered representation is an optimization step that PNG performs on the color data in order for DEFLATE to compress better than it would on the raw color data.
 
IDAT chunk { ZLIB { PNG Filters (each scanline starts with the filter ID that the scanline is using) { Color data } } }

2. Filter the image data according to the filtering method specified by the IHDR chunk. (Note that with filter method 0, the only one currently defined, this implies prepending a filter-type byte to each scanline.)


And the logic for encoding/decoding with the filters is here:

http://www.libpng.org/pub/png/spec/1.2/PNG-Filters.html

Filtering algorithms are applied to bytes, not to pixels, regardless of the bit depth or color type of the image. The filtering algorithms work on the byte sequence formed by a scanline that has been represented as described in Image layout. If the image includes an alpha channel, the alpha data is filtered in the same way as the image data."

For all filters, the bytes 'to the left of' the first pixel in a scanline must be treated as being zero. For filters that refer to the prior scanline, the entire prior scanline must be treated as being zeroes for the first scanline of an image (or of a pass of an interlaced image).


Unsigned arithmetic modulo 256 is used, so that both the inputs and outputs fit into bytes.


In other words, if you expect your final image data to start with the byte '255', and you have a scanline filter of 1 (sub), then your next byte should be a 1. (0 minus 1 -> modulo 256 -> 255)

Edited by Nypyren, Today, 01:40 PM.


#12 achild   Crossbones+   -  Reputation: 1810

Like
0Likes
Like

Posted Today, 02:01 PM

[deleted]

 

Sorry - hold on. I made the wrong conclusion from some old code that didn't match with the spec (the result was correct but my explanation in this post was not). Fixed it in the below post (it was supposed to be another edit but I messed that up too).


Edited by achild, Today, 02:09 PM.


#13 achild   Crossbones+   -  Reputation: 1810

Like
1Likes
Like

Posted Today, 02:07 PM


In other words, if you expect your final image data to start with the byte '255', and you have a scanline filter of 1 (sub), then your next byte should be a 1. (0 minus 1 -> modulo 256 -> 255)

 

Not quite.

 

For the first byte, you can just write the raw value. The equation is (curr - prev), and x < 0 is defined as 0. So filtering with the PNG subtraction filter would give 255 - 0, mod 256, which is still 255. So byte 1 would be "1" for sub filter, and byte 2 would be "255".


Edited by achild, Today, 02:09 PM.






PARTNERS