Archived

This topic is now archived and is closed to further replies.

Dark Star

Need good info on Huffman coding

Recommended Posts

Hi good any of you kind folks point me to a good source of information about how the huffman algorithm or coding is done. I am doing a 5 minute presentation in 2 weeks I am to give a simplified example of it. any help is good help Dark Star UK

Share this post


Link to post
Share on other sites
Pop down to your local library and see if they have "The Data Compression Book" - that has an easy to understand explanation (and source if you need it).

Also take a look at the Doctor Dobbs Journal compression pages at: http://www.ddj.com/topics/compression/

and it''s probably also worth taking a look at the FAQs for the comp.compression and related usenet groups (www.faqs.org)

--
Simon O''''Connor
Creative Asylum Ltd
www.creative-asylum.com

Share this post


Link to post
Share on other sites
Essentially how Huffman compression works is this...

1. Scan the source bytes and generate a list of the most commonly occuring bytes. Essentially make an array of int''s (0-255)and iterate through the data you want to compress adding 1 to the index determined by the value of each byte in the source data.

2. Sort the Array.

3. This is the tricky part so i''m going to summarize. Build a binary tree such that the most freequently occuring byte values are leaf nodes near the root of the tree and the least occuring byte values are assigned to leaves deep in the tree. Only leaf nodes should contain values.

4. Iterate through your sorce data. For each byte in the source data try to find it''s value in the binary tree. For each left child node you traverse add a 0 bit to your compressed data. For every right node you traverse add a 1 bit. Because of how the tree was built every byte value from the source will have a unique bit sequence in the compressed data. Not only will it be unique but it will also have a unique sequence. For example value 255 has a bit sequence of 01 no other byte value can contain 01 as the first two bits in it''s bit sequence.

5. To decompress you merely read out bits from the compressed data traversing down the tree until you arrive at a leaf node containing a value. Add the Value to the uncompressed data and then begin at the root of the tree again... continuing reading the bits out of the compressed data.

Breaking down huffman compression into 5 steps is a big simplification. But that''s the gist of it.

You may be intrested to know that Q3A uses per packet huffman compression to keep packet sizes down.





Share this post


Link to post
Share on other sites