Jump to content
  • Advertisement
Sign in to follow this  
  • entries
  • comments
  • views

IKEA eat your heart out

Sign in to follow this  


So I'm rewriting my old Huffman encoder/decoder stuff to be (a) templated and (b) faster to decode. I've hit an interesting problem. It's a problem I hit last time and decided to ignore, but I'd like to find a way to solve it this time.

The output of my encoder is in two parts. For one, there's obviously the compressed data; I've got that sorted nicely now. It's almost breathtaking how quickly you can implement the tree construction using std::priority_queue.

The other is the tree used to decompress the data. I view it as a filter through which I can stream the compressed data - that's how I'm designing it to be used, for nice asyncronous loading/decompression of resources and so on. I'm trying to get the decompression very fast, and to keep the tree size down.

So, what I thought I'd do is this. Each non-leaf node would be X bytes (where X is as small as possible); that breaks down into two equal-sized fields, for left child and right child. The MSB on each field is used to indicate whether that child is a leaf, and the rest is used to store an offset to the child from the current read pointer. Leaf nodes are just stored as their values, and so will be as large as the templated type the user is encoding with.

The problem is with laying out nodes in the stream. The distance between any parent/child pair cannot be more than the number of bits I have available to store an offset in - given that I wanted to encode non-leaf nodes in 16-bit words that gives me 7 bits to store an offset in - 0-63 bytes of offset. So how do I order nodes in my stream such that the offsets are never too large?

I'm going to have to think about it for the next couple of days, but my initial thoughts:

  • It might be worth starting from the bottom and trying to collect nodes up into contiguous blocks of 62 bytes each
  • Consider how it differs from a list - if it were a regular list you could set up children at 63-byte intervals each with 63-byte offsets. Perhaps find the 'dominant' chain and do that with it, and send all other children into local zones
  • Another approach might be to place child nodes as far away from the parent as possible while still satisfying the offset condition - if the space is taken track back until you find space - and then 'vacuum-pack' the result

I've got my copy of Introduction to Algorithms open, I guess it's time I started trying to read it.
Sign in to follow this  


Recommended Comments

There are no comments to display.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Advertisement

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

GameDev.net is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!