implicit

LZ algorithm improvements


I'm currently writing a classic Lempel-Ziv packer for an embedded application, one where decompression speed and simplicity, as well as compression ratio, are far more important than compression time. Basically it's your standard sliding-window dictionary coder, with a variable-length encoding of match/literal sequence lengths and offsets, and I'm looking to squeeze some bytes out of it without complicating the decompressor. And while I wouldn't say that compression time is entirely irrelevant, I'd still be perfectly happy with packing a 64k file being an overnight batch job on a modern computer.

Anyway, the few implementations I've seen and read about so far have either been greedy algorithms, which just keep matching as long a sequence as possible with an escape route for literals, or fast hashing schemes where only the last match or two are even considered. This is suboptimal, since it's entirely possible that applying a shorter match right now might open up longer (or nearer) matches ahead; similarly, deciding whether or not it's worth interrupting a literal sequence also needs some lookahead.

I must admit that I really haven't read up on the subject, I'm just hoping that someone around here can suggest some reasonable methods which aren't too hard to implement, so I don't have to dig through the scary-looking research papers which turned up on Google.. ;)
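For reference, the greedy parse I'm talking about has roughly this shape (just a sketch, not my actual packer; emit_match() and emit_literals() stand in for the real bit-packed output):

/* Sketch of the usual greedy LZ77 parse.  The emit_* functions are
 * placeholders for the real token writer. */
#include <stddef.h>

#define WINDOW_SIZE 4096   /* example window size */
#define MIN_MATCH   2      /* shortest back-reference considered */

void emit_match(size_t offset, size_t length);              /* placeholder */
void emit_literals(const unsigned char *p, size_t count);   /* placeholder */

/* Brute-force search for the longest match starting before 'pos'. */
static size_t longest_match(const unsigned char *d, size_t pos, size_t len,
                            size_t *best_off)
{
    size_t best = 0;
    size_t start = pos > WINDOW_SIZE ? pos - WINDOW_SIZE : 0;
    for (size_t cand = start; cand < pos; cand++) {
        size_t n = 0;
        while (pos + n < len && d[cand + n] == d[pos + n])
            n++;
        if (n > best) { best = n; *best_off = pos - cand; }
    }
    return best;
}

static void greedy_parse(const unsigned char *d, size_t len)
{
    size_t pos = 0, lit_start = 0;
    while (pos < len) {
        size_t off = 0;
        size_t n = longest_match(d, pos, len, &off);
        if (n >= MIN_MATCH) {
            if (pos > lit_start)                 /* flush any pending literals */
                emit_literals(d + lit_start, pos - lit_start);
            emit_match(off, n);                  /* always take the longest match */
            pos += n;
            lit_start = pos;
        } else {
            pos++;                               /* no usable match: grow the literal run */
        }
    }
    if (pos > lit_start)
        emit_literals(d + lit_start, pos - lit_start);
}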

Which one are you writing? LZ77, LZ78, LZW, LZMA, LZRW1KH ...?
I've written LZW with bit-phased packing which reduces the size of the codewords in the file by up to 1 bit each time over normal LZW. You can also add two strings to the dictionary upon each match instead of just 1, which increases compression a little, or so I've read.

Are you wanting to modify what you've got or are you happy to try something else premade that fits your criteria?

Quote:
Original post by iMalc
Which one are you writing? LZ77, LZ78, LZW, LZMA, LZRW1KH ...?
I've written LZW with bit-phased packing which reduces the size of the codewords in the file by up to 1 bit each time over normal LZW. You can also add two strings to the dictionary upon each match instead of just 1, which increases compression a little, or so I've read.
I don't really think I can afford the RAM to store a separate dictionary, so I'm going for an LZ77 sliding-window scheme instead. No fancy statistical analysis or prediction, just plain old Lempel-Ziv.
That is, it emits a sequence of references to earlier data in the window (consisting of a length and an offset), and runs of literal bytes (with only the length). Currently I'm differentiating between the two classes with a bit flag and writing the lengths and offsets using Elias gamma coding. However, I intend to experiment with other similar bit-packing schemes, and I'd certainly be happy to hear any ideas for alternatives.
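For the record, the gamma coder itself is tiny; something along these lines (a sketch, with put_bit() standing in for my real bit writer, and the MSB-first order being incidental):

/* Sketch of an Elias gamma writer.  put_bit() is a placeholder for the
 * actual bit-level output routine; bit order here is just an example. */
void put_bit(int bit);   /* placeholder */

/* Encode n >= 1: floor(log2 n) zero bits, then n in binary from its top
 * set bit downwards.  1 -> "1", 2 -> "010", 3 -> "011", 4 -> "00100", ... */
static void put_gamma(unsigned n)
{
    unsigned top = 1;
    while ((top << 1) != 0 && (top << 1) <= n)   /* find the highest set bit */
        top <<= 1;
    for (unsigned b = top; b > 1; b >>= 1)       /* unary length prefix */
        put_bit(0);
    for (unsigned b = top; b != 0; b >>= 1)      /* the value itself, MSB first */
        put_bit((n & b) ? 1 : 0);
}

The total cost works out to 2*floor(log2 n) + 1 bits per value.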

The encoding details aren't that important though; the point is that deciding just which references to use for best compression is non-trivial for any sliding-window algorithm. So I'm looking for some reasonable heuristic that's better than the old greedy algorithm, which just scans through the file from start to finish and selects the longest possible match at each point. At least I'm assuming that there must be something better out there, but I can't think of anything intelligent myself besides brute-force and randomization schemes.

I'd be happy with a library too for that matter but I've got to take some special considerations into account, like not allowing matches spanning the window "edge", for performance reasons. The decoder has to run on a quirky 8-bit platform, fit within two or three hundred bytes, and decode at least one byte per hundred cycles from my dataset, so there's little chance of just plugging in a finished C implementation and hoping it'll work.

Elias gamma coding was an interesting read, thanks. I had toyed with the idea before of using a unary number to dictate the length of arbitrarily long binary data, and now I have a name for it.
In your case its effectiveness depends on how big your sliding window is. You don't actually need to allow for completely arbitrary length and offset values, just offsets that can go as far back as your sliding window.

If you have maybe 8K of RAM to spare for a match table then LZRW1KH could work well for you. It's very fast.

Quote:
Currently I'm differentiating between the two classes with a bit flag and writing the lengths and offsets using Elias gamma coding. However, I intend to experiment with other similar bit-packing schemes, and I'd certainly be happy to hear any ideas for alternatives.

Exp-Golomb codes might afford a slight improvement because they allow a minimum length that need not be included in the unary prefix.
The premise of these and similar universal codes is that the input distribution has such a long tail that essentially unbounded values need to be representable. In the few cases where this has come up, I've seen better results from a fixed-length quantized length field (indexing into a LUT) followed by a variable number of payload bits.
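Concretely, the quantized-field idea looks something like this (only a sketch; the class table is an arbitrary example to be tuned against the data, and put_bits() is a placeholder):

/* Sketch of a quantized length field: a fixed 2-bit class index selects a
 * base value and an extra-bit count from a small table, followed by the
 * payload bits.  Table values are an example only. */
void put_bits(unsigned value, int count);   /* placeholder bit writer */

struct len_class { unsigned base; int extra_bits; };

static const struct len_class classes[4] = {
    {  2, 0 },   /* class 0: length 2 exactly */
    {  3, 1 },   /* class 1: lengths 3..4     */
    {  5, 3 },   /* class 2: lengths 5..12    */
    { 13, 8 },   /* class 3: lengths 13..268  */
};

static void put_length(unsigned len)   /* caller guarantees len >= 2 */
{
    int c;
    for (c = 3; c > 0 && len < classes[c].base; c--)
        ;                                        /* smallest class that fits */
    put_bits((unsigned)c, 2);                    /* fixed-width class index  */
    put_bits(len - classes[c].base, classes[c].extra_bits);   /* payload     */
}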

Quote:
Original post by iMalc
In your case its effectiveness depends on how big your sliding window is. You don't actually need to allow for completely arbitrary length and offset values, just offsets that can go as far back as your sliding window.
Yes, you can easily limit it by simply not looking for more bits than the window size requires. The length isn't a problem (e.g. losing a bit on a 256-byte match or literal sequence won't hurt you), but similar LZ encoders with other offset coders seem to perform better. Possibly something with two or three prefix bits indicating the overall length of the offset, followed by the actual offset itself (I'm going to have to experiment here..).
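Something along these lines is what I have in mind (purely a sketch; the width table is a guess I'd tune per file, and put_bits() is a placeholder):

/* Sketch of a prefix-selected offset code: two prefix bits choose one of
 * four fixed offset widths.  The widths are example values only; the
 * caller guarantees 1 <= off <= 4096 (the window size here). */
void put_bits(unsigned value, int count);   /* placeholder bit writer */

static const int offset_width[4] = { 4, 7, 10, 12 };

static void put_offset(unsigned off)
{
    unsigned sel = 0;
    while (off > (1u << offset_width[sel]))   /* smallest width that fits */
        sel++;
    put_bits(sel, 2);                         /* 2-bit width selector */
    put_bits(off - 1, offset_width[sel]);     /* the offset, biased by one */
}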

Quote:
If you have maybe 8K of RAM to spare for a match table then LZRW1KH could work well for you. It's very fast.
I've currently budgeted for a 3k sliding window, and even that's a major concession on a 64k machine. The big advantage of a sliding window here is that it allows me to mostly waste the cycles on decompression when the game is idle, then consume the buffer during intense scenes, something which wouldn't have been possible with a fixed dictionary.

To be fair, I'm already using a great packer for the non-streaming data; however, it's fairly slow, so I was hoping to use a simplified encoding and compensate by spending more effort on finding better matches during compression. I currently have some trivial code to decide whether to allow a backwards reference, depending on how many bits it would consume and whether it will interrupt a literal sequence, and by tweaking the scoring factors I can gain a few bytes, but there's got to be something better I can do.

Quote:
Original post by Jan Wassenberg
Quote:
Currently I'm differentiating between the two classes with a bit flag and writing the lengths and offsets using Elias gamma coding. However, I intend to experiment with other similar bit-packing schemes, and I'd certainly be happy to hear any ideas for alternatives.
In the few cases where this has come up, I've seen better results from a fixed-length quantized length field (indexing into a LUT) followed by a variable number of payload bits.
You know, I was just thinking of trying that very thing: perhaps include the length table in the compressed file and experiment with lots of alternatives during compression to see what best fits the data. I've just got to debug my current implementation first.. ;)
Any other ideas?

Okay.. I've got a basic version (apparently) working now. The compression ratio is terrible, but at least it's a baseline for what I can expect in terms of performance and size.

I've also picked up some useful ideas from another implementation. For instance, one literal run can't follow another, so there's no need for an indicator bit after a literal run. Also, as far as I can tell, single-byte matches aren't worth it, and the savings in run lengths and joined literals from disabling them are surprisingly large.
As expected, using a small prefix to determine the length of the offset saves a bit of space, especially if you tweak the length table to fit the file. Another neat observation is that long offsets simply aren't worth using for short (e.g. two-byte) references, so using different offset length tables depending on the match length saves even more space.
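Put together, the decoder loop ends up with roughly this control flow (only a sketch: the flag polarity, the length biases and the helper routines are all stand-ins for whatever the real format settles on):

/* Sketch of the decoder control flow implied by the rules above: after a
 * literal run the next token must be a match, so no flag bit is read
 * there.  All helpers are placeholders; flag polarity and the +1 length
 * bias are assumptions. */
int get_bit(void);                          /* placeholder bit reader     */
unsigned get_gamma(void);                   /* placeholder gamma decoder  */
unsigned get_offset(unsigned match_len);    /* offset table picked by match length */
void copy_literals(unsigned count);         /* copy raw bytes from the input */
void copy_match(unsigned offset, unsigned count);  /* copy from the window */

static void decode_stream(unsigned output_size)
{
    unsigned produced = 0;
    int expect_flag = 1;                    /* assume the first token has a flag bit */

    while (produced < output_size) {
        int literal_run = expect_flag ? get_bit() : 0;   /* assumed: 1 = literal run */

        if (literal_run) {
            unsigned n = get_gamma();       /* run length, at least 1 */
            copy_literals(n);
            produced += n;
            expect_flag = 0;                /* a match must follow a literal run */
        } else {
            unsigned n = get_gamma() + 1;   /* match length, at least 2 (no 1-byte matches) */
            copy_match(get_offset(n), n);
            produced += n;
            expect_flag = 1;                /* the next token carries a flag again */
        }
    }
}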

Furthermore, Charles Bloom has an interesting page with implementations of various encoder improvements, such as working out every possible match within the file, sorting the results by length and offset, and then applying them in that order (with some fudging to avoid splitting literal runs).

Just a thought, but before actually deciding to use a match, are you checking that it wouldn't actually be shorter NOT to use the match? E.g. a length-two match against data from 3000 bytes ago takes 23 + 3 = 26 bits, whereas simply copying those two bytes costs only 16 bits.
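(For pure gamma coding the arithmetic is simply the following, ignoring the flag bit and the literal run-length overhead on both sides:)

/* Bit cost of Elias gamma coding a value n >= 1: 2*floor(log2 n) + 1.
 * E.g. gamma_bits(3000) == 23 and gamma_bits(2) == 3, hence the 26 bits
 * above, versus 16 bits for two raw literal bytes. */
static int gamma_bits(unsigned n)
{
    int log2n = 0;
    while (n >>= 1)
        log2n++;
    return 2 * log2n + 1;
}

static int match_cost_bits(unsigned offset, unsigned length)
{
    return gamma_bits(offset) + gamma_bits(length);
}

static int literal_cost_bits(unsigned length)
{
    return 8 * (int)length;   /* raw bytes only */
}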

Quote:
Original post by iMalc
Just a thought, but before actually deciding to use a match, are you checking that it wouldn't actually be shorter NOT to use the match? E.g. a length-two match against data from 3000 bytes ago takes 23 + 3 = 26 bits, whereas simply copying those two bytes costs only 16 bits.
Sort of. I use a separate offset table for two-byte matches, meaning that any possible match is a net win. There's also a slight fudge-factor which increases the cost if we were already in a literal run, but that check is less than perfect.

Further searching reveals that two alternatives to the classic greedy method are called Lazy Matching and "Optimal" Parsing. It remains to be seen whether they'll help much, however.
I'm also looking into adding some search optimizations (hashing and whatnot), since if I can get the compression time down to something more reasonable it becomes feasible to simply test every possible offset length table combinatorially.
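One-step lazy matching at least looks simple enough; roughly this (a sketch, with longest_match() and the emit_* helpers being the same kind of placeholders as before):

/* Sketch of one-step lazy matching: before committing to a match, peek at
 * the next position and prefer a literal byte if the match there is longer.
 * The helpers are placeholders. */
#include <stddef.h>

#define MIN_MATCH 2

size_t longest_match(const unsigned char *d, size_t pos, size_t len, size_t *off);
void emit_match(size_t offset, size_t length);
void emit_literals(const unsigned char *p, size_t count);

static void lazy_parse(const unsigned char *d, size_t len)
{
    size_t pos = 0, lit_start = 0;

    while (pos < len) {
        size_t off = 0, next_off = 0;
        size_t n = longest_match(d, pos, len, &off);

        if (n >= MIN_MATCH && pos + 1 < len) {
            size_t n_next = longest_match(d, pos + 1, len, &next_off);
            if (n_next > n) {       /* a better match starts one byte later:   */
                pos++;              /* keep this byte in the literal run       */
                continue;           /* and re-evaluate from the next position  */
            }
        }

        if (n >= MIN_MATCH) {
            if (pos > lit_start)
                emit_literals(d + lit_start, pos - lit_start);
            emit_match(off, n);
            pos += n;
            lit_start = pos;
        } else {
            pos++;
        }
    }
    if (pos > lit_start)
        emit_literals(d + lit_start, pos - lit_start);
}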
