What's the next step after LZ77?

6 comments, last by alvaro 9 years, 9 months ago

The first compression algorithm I learned to use was RLE. Then I learned LZ77, which I'm currently using in my program. However, it's rather slow and doesn't compress all that well. Here's a comparison on an actual file that my program would need to compress:

Original filesize: 15.2MB

My LZ77 algorithm:
Filesize: 4.5MB (29.49% compression)
Time: 9.24 seconds

WinRAR:
Filesize: 905KB (5.79% compression)

Time: an instant

I could definitely use an improvement like that. What are my options?

What do your files typically contain?

LZMA is an improved LZ77 algorithm; you could try using it. I think on average it compresses better than WinRAR.

It's in the public domain, so you don't need a license (I'm not an expert, though), and you could use the SDK:

http://www.7-zip.org/sdk.html

If you're looking for speed, I recommend LZ4 (http://en.wikipedia.org/wiki/LZ4_(compression_algorithm)). If you're looking for high compression, LZMA is pretty much the best freely available algorithm (as stated above). If you want a good trade-off between the two, zlib is still pretty competitive.
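To get a feel for that trade-off on your own data, here is a minimal sketch using only the `zlib` and `lzma` modules from the Python standard library. LZ4 is not in the standard library, so it's left out. The `compare` helper and the sample data are made up for illustration; swap in the bytes of a real file.

```python
import lzma
import time
import zlib


def compare(data: bytes) -> dict:
    """Return {codec name: (compressed size in bytes, seconds taken)}."""
    codecs = {
        "zlib level 6": lambda b: zlib.compress(b, 6),
        "lzma preset 6": lambda b: lzma.compress(b, preset=6),
    }
    results = {}
    for name, fn in codecs.items():
        start = time.perf_counter()
        packed = fn(data)
        results[name] = (len(packed), time.perf_counter() - start)
    return results


# Hypothetical stand-in for a real file: repetitive command-like text.
sample = b"".join(
    b"spawn entity %d at %d,%d\n" % (i, i % 64, i % 48) for i in range(20000)
)

for name, (size, seconds) in compare(sample).items():
    print(f"{name}: {size} bytes ({size / len(sample):.2%} of original), {seconds:.3f} s")
```

Timings from a single run like this are noisy, so treat them as a rough guide only.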

If WinRAR is faster than your LZ77 implementation, then your LZ77 implementation is very, very slow.

As was just said, take a look at LZ4. It's used by Frostbite on PS4 and Xbox One:

https://code.google.com/p/lz4/

https://twitter.com/JonOlick/status/466082222420664321

"LZ4 is a very fast lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-cores CPU. It also features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems."

Original filesize: 15.2MB
Filesize: 4.5MB (29.49% compression)
Filesize: 905KB (5.79% compression)


Your math completely fails to add up.
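A quick check of the quoted figures, assuming the percentage is meant as compressed size divided by original size and that 1 MB = 1024 KB (both are assumptions; the post doesn't say). The rounded sizes give values close to, but not exactly, the posted ones, so those were presumably computed from exact byte counts.

```python
KB_PER_MB = 1024  # assumption: binary megabytes

original_mb = 15.2
lz77_mb = 4.5
winrar_kb = 905

# Compressed size as a fraction of the original size.
lz77_ratio = lz77_mb / original_mb
winrar_ratio = winrar_kb / (original_mb * KB_PER_MB)

print(f"LZ77:   {lz77_ratio:.2%} of original, {1 - lz77_ratio:.2%} saved")
print(f"WinRAR: {winrar_ratio:.2%} of original, {1 - winrar_ratio:.2%} saved")

# Throughput implied by the posted LZ77 time.
lz77_seconds = 9.24
print(f"LZ77 speed: {original_mb / lz77_seconds:.2f} MB/s")
```

Read this way, the posted percentages are size ratios (smaller is better), not the amount saved.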

Also, you should profile and optimize your LZ77 implementation. 9 seconds to compress 15 MB is pretty horrific... assuming you're not running on a 386 :-P
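As a rough illustration of what profiling might show, here is a sketch that runs Python's `cProfile` over a deliberately naive LZ77 tokenizer. The tokenizer is hypothetical and is not your implementation; it just brute-forces the sliding window, which is a common reason for this kind of slowness.

```python
import cProfile
import io
import pstats


def longest_match(data: bytes, pos: int, window: int = 4096, max_len: int = 18):
    """Brute-force scan of the sliding window for the longest match at pos."""
    best_len, best_dist = 0, 0
    for start in range(max(0, pos - window), pos):
        length = 0
        while (length < max_len and pos + length < len(data)
               and data[start + length] == data[pos + length]):
            length += 1
        if length > best_len:
            best_len, best_dist = length, pos - start
    return best_len, best_dist


def naive_lz77(data: bytes):
    """Hypothetical tokenizer: emits (distance, length) pairs or literal bytes."""
    tokens, pos = [], 0
    while pos < len(data):
        length, dist = longest_match(data, pos)
        if length >= 3:
            tokens.append((dist, length))
            pos += length
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


# Made-up input that resembles repetitive command records.
sample = b"move 10 20\nmove 10 21\nspawn 3\n" * 200

profiler = cProfile.Profile()
profiler.enable()
tokens = naive_lz77(sample)
profiler.disable()

# Collect the five most expensive entries by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In a sketch like this, nearly all the time lands in the window scan. Real implementations usually avoid that scan with a hash table or hash chains keyed on the next few bytes.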


Welp, excuse my bad timings. Right after I posted this thread, I went back to my algorithm (which I wrote 5 months ago) and rewrote it after figuring out a way to make it faster. It now compresses at 26.55% in only 1.37 seconds. Much better, but I still want more.

LZMA is an improved LZ77 algorithm; you could try using it. I think on average it compresses better than WinRAR.

It's in the public domain, so you don't need a license (I'm not an expert, though), and you could use the SDK:

http://www.7-zip.org/sdk.html

I'll definitely take a look at that! Thanks!

What do your files typically contain?

Commands to re-create a simulated environment. I plan on optimizing the data itself for size eventually, but the program can potentially run forever (generating an unbounded number of records), so compression still has to be as good as possible.

Compression only works because there is structure in the input. A general compression scheme has to exploit things like the presence of repetition, which is common in many formats. If you look carefully at what the input actually is, you might be able to come up with a much better compression scheme.
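To make that concrete, here is a small sketch of one format-specific trick: delta-encoding a field that always increases before handing it to a general compressor. The timestamp data is invented for illustration, since we haven't seen the real files, and `zlib` merely stands in for whatever compressor ends up being used.

```python
import struct
import zlib


def delta_encode(values):
    """Replace each value with its difference from the previous one."""
    previous = 0
    deltas = []
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    total = 0
    values = []
    for delta in deltas:
        total += delta
        values.append(total)
    return values


def pack_u32(values):
    """Serialize non-negative integers as little-endian 32-bit words."""
    return struct.pack(f"<{len(values)}I", *values)


# Hypothetical data: timestamps of simulation commands, always increasing.
timestamps = [1000 + 16 * i + (i % 3) for i in range(10000)]

packed_raw = zlib.compress(pack_u32(timestamps))
packed_delta = zlib.compress(pack_u32(delta_encode(timestamps)))

print(f"raw:   {len(packed_raw)} bytes")
print(f"delta: {len(packed_delta)} bytes")
```

The raw values never repeat, so the compressor finds little to match. The deltas fall into a short repeating pattern, which is exactly the kind of structure a general scheme can exploit.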

So I'll try to ask a different way: Can you show us some examples of what your files contain?
