Converting data to binary

Started by
9 comments, last by h0-0t 21 years, 3 months ago
Hearing about them is enough. Googling for "LZ77+tutorial" or "LZ77+algorithm" is the next step...

FWIW I don't think RLE on the binary data is going to get you much compression. I think even the simplest dictionary-type scheme would beat it (...maybe...)


For example if you knew you only ever required 7-bit ASCII (i.e. Europe didn't exist and you only ever need to compress text) then you have 128 unused single-byte codes (values 128 to 255) available for storing interesting stuff.

For example the word "THE" comes up quite a lot; it's 3 characters, and often has a space to the left too, so up to 4 characters. What if you assigned byte value 178 (which 7-bit ASCII never uses) to mean "THE", and simply replaced all the "THE"s with that single byte?
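
As a rough sketch of that single-word substitution (Python here just for brevity; the function names `encode_the` and `decode_the` are made up for illustration, and the input is assumed to be pure 7-bit ASCII so that byte 178 can never appear in it naturally):

```python
# Byte value used in the post above; any value in 128..255 would work
# as long as the input really is 7-bit ASCII.
THE_CODE = 178


def encode_the(text: str) -> bytes:
    """Replace every occurrence of "THE" with the single byte 178."""
    raw = text.encode("ascii")  # raises UnicodeEncodeError if not 7-bit ASCII
    return raw.replace(b"THE", bytes([THE_CODE]))


def decode_the(data: bytes) -> str:
    """Undo encode_the by expanding byte 178 back to "THE"."""
    return data.replace(bytes([THE_CODE]), b"THE").decode("ascii")


if __name__ == "__main__":
    original = "THE CAT AND THE HAT"
    packed = encode_the(original)
    print(len(original), "->", len(packed), "bytes")
    assert decode_the(packed) == original
```

Note this also replaces "THE" inside longer words such as "OTHER", which is harmless because the decoder expands it back the same way.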

To find the words which will get the best compression, make a list of all the **UNIQUE** words which appear in the text to be compressed and for each one, a count of how many times it appears. Then just sort the list by count. If you have a tiebreak situation choose the longest word first. The sorted result is then your "dictionary" which you assign the unused ASCII codes to.


--
Simon O'Connor
Creative Asylum Ltd
www.creative-asylum.com

Simon O'Connor | Technical Director (Newcastle) Lockwood Publishing | LinkedIn | Personal site
