In Memory Data Compression and Decompression

By Lee Millward | Published Dec 12 2005 08:19 AM in Game Programming




Part One - Compression

This Sweet Snippet will show you how easy it is to compress and decompress data between buffers in memory using the zlib library. We will take the easy route and build a simple example
application which reads the contents of a file into memory, compresses that data, and then writes it back out to a file. In the second part we will take the output from part one, decompress
it, and write it back out to disk so we can check the results.


What you will need:
- zlib 1.2.2


Zlib provides two different functions for in-memory buffer-to-buffer compression so let's have a look at them. Located in zlib.h at line 876 you will find the following declaration:



int ZEXPORT compress(Bytef *dest, uLongf *destLen, const Bytef *source,
                     uLong sourceLen);


dest - Pointer to the destination buffer where the compressed data will be written to.
destLen - On entry, this must hold the total size in bytes of the destination buffer. After the function has returned, it holds the size in bytes of the compressed data.
source - Pointer to the source buffer which contains the data to be compressed.
sourceLen - Length of the source data in bytes.

This function is pretty simple to use - pass it pointers to two memory buffers, one containing the source data and one empty buffer for the compressed data. But what if you'd like a little more
control over exactly how the data is compressed? For that you would need to use the following function instead, which is at line 891 in zlib.h:



int ZEXPORT compress2(Bytef *dest, uLongf *destLen, const Bytef *source,
                      uLong sourceLen, int level);


The parameters are identical to compress except for the addition of a new one: level. The value of this parameter determines how hard zlib works to compress the data - allowing you to
trade speed against compression ratio. It takes either an integer from 0 to 9 or one of the following named constants:


Z_NO_COMPRESSION (0) - data is not compressed.
Z_BEST_SPEED (1) - sacrifices compression ratio for improved speed.
Z_BEST_COMPRESSION (9) - gain improved compression ratios but at a cost of execution speed.
Z_DEFAULT_COMPRESSION (-1) - this is a compromise between compression ratios and speed of execution, currently equivalent to level 6.

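Both of these functions return Z_OK on success. Otherwise they return an error code explaining why the call failed: Z_MEM_ERROR if there was not enough memory, Z_BUF_ERROR if the destination buffer was too small, or, for compress2 only, Z_STREAM_ERROR if the level parameter is invalid.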


Now that you know what functions we can use, let's go through a simple example:



//requires <stdio.h>, <stdlib.h> and <zlib.h>

//input and output files
FILE *FileIn = fopen("FileIn.bmp", "rb");
FILE *FileOut = fopen("FileOut.dat", "wb");

//get the file size of the input file
fseek(FileIn, 0, SEEK_END);
unsigned long FileInSize = ftell(FileIn);

//buffers for the raw and compressed data
void *RawDataBuff = malloc(FileInSize);
void *CompDataBuff = NULL;

//zlib states that the destination buffer must be at least 0.1%
//larger than the source buffer plus 12 bytes to cope with the
//overhead of zlib data streams; the 10% used here is a generous margin
uLongf CompBuffSize = (uLongf)(FileInSize + (FileInSize * 0.1) + 12);
CompDataBuff = malloc((size_t)(CompBuffSize));

//read in the contents of the file into the source buffer
fseek(FileIn, 0, SEEK_SET);
fread(RawDataBuff, FileInSize, 1, FileIn);

//now compress the data; DestBuffSize must hold the capacity of the
//destination buffer going in and receives the compressed size coming out
uLongf DestBuffSize = CompBuffSize;
compress2((Bytef*)CompDataBuff, &DestBuffSize,
          (const Bytef*)RawDataBuff, (uLong)FileInSize, Z_BEST_COMPRESSION);

//write the compressed data to disk
fwrite(CompDataBuff, DestBuffSize, 1, FileOut);

//release the files and buffers
fclose(FileIn);
fclose(FileOut);
free(RawDataBuff);
free(CompDataBuff);


I've not included any error checking in the above code for reasons of clarity; this is something you would obviously want to include in your own applications.


Part Two - Decompression

Having compressed data is of no use to anyone without a way of decompressing it back to the original form. Fortunately zlib provides the following utility function to decompress a data buffer in
memory:



int ZEXPORT uncompress(Bytef *dest, uLongf *destLen, const Bytef *source, uLong sourceLen);


dest - Pointer to the destination buffer where the decompressed data will be written to.
destLen - On entry, this must hold the total size in bytes of the destination buffer. After the function has returned, it holds the size in bytes of the decompressed data.
source - Pointer to the source buffer which contains the data to be decompressed.
sourceLen - Length of the compressed data buffer in bytes.

Unlike its compression counterpart, there is only a single version of the decompression function since there is not much customisation you can apply to decompression - you generally want the
function to operate as fast as possible. The uncompress function returns the same values as its compression counterparts for success and failure, plus Z_DATA_ERROR if the input data is corrupted or incomplete.


Now let's move on to an example of how to use the above function to decompress the data from the file in part one before writing the original contents back out to disk:



//requires <stdio.h>, <stdlib.h> and <zlib.h>

//the input file, this is the output file from part one
FILE *FileIn = fopen("FileOut.dat", "rb");

//output file
FILE *FileOut = fopen("OrigFile.bmp", "wb");

//get the file size of the input file
fseek(FileIn, 0, SEEK_END);
unsigned long FileInSize = ftell(FileIn);

//buffers for the compressed and uncompressed data
void *RawDataBuff = malloc(FileInSize);
void *UnCompDataBuff = NULL;

//read in the contents of the file into the source buffer
fseek(FileIn, 0, SEEK_SET);
fread(RawDataBuff, FileInSize, 1, FileIn);

//allocate a buffer big enough to hold the uncompressed data, we can cheat here
//because we know the file size of the original
uLongf UnCompSize = 482000;
UnCompDataBuff = malloc(UnCompSize);

//all the data we require is ready so decompress it into the destination buffer;
//UnCompSize holds the buffer capacity going in and the exact decompressed
//size coming out
uncompress((Bytef*)UnCompDataBuff, &UnCompSize, (const Bytef*)RawDataBuff, (uLong)FileInSize);

//write the decompressed data to disk
fwrite(UnCompDataBuff, UnCompSize, 1, FileOut);

//release the files and buffers
fclose(FileIn);
fclose(FileOut);
free(RawDataBuff);
free(UnCompDataBuff);


Again, error checking has been removed for this example. We also hard-code the size of the uncompressed data since we know how big the original file is; if the buffer were too small, uncompress would fail with Z_BUF_ERROR. Ideally you would want to store the size
of the original uncompressed data along with the actual data itself for use when decompressing it.


That sums up compression between buffers in memory. The compression and decompression code is ideally suited to being wrapped in utility functions which hide away the details of buffer
allocation, checking return values, etc.







