• 12/12/05 02:19 PM

    In Memory Data Compression and Decompression

    General and Gameplay Programming

    Part One - Compression

    This Sweet Snippet will show you how easy it can be to perform compression/decompression between data buffers in memory using the zlib library. We will go the easy route to get a simple example application going which will read in the contents of a file into memory, compress that data, and then write it back out to file. In the second part we will use the output from part one to decompress the data, then write it back out to disk so we can check the results.

    What you will need:
    - zlib 1.2.2

    Zlib provides two different functions for in-memory buffer-to-buffer compression so let's have a look at them. Located in zlib.h at line 876 you will find the following declaration:

    int ZEXPORT compress(Bytef *dest, uLongf *destLen, const Bytef *source,
                         uLong sourceLen);
    
    dest - Pointer to the destination buffer where the compressed data will be written to.
    destLen - Before the call, this value must be set to the total size in bytes of the destination buffer. After the function has returned, it holds the actual size in bytes of the compressed data.
    source - Pointer to the source buffer which contains the data to be compressed.
    sourceLen - Length of the source data in bytes.

    This function is pretty simple to use - pass it pointers to two memory buffers, one containing the source data and one empty buffer to receive the compressed data. But what if you'd like a little more control over exactly how the data is compressed? For that you would need to use the following function instead, which is at line 891 in zlib.h:

    int ZEXPORT compress2(Bytef *dest, uLongf *destLen, const Bytef *source,
                          uLong sourceLen, int level);
    

    The parameters are identical to compress except for the addition of a new one: level. The value of this parameter determines how hard zlib works to compress the data, allowing you to trade speed against compression ratio. Any integer from 0 to 9 is accepted, along with the following named constants:

    Z_NO_COMPRESSION - data is not compressed.
    Z_BEST_SPEED - sacrifices compression ratio for improved speed.
    Z_BEST_COMPRESSION - gain improved compression ratios but at a cost of execution speed.
    Z_DEFAULT_COMPRESSION - this is a compromise between compression ratios and speed of execution.

    Both of these functions return Z_OK on success. On failure they return an error code instead: Z_MEM_ERROR if there was not enough memory, Z_BUF_ERROR if the destination buffer was too small and, for compress2 only, Z_STREAM_ERROR if the level parameter is invalid.

    Now that you know what functions we can use, let's go through a simple example:

    //input and output files
    FILE *FileIn = fopen("FileIn.bmp", "rb");
    FILE *FileOut = fopen("FileOut.dat", "wb");
    
    //get the file size of the input file
    fseek(FileIn, 0, SEEK_END);
    unsigned long FileInSize = ftell(FileIn);
    
    //buffers for the raw and compressed data
    void *RawDataBuff = malloc(FileInSize);
    void *CompDataBuff = NULL;
    
    
    //zlib states that the destination buffer must be at least 0.1%
    //larger than the source buffer plus 12 bytes to cope with the
    //overhead of zlib data streams; allowing 10% here is a generous margin
    uLongf CompBuffSize = (uLongf)(FileInSize + (FileInSize * 0.1) + 12);
    CompDataBuff = malloc((size_t)(CompBuffSize));
    
    //read in the contents of the file into the source buffer
    fseek(FileIn, 0, SEEK_SET);
    fread(RawDataBuff, FileInSize, 1, FileIn);
    
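    //now compress the data; DestBuffSize must hold the capacity of the
    //destination buffer on entry and receives the compressed size on return
    uLongf DestBuffSize = CompBuffSize;
    compress2((Bytef*)CompDataBuff, &DestBuffSize,
              (const Bytef*)RawDataBuff, (uLong)FileInSize, Z_BEST_COMPRESSION);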
    
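    //write the compressed data to disk
    fwrite(CompDataBuff, DestBuffSize, 1, FileOut);

    //release the buffers and close both files
    free(RawDataBuff);
    free(CompDataBuff);
    fclose(FileIn);
    fclose(FileOut);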
    

    I've not included any error checking in the above code for reasons of clarity; this is something you would obviously want to include in your own applications.
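    As a rough idea of what that error checking might look like for the file-reading half of the example, here is a sketch of a helper that loads a whole file into memory and reports failure instead of carrying on with bad data. The name read_whole_file and its interface are hypothetical; it uses only the standard C library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads an entire file into a malloc'ed buffer.
   Returns NULL on any failure; on success the caller frees the buffer
   and *size_out holds the number of bytes read. */
unsigned char *read_whole_file(const char *path, unsigned long *size_out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    unsigned char *buf = NULL;
    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0 && (size = ftell(f)) >= 0 &&
        fseek(f, 0, SEEK_SET) == 0) {
        buf = malloc(size > 0 ? (size_t)size : 1);   /* avoid malloc(0) */
        if (buf != NULL && fread(buf, 1, (size_t)size, f) != (size_t)size) {
            free(buf);                               /* short read */
            buf = NULL;
        }
    }
    fclose(f);
    if (buf != NULL)
        *size_out = (unsigned long)size;
    return buf;
}
```

    The same pattern applies to the rest of the example: check the result of every malloc, compare the return value of compress2 against Z_OK, and check that fwrite wrote the expected number of items.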

    Part Two - Decompression

    Having compressed data is of no use to anyone without a way of decompressing it back to the original form. Fortunately zlib provides the following utility function to decompress a data buffer in memory:

    int ZEXPORT uncompress(Bytef *dest, uLongf *destLen, const Bytef *source,
                           uLong sourceLen);
    
    dest - Pointer to the destination buffer where the decompressed data will be written to.
    destLen - Before the call, this value must be set to the total size in bytes of the destination buffer. After the function has returned, it holds the actual size in bytes of the decompressed data.
    source - Pointer to the source buffer which contains the data to be decompressed.
    sourceLen - Length of the compressed data buffer in bytes.

    Unlike its compression counterpart, there is only a single version of the decompression function since there is not much customisation you can apply to decompression - you generally want the function to operate as fast as possible. The uncompress function returns Z_OK on success, and Z_MEM_ERROR or Z_BUF_ERROR for the same failures as its compression counterparts. In addition it returns Z_DATA_ERROR if the input data is corrupted or incomplete.

    Now let's move on to an example of how to use the above function to decompress the data from the file in part one before writing the original contents back out to disk:

    //the input file, this is the output file from part one
    FILE *FileIn = fopen("FileOut.dat", "rb");
    
    //output file
    FILE *FileOut = fopen("OrigFile.bmp", "wb");
    
    //get the file size of the input file
    fseek(FileIn, 0, SEEK_END);
    unsigned long FileInSize = ftell(FileIn);
    
    //buffers for the compressed and uncompressed data
    void *RawDataBuff = malloc(FileInSize);
    void *UnCompDataBuff = NULL;
    
    //read in the contents of the file into the source buffer
    fseek(FileIn, 0, SEEK_SET);
    fread(RawDataBuff, FileInSize, 1, FileIn);
    //allocate a buffer big enough to hold the uncompressed data, we can cheat here
    //because we know the file size of the original
    uLongf UnCompSize = 482000;
    UnCompDataBuff = malloc(UnCompSize);
    
    
    //all data we require is ready so decompress it into the destination buffer;
    //UnCompSize holds the buffer capacity on entry and the exact size on return
    uncompress((Bytef*)UnCompDataBuff, &UnCompSize, (const Bytef*)RawDataBuff, FileInSize);
    
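    //write the decompressed data to disk
    fwrite(UnCompDataBuff, UnCompSize, 1, FileOut);

    //release the buffers and close both files
    free(RawDataBuff);
    free(UnCompDataBuff);
    fclose(FileIn);
    fclose(FileOut);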
    

    Again, error checking has been removed for this example; we also use a fixed size for the uncompressed data since we know how big the original file is. Ideally you would want to store the size of the original uncompressed data along with the compressed data itself for use when decompressing it.

    That sums up compression between buffers in memory. The code for compression/decompression is ideally suited to being wrapped in utility functions that hide away the details of buffer allocation, return value checking and so on.


