image compression question

This topic is 4775 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

An application that I'm currently working on needs to load many small images, and memory is a large concern. What image formats can anyone recommend? I've been looking at wavelet compression, but from what I understand it isn't a completed standard (JPEG 2000). Are JPEG and GIF the way to go?

Guest Anonymous Poster
If memory is a concern then JPEG, GIF and the like won't help you much, since they save space on disk, not in memory. You still have to uncompress them before use. And don't even think about uncompressing them on the fly. The only viable option I can think of right now is the DXTC (DDS) format. It's used by graphics cards to save memory and can be uncompressed very fast.

Quote:
Original post by toxy
Is jpeg and gif the way to go?


No, it's the way not to go. JPEGs work badly for games and GIF is generally wonky. TGA and PNG are pretty OK, and DDS works if you want image compression in texture memory.
I don't think your memory problem is as much of a concern as it is on the PS2, where you have about 750 KB of texture memory.
Remember, if you are using recent hardware, you could use NPOT (non-power-of-two) textures and make them smaller to save on texture memory.

I've used a heightmap of earth in my planet engine. It's 43200x21600x16 Bit, so it's about 1GB uncompressed. I've implemented a wavelet compression scheme (more precisely, first a 5.3 wavelet transform, then EZT coding and finally Huffman coding). That's more or less JPEG2000. For Huffman coding, I use ZLib. The EZT is quite straightforward to implement and the wavelet transform I looked up in a textbook. The compressed heightmap is 20MB (50:1 compression ratio). You can select an arbitrary compression ratio at the cost of image quality. At 50:1, you don't see any compression artifacts. The scheme could also be used for images.

I've split the heightmap into chunks of size 64x64, which are all compressed separately. I load the whole compressed heightmap (the 20MB) into memory and then I decompress single chunks on-the-fly. Decompression of a single chunk takes only a few microseconds, which is fast enough for my purpose.
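
The chunk lookup described above can be sketched roughly as follows. This is only an illustration of the addressing arithmetic, assuming row-major chunk order and row-major samples inside each chunk; the names `chunk_pos_t`, `chunks_per_row` and `locate_sample` are made up for this example and are not from my engine.

```c
#include <stddef.h>

#define CHUNK_SIZE 64  /* chunk edge length in samples, as in the post */

/* Hypothetical helper type: where a sample lives in a chunked image. */
typedef struct {
    size_t chunk_index;  /* which chunk, row-major over the chunk grid */
    size_t offset;       /* sample offset inside the decompressed chunk */
} chunk_pos_t;

/* Number of chunks per row, rounded up so partial chunks are counted. */
static size_t chunks_per_row(size_t image_width)
{
    return (image_width + CHUNK_SIZE - 1) / CHUNK_SIZE;
}

/* Map an (x, y) sample coordinate to its chunk and in-chunk offset. */
static chunk_pos_t locate_sample(size_t x, size_t y, size_t image_width)
{
    chunk_pos_t pos;
    pos.chunk_index = (y / CHUNK_SIZE) * chunks_per_row(image_width)
                    + (x / CHUNK_SIZE);
    pos.offset = (y % CHUNK_SIZE) * CHUNK_SIZE + (x % CHUNK_SIZE);
    return pos;
}
```

Only the chunk at `chunk_index` has to be decompressed to read the sample, which is why the whole compressed file can stay in memory.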

Which format you finally choose depends on your requirements:
- If you have a HUGE image (>1GB) and decompression doesn't need to be too fast, you can use wavelet compression or JPEG.
- If you need to be very fast, but the data is not too big, use DXTC. DXT1 stores 4 bits per pixel, which is a 6:1 compression ratio for 24-bit RGB images.
- If your image is cartoon-like (lots of constant areas), you may use GIF or simply run-length-encoding (RLE). GIF is lossless compression and has very bad ratios for natural images. Moreover, it can only compress images with 256 colors.
- PNG is rather slow, and I don't know if it's lossy or lossless. There is a lossless version, but someone told me it can also do lossy compression.
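
To put numbers on the DXTC option in the list above, here is a small size calculation. It is a sketch that assumes the standard block sizes (8 bytes per 4x4 block for DXT1, 16 bytes for DXT3/DXT5); `raw_size` and `dxt_size` are hypothetical helper names.

```c
#include <stddef.h>

/* Bytes needed for an uncompressed image at the given bits per pixel. */
static size_t raw_size(size_t w, size_t h, size_t bits_per_pixel)
{
    return w * h * bits_per_pixel / 8;
}

/* Bytes needed for a DXT-compressed image. The image is stored as 4x4
   blocks, so partial blocks at the edges are rounded up.
   block_bytes: 8 for DXT1, 16 for DXT3/DXT5. */
static size_t dxt_size(size_t w, size_t h, size_t block_bytes)
{
    size_t bx = (w + 3) / 4;  /* blocks per row */
    size_t by = (h + 3) / 4;  /* block rows */
    return bx * by * block_bytes;
}
```

For a 256x256 image this gives 196608 bytes as 24-bit RGB versus 32768 bytes as DXT1, and 262144 bytes as 32-bit RGBA versus 65536 bytes as DXT5.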

Quote:

PNG is rather slow, and I don't know if it's lossy or lossless.

PNG is lossless.


A definitive recommendation depends on the exact requirements of your application.

Lots of small images can be compressed using JPEG; decompression is fast enough for realtime applications in that case. You can try different compression ratios offline (e.g. using Photoshop or Paint Shop Pro) to get an idea of the quality tradeoff. From my experience, compression factors of 1% to 10% are a reasonable compromise.

TGA is a bad choice - simple RLE won't reduce the image size much due to the non-linear pixel correlations. I'd suggest looking at vector quantisation. The compression is very time-consuming, but decompression is very, very fast.

So if you can afford a large amount of offline pre-processing and only need fast decompression, go with vector quantisation.
Here's an article about it.
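
To see why plain RLE does poorly on photo-like data, here is a minimal byte-wise run-length encoder. It is a simplified sketch using (count, value) pairs, not the exact TGA packet format, and `rle_encode` is a name made up for this example.

```c
#include <stddef.h>

/* Byte-wise RLE: writes (count, value) pairs with count in 1..255.
   Returns the number of bytes written. The caller must provide an
   output buffer of at least 2 * n bytes (the worst case). */
static size_t rle_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        unsigned char value = in[i];
        size_t run = 1;
        /* extend the run while the next byte repeats the value */
        while (i + run < n && in[i + run] == value && run < 255)
            ++run;
        out[o++] = (unsigned char)run;
        out[o++] = value;
        i += run;
    }
    return o;
}
```

Sixteen identical bytes shrink to 2 bytes, but four distinct bytes grow to 8, and neighbouring pixels in a photo are rarely identical.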

Good luck,
Pat.

Thank you everyone very much. I suppose I should be more specific about my requirements. First of all, I'm working on a Pocket PC, so that's the reason for the memory concern. However, it's mostly a space concern and not so much a speed concern, since I'm not really doing much animation. I'm going to be loading around 20 photo-style images (non-cartoon, but with the need for transparency) when the user presses a specific UI button, and those 20 images will be chosen on the fly from a "large" file of images. So I think I am really only concerned about storing as many images as I can in this "large" file.

So I guess that rules out jpeg since it can't handle transparency. I've checked out everyone's ideas and this is where I'm at:

DXTC - It seems that this would mainly benefit in speed - is that correct?

png - It seems that png would be overkill since I've read that the compression advantage over gif is not that much and I don't think I really need any of its other qualities.

Vector quantization - It sounds good, but I would only have a limited amount of time to implement it and I'm worried it might take quite a bit of time.


I'd appreciate it if anyone has anything to say - if not thanks for the help so far very much.

You can use alpha transparency with JPEGs!
Just split your image into an alpha channel and RGB color data.
Process the color data with jpeglib. For the alpha channel, either quantise it to 4 bpp (a stable 2:1 compression ratio) and process the result with Huffman and/or RLE coding, or compress it with DXTC (DXT5 alpha blocks give 2:1 on their own, or 4:1 if you also halve one dimension first) and post-process the result with zlib (fast and very good compression, too).
I'd go with JPEG + alpha. It really isn't that hard to do. Just separate alpha from the color data, and multiplex (combine) the channels again after uncompressing.
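
The 4 bpp quantisation of the alpha channel could look roughly like this. It is a sketch that simply keeps the top four bits of each alpha value and packs two pixels per byte; `pack_alpha4` and `unpack_alpha4` are hypothetical names, and the Huffman/RLE stage would run on the packed output.

```c
#include <stddef.h>

/* Quantise 8-bit alpha to 4 bits and pack two pixels per byte
   (first pixel in the high nibble). out must hold (n + 1) / 2 bytes. */
static void pack_alpha4(const unsigned char *alpha, size_t n, unsigned char *out)
{
    size_t i;
    for (i = 0; i < n; i += 2) {
        unsigned char hi = alpha[i] >> 4;
        unsigned char lo = (i + 1 < n) ? (unsigned char)(alpha[i + 1] >> 4) : 0;
        out[i / 2] = (unsigned char)((hi << 4) | lo);
    }
}

/* Expand 4-bit alpha back to 8 bits. Multiplying by 17 (0x11)
   replicates the nibble, so 0x0 maps to 0 and 0xF maps to 255. */
static void unpack_alpha4(const unsigned char *in, size_t n, unsigned char *alpha)
{
    size_t i;
    for (i = 0; i < n; ++i) {
        unsigned char nib = (i & 1) ? (unsigned char)(in[i / 2] & 0x0F)
                                    : (unsigned char)(in[i / 2] >> 4);
        alpha[i] = (unsigned char)(nib * 17);
    }
}
```

Fully transparent and fully opaque pixels survive the round trip exactly; values in between land on one of 16 levels.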


Thanks for the information. However, do you know of any resources that I can use to help with separating the alpha and RGB data? I'm new to this kind of thing, so more details would be ideal.

Thanks,
toxie

It's very simple.



#include <stdlib.h> /* malloc */

// pixel structure. change if colors are stored RGBA or ARGB
// note: bit-field layout is compiler-specific; on common little-endian
// compilers this order corresponds to 0xAARRGGBB
typedef struct _pixel_t {
    unsigned int blue : 8;
    unsigned int green : 8;
    unsigned int red : 8;
    unsigned int alpha : 8;
} pixel_t;

// split into rgb and alpha channel. the caller must free() *rgb and *alpha
void DeMultiplexImage(unsigned int *rgba, const int pixels, unsigned char **rgb, unsigned char **alpha)
{
    int i = 0, j = 0;
    pixel_t *img = (pixel_t*)rgba;

    *rgb = (unsigned char*)malloc(pixels * 3);
    *alpha = (unsigned char*)malloc(pixels);
    for (; i < pixels; ++i, j += 3) {
        // rgb and alpha are pointers to pointers, so dereference first
        (*rgb)[j + 0] = img[i].red;
        (*rgb)[j + 1] = img[i].green;
        (*rgb)[j + 2] = img[i].blue;
        (*alpha)[i] = img[i].alpha;
    }
}

// combine rgb and alpha channel. the caller must free() *rgba
void MultiplexImage(unsigned char *rgb, unsigned char *alpha, const int pixels, unsigned int **rgba)
{
    int i = 0, j = 0;
    pixel_t *img;

    *rgba = (unsigned int*)malloc(pixels * 4);
    img = (pixel_t*)(*rgba);

    for (; i < pixels; ++i, j += 3) {
        img[i].red = rgb[j + 0];
        img[i].green = rgb[j + 1];
        img[i].blue = rgb[j + 2];
        img[i].alpha = alpha[i];
    }
}



You can pass the rgb pointer to jpeglib and the alpha pointer to your S3TC compressor.
You see, RGBA values are interleaved per pixel in most formats. So a pixel is a vector with three (color only) or four (color plus alpha) components of 8 bits each (other sizes are possible, e.g. 5 bits for red, 6 for green and 5 for blue = 16 bits total, and so on). You can separate these components either with bit operations (masking and shifting) or by casting to a suitable struct (like I did in the code sample).
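
For the masking-and-shifting variant, a minimal sketch might look like this. It assumes pixels packed as 0xAARRGGBB for the 32-bit case and the usual RGB565 layout for the 16-bit case; `split_argb`, `join_argb` and `split_rgb565` are names invented for this example.

```c
#include <stdint.h>

/* Split a 0xAARRGGBB pixel into its color and alpha components. */
static void split_argb(uint32_t pixel, unsigned char rgb[3], unsigned char *alpha)
{
    rgb[0] = (unsigned char)((pixel >> 16) & 0xFF);  /* red   */
    rgb[1] = (unsigned char)((pixel >> 8) & 0xFF);   /* green */
    rgb[2] = (unsigned char)(pixel & 0xFF);          /* blue  */
    *alpha = (unsigned char)((pixel >> 24) & 0xFF);
}

/* Recombine color and alpha into a 0xAARRGGBB pixel. */
static uint32_t join_argb(const unsigned char rgb[3], unsigned char alpha)
{
    return ((uint32_t)alpha << 24) | ((uint32_t)rgb[0] << 16)
         | ((uint32_t)rgb[1] << 8) | (uint32_t)rgb[2];
}

/* Split a 16-bit RGB565 pixel: 5 bits red, 6 bits green, 5 bits blue. */
static void split_rgb565(uint16_t pixel, unsigned *r, unsigned *g, unsigned *b)
{
    *r = (pixel >> 11) & 0x1F;
    *g = (pixel >> 5) & 0x3F;
    *b = pixel & 0x1F;
}
```

Unlike the bit-field struct, this does not depend on how the compiler lays out bit-fields, only on the packing order of the pixel value itself.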

If you need an algorithm for S3TC, let me know. I have written some and they work very well. I can also provide you with a patched jpeglib that supports JPEG+Alpha in various modes (S3TC and JPEG grayscale compression for alpha channel).

Thank you again, and yes I would definitely like the s3tc compressor and a patched jpeglib. You could email it to me at toxic_crap@yahoo.com

To confirm my understanding:

- I will extract an image's pixel components and store the rgb data as a jpeg and compress the alpha data(with your s3tc compressor possibly).

- When the image is to be loaded, I will then multiplex the data back together.


What I'm still unsure of is say I'm going to use a .png image as the image format to extract data from and as the format to multiplex back to. Will I just need to find out the .png format and write an algorithm to extract the data and then reformat the data in that format? Do you think that it will be an acceptable amount of processing time to multiplex about 20 small images on a 400-600 mhz processor?

Thank you so much

Quote:
Original post by toxy
What I'm still unsure of is say I'm going to use a .png image as the image format to extract data from and as the format to multiplex back to. Will I just need to find out the .png format and write an algorithm to extract the data and then reformat the data in that format?

Basically, yes. But you wouldn't want to parse PNGs yourself. Use an image library such as libpng to load them.
Quote:

Do you think that it will be an acceptable amount of processing time to multiplex about 20 small images on a 400-600 mhz processor?

I initially designed the 32-bit JPEG stuff for realtime processing, and it performs at about 6000 to 8000 frames per second (image sizes 128x64 to 256x256) on a 1 GHz machine. I assume it to be reasonably fast on a 400 MHz machine (provided an FPU is present). I also scaled the alpha channel down by 50% along its larger dimension and re-scaled it on decompression using linear interpolation. This reduces the size of the alpha channel further.

Here's the addon to jpeglib that provides memory to memory compression:


// jpeg.c
#include <stdio.h>
#include "jpeg/jpeglib.h"

/*
This a custom destination manager for jpeglib that
enables the use of memory to memory compression.

See IJG documentation for details.
*/

typedef struct {
struct jpeg_destination_mgr pub; /* base class */
JOCTET* buffer; /* buffer start address */
int bufsize; /* size of buffer */
size_t datasize; /* final size of compressed data */
int* outsize; /* user pointer to datasize */
int errcount; /* counts up write errors due to
buffer overruns */

} memory_destination_mgr;

typedef memory_destination_mgr* mem_dest_ptr;

/* ------------------------------------------------------------- */
/* MEMORY DESTINATION INTERFACE METHODS */
/* ------------------------------------------------------------- */


/* This function is called by the library before any data gets written */
METHODDEF(void)
init_destination (j_compress_ptr cinfo)
{
mem_dest_ptr dest = (mem_dest_ptr)cinfo->dest;

dest->pub.next_output_byte = dest->buffer; /* set destination buffer */
dest->pub.free_in_buffer = dest->bufsize; /* free space in output buffer */
dest->datasize = 0; /* reset output size */
dest->errcount = 0; /* reset error count */
}

/* This function is called by the library if the buffer fills up

I just reset destination pointer and buffer size here.
Note that this behavior, while preventing seg faults
will lead to invalid output streams as data is over-
written.
*/

METHODDEF(boolean)
empty_output_buffer (j_compress_ptr cinfo)
{
mem_dest_ptr dest = (mem_dest_ptr)cinfo->dest;
dest->pub.next_output_byte = dest->buffer;
dest->pub.free_in_buffer = dest->bufsize;
++dest->errcount; /* need to increase error count */

return TRUE;
}

/* Usually the library wants to flush output here.

I will calculate output buffer size here.
Note that results become incorrect, once
empty_output_buffer was called.
This situation is notified by errcount.
*/

METHODDEF(void)
term_destination (j_compress_ptr cinfo)
{
mem_dest_ptr dest = (mem_dest_ptr)cinfo->dest;
dest->datasize = dest->bufsize - dest->pub.free_in_buffer;
if (dest->outsize) *dest->outsize += (int)dest->datasize;
}

/* Override the default destination manager initialization
provided by jpeglib. Since we want to use memory-to-memory
compression, we need to use our own destination manager.
*/

GLOBAL(void)
jpeg_memory_dest (j_compress_ptr cinfo, JOCTET* buffer, int bufsize, int* outsize)
{
mem_dest_ptr dest;

/* first call for this instance - need to setup */
if (cinfo->dest == 0) {
cinfo->dest = (struct jpeg_destination_mgr *)
(*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_PERMANENT,
sizeof (memory_destination_mgr));
}

dest = (mem_dest_ptr) cinfo->dest;
dest->bufsize = bufsize;
dest->buffer = buffer;
dest->outsize = outsize;
/* set method callbacks */
dest->pub.init_destination = init_destination;
dest->pub.empty_output_buffer = empty_output_buffer;
dest->pub.term_destination = term_destination;
}

/* ------------------------------------------------------------- */
/* MEMORY SOURCE INTERFACE METHODS */
/* ------------------------------------------------------------- */

/* Called before data is read */
METHODDEF(void)
init_source (j_decompress_ptr dinfo)
{
/* nothing to do here, really. I mean. I'm not lazy or something, but...
we're actually through here. */

}

/* Called if the decoder wants some bytes that we cannot provide... */
METHODDEF(boolean)
fill_input_buffer (j_decompress_ptr dinfo)
{
/* we can't do anything about this. This might happen if the provided
buffer is either invalid with regards to its content or just a to
small bufsize has been given. */


/* fail. */
return FALSE;
}

/* From IJG docs: "it's not clear that being smart is worth much trouble"
So I save myself some trouble by ignoring this bit.
*/

METHODDEF(void)
skip_input_data (j_decompress_ptr dinfo, INT32 num_bytes)
{
/* There might be more data to skip than available in buffer.
This clearly is an error, so screw this mess. */

if ((size_t)num_bytes > dinfo->src->bytes_in_buffer) {
dinfo->src->next_input_byte = 0; /* no buffer byte */
dinfo->src->bytes_in_buffer = 0; /* no input left */
} else {
dinfo->src->next_input_byte += num_bytes;
dinfo->src->bytes_in_buffer -= num_bytes;
}
}

/* Finished with decompression */
METHODDEF(void)
term_source (j_decompress_ptr dinfo)
{
/* Again. Absolute laziness. Nothing to do here. Boring. */
}

GLOBAL(void)
jpeg_memory_src (j_decompress_ptr dinfo, unsigned char* buffer, size_t size)
{
struct jpeg_source_mgr* src;

/* first call for this instance - need to setup */
if (dinfo->src == 0) {
dinfo->src = (struct jpeg_source_mgr *)
(*dinfo->mem->alloc_small) ((j_common_ptr) dinfo, JPOOL_PERMANENT,
sizeof (struct jpeg_source_mgr));
}

src = dinfo->src;
src->next_input_byte = buffer;
src->bytes_in_buffer = size;
src->init_source = init_source;
src->fill_input_buffer = fill_input_buffer;
src->skip_input_data = skip_input_data;
src->term_source = term_source;
/* IJG recommend to use their function - as I don't know shit
about how to do better, I follow this recommendation */

src->resync_to_restart = jpeg_resync_to_restart;
}



Here's the S3TC DXT5 compression for alpha values:



// s3tc.cpp

#include <windows.h> // <-- to get rid of this, replace ZeroMemory macros with memset()

#include <algorithm>
#include <cstdlib>
#include <cstring>

///////////////////////////////////////////////////////////////////////
// S3TC IMPLEMENTATION (DXT5-algorithm) //
///////////////////////////////////////////////////////////////////////

namespace S3TC {

/// DXT interpolated alpha block
struct DXTAlphaBlock {
unsigned char Alpha0; //!< first alpha value
unsigned char Alpha1; //!< second alpha value
unsigned char Code[6]; //!< alpha value palette index
};

static const int SIX_ENTRY_BLOCK = 0; //!< index of 6-entry block
static const int EIGHT_ENTRY_BLOCK = 1; //!< index of 8-entry block

static DXTAlphaBlock sAlphas[2]; //!< alpha code blocks
static unsigned char sTempBlock[16]; //!< temporary storage

/*! Pick alpha values at extremes
*
* \param pBlock block containing alpha pixel information
* \param nFirst output value for lowest alpha
* \param nSecond output value for highest alpha
**/

static void ChooseAlphaValues (const unsigned char* pBlock, int& nFirst, int& nSecond) {
    // need initialization with invalid values to ensure definite assignment
    int nLowest = 256, nHighest = -1;
    for (int i = 0; i < 16; ++i) {
        int nAlpha = pBlock[i];
        // test both bounds independently: with "else if" the first pixel
        // (and any uniform or descending block) never updates nHighest
        if (nAlpha < nLowest) {
            nLowest = nAlpha;
        }
        if (nAlpha > nHighest) {
            nHighest = nAlpha;
        }
    }
    nFirst = nLowest;
    nSecond = nHighest;
}

/*! Generate alpha table
*
* \param pAlphas output array that receives interpolated values
* \param nFirst first explicit alpha entry to interpolate from
* \param nSecond second explicit alpha entry to interpolate from
**/

static void GenerateAlphaTable(int* pAlphas, const int nFirst, const int nSecond) {

// first entries are always explicit values
pAlphas[0] = nFirst;
pAlphas[1] = nSecond;

if (nFirst <= nSecond) {
// 6 entries in table - see MSDN
pAlphas[2] = (4*nFirst + 1*nSecond + 2) / 5;
pAlphas[3] = (3*nFirst + 2*nSecond + 2) / 5;
pAlphas[4] = (2*nFirst + 3*nSecond + 2) / 5;
pAlphas[5] = (1*nFirst + 4*nSecond + 2) / 5;
pAlphas[6] = 0;
pAlphas[7] = 255;
} else {
// 8 entries in table - see MSDN
pAlphas[2] = (6*nFirst + 1*nSecond + 3) / 7;
pAlphas[3] = (5*nFirst + 2*nSecond + 3) / 7;
pAlphas[4] = (4*nFirst + 3*nSecond + 3) / 7;
pAlphas[5] = (3*nFirst + 4*nSecond + 3) / 7;
pAlphas[6] = (2*nFirst + 5*nSecond + 3) / 7;
pAlphas[7] = (1*nFirst + 6*nSecond + 3) / 7;
}
}

/*! Assign codes to alpha block
*
* \param AlphaBlock alpha block that receives codes for interpolated alpha entries
* \param pCodes array containing 3 bit alpha codes for each pixel
**/

static void AssignAlphaCodes (DXTAlphaBlock& AlphaBlock, const int* pCodes) {

// process 3 bytes
AlphaBlock.Code[0] = pCodes[0] | (pCodes[1] << 3) | ((pCodes[2] & 0x3) << 6);
AlphaBlock.Code[1] = (pCodes[2] >> 2) | (pCodes[3] << 1) | (pCodes[4] << 4) | ((pCodes[5] & 0x1) << 7);
AlphaBlock.Code[2] = (pCodes[5] >> 1) | (pCodes[6] << 2) | (pCodes[7] << 5);

// process 3 bytes
AlphaBlock.Code[3] = pCodes[8] | (pCodes[9] << 3) | ((pCodes[10] & 0x3) << 6);
AlphaBlock.Code[4] = (pCodes[10] >> 2) | (pCodes[11] << 1) | (pCodes[12] << 4) | ((pCodes[13] & 0x1) << 7);
AlphaBlock.Code[5] = (pCodes[13] >> 1) | (pCodes[14] << 2) | (pCodes[15] << 5);
}

/*! Retrieve codes from alpha block
*
* \param AlphaBlock alpha block containing codes for interpolated alpha entries
* \param pCodes array receiving 3 bit alpha codes for each pixel
**/

static void RetrieveAlphaCodes (const DXTAlphaBlock& AlphaBlock, int* pCodes) {

// process 3 bytes
int nCode = AlphaBlock.Code[0];

pCodes[0] = nCode & 7;
pCodes[1] = (nCode >> 3) & 7;
pCodes[2] = (nCode >> 6);
nCode = AlphaBlock.Code[1];
pCodes[2] |= (nCode & 1) << 2;
pCodes[3] = (nCode >> 1) & 7;
pCodes[4] = (nCode >> 4) & 7;
pCodes[5] = (nCode >> 7);
nCode = AlphaBlock.Code[2];
pCodes[5] |= (nCode & 3) << 1;
pCodes[6] = (nCode >> 2) & 7;
pCodes[7] = (nCode >> 5);

// process 3 bytes
nCode = AlphaBlock.Code[3];

pCodes[8] = nCode & 7;
pCodes[9] = (nCode >> 3) & 7;
pCodes[10]= (nCode >> 6);
nCode = AlphaBlock.Code[4];
pCodes[10] |= (nCode & 1) << 2;
pCodes[11] = (nCode >> 1) & 7;
pCodes[12] = (nCode >> 4) & 7;
pCodes[13] = (nCode >> 7);
nCode = AlphaBlock.Code[5];
pCodes[13] |= (nCode & 3) << 1;
pCodes[14] = (nCode >> 2) & 7;
pCodes[15] = (nCode >> 5);
}


/*! Generate bit mask for alpha block
*
* \param AlphaBlock alpha code block containing explicit alpha values, codes
* for each pixel will be generated
* \param pBlock block containing per-pixel alpha information
* \param pTemp output for remapped alpha values - optional
**/

static void GenerateAlphaBlock
(
DXTAlphaBlock& AlphaBlock,
const unsigned char* pBlock,
unsigned char* pTemp = NULL
)
{

int Alphas[8]; // generated alpha table
int Codes[16]; // index into alpha table for each texel in block

GenerateAlphaTable(Alphas, AlphaBlock.Alpha0, AlphaBlock.Alpha1);

for (int i = 0; i < 16; ++i) {
// find codes by applying closest match search on generated alpha table
int nBest = 256;
int nMatch;
int nAlpha = pBlock[i];
int d;

// macro to make unrolled loop more readable
#define UPDATE_MATCH(index) d = std::abs(nAlpha - Alphas[(index)]); if (d < nBest) { nBest = d; nMatch= (index); }

// unrolled loop
UPDATE_MATCH(0); UPDATE_MATCH(1);
UPDATE_MATCH(2); UPDATE_MATCH(3);
UPDATE_MATCH(4); UPDATE_MATCH(5);
UPDATE_MATCH(6); UPDATE_MATCH(7);

#undef UPDATE_MATCH

Codes[i] = nMatch;
}

if (pTemp != NULL) {
// assign resulting alpha block if requested
for (int i = 0; i < 16; ++i) {
pTemp[i] = Alphas[Codes[i]];
}
}

AssignAlphaCodes (AlphaBlock, Codes);
}

/// calc. RMS for two blocks of alpha values
static int GetAlphaRMS (const unsigned char* pBlock, const unsigned char* pTemp) {

int nResult = 0;
for (int i = 0; i < 16; ++i) {
// parse through block and sum up squared error
int d = pBlock[i] - pTemp[i];
nResult += d*d;
}

// "real" RMS requires normalizing... we skip that part since we work with
// integers here and the loss of 4 bits of precision defeats the whole
// purpose of this function at times (e.g. RMS below 16)
// nResult /= 16;

return nResult;
}

static DXTAlphaBlock* CreateAlphaBlock(const unsigned char* pBlock) {

// "perfect result" version: select most accurate block format

int a0, a1;

// select explicit alphas
ChooseAlphaValues(pBlock, a0, a1);

// prepare 6 entry block
sAlphas[SIX_ENTRY_BLOCK].Alpha0 = a0;
sAlphas[SIX_ENTRY_BLOCK].Alpha1 = a1;

// prepare 8 entry block
sAlphas[EIGHT_ENTRY_BLOCK].Alpha0 = a1;
sAlphas[EIGHT_ENTRY_BLOCK].Alpha1 = a0;

// temporary output of resulting alpha block is only necessary for
// blocks with more than 1 distinct alpha values - see notes below
GenerateAlphaBlock(sAlphas[SIX_ENTRY_BLOCK], pBlock, (a0 != a1) ? sTempBlock : NULL);

// if no errors are calculated force usage of first generated 6 entry block
// by setting errors to equal initial values
int nErr0 = 0, nErr1 = 0;

if (a0 != a1) {
// deciding is only necessary for blocks containing more than 1 alpha value
// \note at this point we might have only two distinct alpha values to cope
// with, while generating RMS only seems necessary for more than 2 values.
// regarding the memory/computational effort required to count the distinct
// values, it seems to be a reasonable compromise to only generate the
// second block/calc. RMS, if at least two values are present...

// get RMS for 6 entry block
nErr0 = GetAlphaRMS(pBlock, sTempBlock);
// generate alpha block using 8 interpolated values
GenerateAlphaBlock(sAlphas[EIGHT_ENTRY_BLOCK], pBlock, sTempBlock);
// find RMS for this block
nErr1 = GetAlphaRMS(pBlock, sTempBlock);
}

// select block version with lowest RMS
int nBest = (nErr0 <= nErr1) ? SIX_ENTRY_BLOCK : EIGHT_ENTRY_BLOCK;

// return best match
return &sAlphas[nBest];
}

/// Decode DXT5 alpha block
static void DecodeAlphaBlock (const DXTAlphaBlock& rAlpha, unsigned char* pBlock) {

int Codes[16];
int Alphas[8];

GenerateAlphaTable(Alphas, rAlpha.Alpha0, rAlpha.Alpha1);
RetrieveAlphaCodes(rAlpha, Codes);

for (int i = 0; i < 16; ++i) {
// assign values from codes
int j = Codes[i];
pBlock[i] = Alphas[j];
}
}


/// Copy a 4x4 block from the input
static void CopyBlock(const unsigned char* Values, const int Width, unsigned char* Block) {

for (int i = 0, n = 0, y = 0; i < 4; ++i, y += Width) {
for (int j = 0; j < 4; ++j) {
Block[n++] = Values[y + j];
}
}

}

/// Copy a nxn block from the input
static void CopyBlockEx(const unsigned char* Values, const int Width, const int SizeX, const int SizeY, unsigned char* Block) {

for (int i = 0, n = 0, y = 0; i < SizeY; ++i, y += Width, n += 4) {
for (int j = 0; j < SizeX; ++j) {
Block[n + j] = Values[y + j];
}
}

}

/// Copy a 4x4 block to the output image
static void SaveBlock(const unsigned char* Block, const int Width, unsigned char* Values) {

for (int i = 0, n = 0, y = 0; i < 4; ++i, y += Width) {
for (int j = 0; j < 4; ++j) {
Values[y + j] = Block[n++];
}
}

}

/// Copy a SizeX x SizeY block to the output image
static void SaveBlockEx(const unsigned char* Block, const int Width, const int SizeX, const int SizeY, unsigned char* Values) {

for (int i = 0, n = 0, y = 0; i < SizeY; ++i, y += Width, n += 4) {
for (int j = 0; j < SizeX; ++j) {
Values[y + j] = Block[n + j];
}
}

}
}

// simple box filter
static unsigned char* ScaleDownY(const unsigned char* Values, const int Width, const int Height) {

unsigned char* result = static_cast<unsigned char*> ( malloc (Width*Height) );

const unsigned char* Top = Values;
const unsigned char* Bottom = Values + Width;

int iy = 2 * Width;

for (int y = 0, i = 0; y < Height; ++y) {

for (int x = 0; x < Width; ++x, ++i) {

// slightly optimized mem-ops

int n = Top[x];
n += Bottom[x];

// effectively a shift by 1 bit
n /= 2;

result[i] = n;
}

Top += iy;
Bottom += iy;
}

return result;
}

// simple box filter
static unsigned char* ScaleDownX(const unsigned char* Values, const int Width, const int Height) {

unsigned char* result = static_cast<unsigned char*> ( malloc (Width*Height) );

int iy = 2 * Width;

for (int y = 0, i = 0; y < Height; ++y) {

for (int x = 0, ix = 0; x < Width; ++x, ix += 2, ++i) {

// slightly optimized mem-ops

int n = Values[ix];
n += Values[ix+1];

// effectively a shift by 1 bit
n /= 2;

result[i] = n;
}

Values += iy;
}

return result;
}

// using bi-linear filtering...
static unsigned char* ScaleUpY(const unsigned char* Values, const int Width, const int Height, unsigned char* Result) {

unsigned char* TopDest = Result;
unsigned char* BottomDest = Result + Width;

int iy = 2 * Width;

for (int y = 0, i = 0; y < Height; ++y) {

for (int x = 0; x < Width; ++x, ++i) {

int p = Values[i];

// check neighbours
int t = (y > 0) ? Values[i-Width] : p;
int b = (y < Height-1) ? Values[i+Width] : p;

// get gradients
int d1 = abs(p - b);
int d2 = abs(p - t);

// perform interpolation based on linear gradients
TopDest[x] = (d1 < d2) ? (2*p + t + b + 2) / 4 : (2*p + t + 1) / 3;
BottomDest[x]= (d1 > d2) ? (2*p + t + b + 2) / 4 : (2*p + b + 1) / 3;
}

TopDest += iy;
BottomDest += iy;
}


return Result;
}

// using bi-linear filtering...
static unsigned char* ScaleUpX(const unsigned char* Values, const int Width, const int Height, unsigned char* Result) {

for (int y = 0, i = 0, j = 0; y < Height; ++y) {

for (int x = 0; x < Width; ++x, ++i, j += 2) {

int p = Values[i];

// check neighbours
int l = (x > 0) ? Values[i-1] : p;
int r = (x < Width-1) ? Values[i+1] : p;

// get gradients
int d1 = abs(p - l);
int d2 = abs(p - r);

// perform interpolation based on linear gradients
Result[j] = (d1 > d2) ? (2*p + l + r + 2) / 4 : (2*p + l + 1) / 3;
Result[j+1] = (d1 < d2) ? (2*p + l + r + 2) / 4 : (2*p + r + 1) / 3;
}

}


return Result;
}

const int DXT5PackAlphaValues (const unsigned char* Values, const int Width, const int Height, unsigned char* Output) {

using namespace S3TC;

*reinterpret_cast<int*>(Output) = MAKEFOURCC('S','3','T','C');

unsigned char* Scaled = NULL;

int w = Width/*/2*/;
int h = Height/*/2*/;

// select which direction to scale down
if (w < h) {
reinterpret_cast<int*>(Output)[1] = 0;
h /= 2;
Scaled = ScaleDownY(Values, w, h);
} else {
reinterpret_cast<int*>(Output)[1] = 1;
w /= 2;
Scaled = ScaleDownX(Values, w, h);
}

int bx = w / 4; // horizontal blocks
int by = h / 4; // vertical blocks
int rx = w & 3; // horizontal remainder
int ry = h & 3; // vertical remainder
int iy = w * 4; // pointer increment for vertical block line increase

unsigned char Block[16];// work block

DXTAlphaBlock* AlphaOut = reinterpret_cast<DXTAlphaBlock*> ( Output + 8 );

int oy = 0;
for (int y = 0; y < by; ++y, oy += iy) {
// parse through block lines

int ox = oy;
for (int x = 0; x < bx; ++x, ox += 4) {
// parse through blocks in current line

CopyBlock(Scaled + ox, w, Block);

DXTAlphaBlock* alpha = CreateAlphaBlock(Block);

*AlphaOut++ = *alpha;
}

if (rx > 0) {
// create odd block
ZeroMemory(Block, sizeof(Block));

CopyBlockEx(Scaled + ox, w, rx, 4, Block);
DXTAlphaBlock* alpha = CreateAlphaBlock(Block);
*AlphaOut++ = *alpha;
}
}

if (ry > 0) {
// process odd blocks at bottom
ZeroMemory(Block, sizeof(Block));

int ox = oy;
for (int x = 0; x < bx; ++x, ox += 4) {
// parse through blocks in current line

CopyBlockEx(Scaled + ox, w, 4, ry, Block);

DXTAlphaBlock* alpha = CreateAlphaBlock(Block);

*AlphaOut++ = *alpha;
}

if (rx > 0) {
// create odd block
ZeroMemory(Block, sizeof(Block));

CopyBlockEx(Scaled + ox, w, rx, ry, Block);
DXTAlphaBlock* alpha = CreateAlphaBlock(Block);
*AlphaOut++ = *alpha;
}
}

free(Scaled);

return static_cast<int> (reinterpret_cast<unsigned char*>(AlphaOut) - Output);
}

const int DXT5UnpackAlphaValues (const unsigned char* Values, const int Width, const int Height, unsigned char* Output) {

using namespace S3TC;

int w = Width/*/2*/;
int h = Height/*/2*/;

if (reinterpret_cast<const int*>(Values)[1] == 0) {
h /= 2;
} else {
w /= 2;
}

unsigned char* Scaled = static_cast<unsigned char*> ( malloc (w*h) );

int bx = w / 4; // horizontal blocks
int by = h / 4; // vertical blocks
int rx = w & 3; // horizontal remainder
int ry = h & 3; // vertical remainder
int iy = w * 4; // pointer increment for vertical block line increase

unsigned char Block[16];// work block

const DXTAlphaBlock* AlphaIn = reinterpret_cast<const DXTAlphaBlock*> ( Values + 8 );

int n = 0;
int oy = 0;
for (int y = 0; y < by; ++y, oy += iy) {
// parse through block lines

int ox = oy;
for (int x = 0; x < bx; ++x, ox += 4) {
// parse through blocks in current line

DecodeAlphaBlock(AlphaIn[n++], Block);

SaveBlock(Block, w, Scaled + ox);
}

if (rx > 0) {
// create odd block
ZeroMemory(Block, sizeof(Block));

DecodeAlphaBlock(AlphaIn[n++], Block);

SaveBlockEx(Block, w, rx, 4, Scaled + ox);
}
}

if (ry > 0) {
// process odd blocks at bottom
ZeroMemory(Block, sizeof(Block));

int ox = oy;
for (int x = 0; x < bx; ++x, ox += 4) {
// parse through blocks in current line

DecodeAlphaBlock(AlphaIn[n++], Block);

SaveBlockEx(Block, w, 4, ry, Scaled + ox);
}

if (rx > 0) {
// create odd block
ZeroMemory(Block, sizeof(Block));

DecodeAlphaBlock(AlphaIn[n++], Block);

SaveBlockEx(Block, w, rx, ry, Scaled + ox);
}
}

if (reinterpret_cast<const int*>(Values)[1] == 0) {
ScaleUpY(Scaled, w, h, Output);
} else {
ScaleUpX(Scaled, w, h, Output);
}

free(Scaled);

return Width * Height;
}



Note that in the above code, scaling is optional and can be commented out if it hurts performance.
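
One thing the code does not spell out is how large the Output buffer for DXT5PackAlphaValues has to be. As far as I can read it from the code above, it is an 8 byte header (the FOURCC plus the scale direction flag) followed by one 8 byte alpha block per 4x4 tile of the half-scaled image, with partial tiles rounded up. A small sketch of that calculation, using a hypothetical helper name `dxt5_alpha_packed_size`:

```c
#include <stddef.h>

/* Bytes written by the packer for a Width x Height alpha channel,
   derived from the packing code: header + 8 bytes per 4x4 block
   of the image after one dimension has been halved. */
static size_t dxt5_alpha_packed_size(int width, int height)
{
    int w = width, h = height;
    size_t bx, by;

    /* same choice as the packer: halve the larger dimension */
    if (w < h) h /= 2; else w /= 2;

    bx = (size_t)(w + 3) / 4;  /* blocks per row, partial block included */
    by = (size_t)(h + 3) / 4;  /* block rows, partial row included */
    return 8 + bx * by * 8;
}
```

For a 256x256 alpha channel this comes to 16392 bytes, which is where the roughly 4:1 ratio comes from.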

If you need some more code, let me know.
Pat.

I'm having problems.

I used your functions to compress the image in memory, and the compression side works without any problems.

But when I decompress, I always get error 201 (improper call). I looked at the source; I just copied and pasted from example.c in the IJG distribution and swapped in jpeg_memory_src and the destination manager.

Can you give a small sample?

-> EDIT <-

I found the main problem: the "writer" only writes the startup bits (the file header); the rest is not in the buffer!

Thanks in advance.

[Edited by - nunoxyz on November 12, 2004 4:39:04 AM]

I got it fixed; it writes and reads OK now. But I have another problem: my .jpg at a compression setting of 90 is about 100 KB (that is HUGE!). Can anyone help me out?

Thanks in advance.
