Byte manipulation question. I *should* know this...


So, I'm trying to optimize texture access. Currently, I'm doing four memory accesses per texel to get the red, green, blue and alpha components. It then occurred to me that I'm fetching four bytes out of what is basically an unsigned long and then recombining them. Seems wasteful. I still need to extract the components for alpha blending and such, but I thought I could do it faster. Anyway, the current (working) code looks like this:
// "texture" is of type unsigned char...
offset = ((tv * tex_width) + tu) * 4;
tr = texture[offset];
tg = texture[offset + 1];
tb = texture[offset + 2];
ta = texture[offset + 3];
.
.
.
//"buffer" is of type unsigned long...
*buffer = (ta << 24) | (tr << 16) | (tg << 8) | tb;

The version I tried instead looks like this:


// "texture" is of type unsigned long...
offset = (tv * tex_width) + tu;
t_val = texture[offset];
tr = (unsigned char)(t_val & 0x000000FF);
tg = (unsigned char)(t_val & 0x0000FF00);
tb = (unsigned char)(t_val & 0x00FF0000);
ta = (unsigned char)(t_val & 0xFF000000);
.
.
.
//"buffer" is of type unsigned long.
*buffer = (ta << 24) | (tr << 16) | (tg << 8) | tb;


Oddly, this doesn't work. I've played around with the byte masks, but the texture is always rendered in one of the component colors and never a combination. I posted this here, instead of Graphics Programming and Theory, because this is more about byte manipulation than graphics. The byte masks may be accessing the wrong components, but that wouldn't cause the texture to be rendered in one component color. This makes me think that I'm not accessing the data the way I want. Byte and bit manipulation has always been a weakness for me (not sure why...). Any ideas? [Edited by - maspeir on March 8, 2010 11:34:15 PM]

Quote:
 Original post by maspeir
 ta = (unsigned char)(t & 0xFF000000);

You're tossing out all but the top eight bits with that bitwise and... then you're tossing out all but the bottom eight bits with the cast. That equals zero. You need to shift down before you cast.
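
To make the point above concrete, here is a minimal sketch of the two orderings. The function names are made up for illustration, and `uint32_t` stands in for the thread's `unsigned long`, which is not 32 bits wide on every platform.

```c
#include <stdint.h>

/* Mask only, then cast: the cast keeps bits 0..7, which the mask has
   already cleared, so this returns 0 for every input. */
static unsigned char alpha_mask_only(uint32_t t)
{
    return (unsigned char)(t & 0xFF000000u);
}

/* Mask, shift the surviving bits down to positions 0..7, then cast. */
static unsigned char alpha_mask_and_shift(uint32_t t)
{
    return (unsigned char)((t & 0xFF000000u) >> 24);
}
```

For a texel value of 0x80402010, the first function gives 0 and the second gives 0x80.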

You need to shift the values before assigning.

Instead of
tr = (unsigned char)(t & 0x000000FF);
tg = (unsigned char)(t & 0x0000FF00);
tb = (unsigned char)(t & 0x00FF0000);
ta = (unsigned char)(t & 0xFF000000);

you need to do
tr = (unsigned char)(t & 0x000000FF);
tg = (unsigned char)((t & 0x0000FF00)>>8);
tb = (unsigned char)((t & 0x00FF0000)>>16);
ta = (unsigned char)((t & 0xFF000000)>>24);

Thanks! I knew I had to be doing something wrong.

Huh. It works, but I'm seeing a 50% or more DROP in frame rate. Odd... I figured array access would be slower.

Quote:
 Original post by maspeir
 Thanks! I knew I had to be doing something wrong.
 Huh. It works, but I'm seeing a 50% or more DROP in frame rate. Odd... I figured array access would be slower.

Array access is only slow when it forces a cache miss. For what you're doing, you're getting one cache miss and then everything else is on the same cache line. So loads on green, blue, and alpha should be extremely fast.

Quote:
 Original post by Drakonite
 You need to shift the values before assigning.
 Instead of
 tr = (unsigned char)(t & 0x000000FF);
 tg = (unsigned char)(t & 0x0000FF00);
 tb = (unsigned char)(t & 0x00FF0000);
 ta = (unsigned char)(t & 0xFF000000);
 you need to do
 tr = (unsigned char)(t & 0x000000FF);
 tg = (unsigned char)((t & 0x0000FF00)>>8);
 tb = (unsigned char)((t & 0x00FF0000)>>16);
 ta = (unsigned char)((t & 0xFF000000)>>24);

Except that you're better off doing the shifts first to reduce the size of the masking constants.
tr = (unsigned char)((t) & 0xFF);
tg = (unsigned char)((t>>8) & 0xFF);
tb = (unsigned char)((t>>16) & 0xFF);
ta = (unsigned char)((t>>24) & 0xFF);
And in this case you'll then notice that anding with the masks is redundant anyway thanks to the casts. Thus:
tr = (unsigned char)t;
tg = (unsigned char)(t>>8);
tb = (unsigned char)(t>>16);
ta = (unsigned char)(t>>24);
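
Putting the shift-then-cast form together with the repacking step from the original post gives a round trip like the sketch below. The `Channels` type and both function names are hypothetical, `uint32_t` replaces the thread's `unsigned long`, and red is taken from the low byte only because that is what the masks in the thread imply.

```c
#include <stdint.h>

/* Hypothetical helper type holding the four 8-bit channels. */
typedef struct { uint8_t r, g, b, a; } Channels;

/* Split a 32-bit texel; the narrowing casts drop the upper bits,
   so no explicit masks are needed. */
static Channels unpack_texel(uint32_t t)
{
    Channels c;
    c.r = (uint8_t)t;
    c.g = (uint8_t)(t >> 8);
    c.b = (uint8_t)(t >> 16);
    c.a = (uint8_t)(t >> 24);
    return c;
}

/* Rebuild a pixel in the A,R,G,B order the original post writes to
   the buffer. Each channel is widened before shifting. */
static uint32_t pack_pixel(Channels c)
{
    return ((uint32_t)c.a << 24) | ((uint32_t)c.r << 16) |
           ((uint32_t)c.g << 8)  |  (uint32_t)c.b;
}
```

The widening casts in `pack_pixel` matter when the channels are stored in 8-bit types: without them the operands are promoted to `int`, and shifting a value of 128 or more left by 24 overflows a signed 32-bit `int`.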

The next thing then is to discover that several operations such as alpha blending can be done without fully separating the 32-bit colours down into their individual channels. I.e. you can blend the red and the green at the same time etc - the double blend trick.
Or if you use MMX/SSE etc. then it gets much quicker still.
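
As a rough illustration of blending two channels per multiply, here is one common form of the idea. It is a sketch, not code from the site mentioned below: the name `blend_two_at_once` is made up, `alpha` is assumed to run from 0 (all `dst`) to 256 (all `src`), and pixels are assumed to be four 8-bit channels packed into a `uint32_t`.

```c
#include <stdint.h>

/* Blend src over dst with a single weight, handling two channels per
   multiply. The channels in bits 0..7 and 16..23 go through one
   multiply, and the channels in bits 8..15 and 24..31 through another. */
static uint32_t blend_two_at_once(uint32_t dst, uint32_t src, uint32_t alpha)
{
    const uint32_t inv = 256u - alpha;

    /* Even bytes: each 16-bit lane holds value * weight, which is at
       most 255 * 256 and so cannot spill into the neighbouring lane. */
    uint32_t even = ((src & 0x00FF00FFu) * alpha +
                     (dst & 0x00FF00FFu) * inv) >> 8;

    /* Odd bytes: shift down first so they sit in the same lanes; the
       weighted sum then already lands in bits 8..15 and 24..31. */
    uint32_t odd = ((src >> 8) & 0x00FF00FFu) * alpha +
                   ((dst >> 8) & 0x00FF00FFu) * inv;

    return (even & 0x00FF00FFu) | (odd & 0xFF00FF00u);
}
```

This replaces four multiply pairs with two, at the cost of using the same weight for every channel.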

I've got a ton of optimised pixel manipulation stuff like this on my website in the Useful Classes section.

Thanks for the info, iMalk. I do have a question about one function on the stereopsis page. What are the expected values of "xp" and "yp" in the Bilerp32 function? Texel or pixel coordinates or something else?

Quote:
 Original post by maspeir
 Thanks for the info, iMalk. I do have a question about one function on the stereopsis page. What are the expected values of "xp" and "yp" in the Bilerp32 function? Texel or pixel coordinates or something else?

Those are the off-pixel-centre proportion amounts in the x and y directions. They're both 0 to 255, and control the weighting of the blending of the four texel values. 0 for xp means weight 100% towards 'a' and 'c', and 255 means weight as much as possible towards 'b' and 'd'. yp then determines how to weight between those two.
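
A single-channel sketch of that weighting might look like the following. This is not the Bilerp32 function itself: the name `bilerp_channel` is made up, and it assumes 'a' and 'b' form one row, 'c' and 'd' the other, with yp = 0 favouring the a/b row.

```c
#include <stdint.h>

/* Bilinear blend of one 8-bit channel from four texels.
   xp and yp are 0..255 weights; 0 selects a (and c) fully, while 255
   leans as far as possible towards b (and d) without quite reaching it. */
static uint8_t bilerp_channel(uint8_t a, uint8_t b, uint8_t c, uint8_t d,
                              uint32_t xp, uint32_t yp)
{
    /* Horizontal blends, kept scaled by 256 (8.8 fixed point). */
    uint32_t row_ab = a * (256u - xp) + b * xp;
    uint32_t row_cd = c * (256u - xp) + d * xp;

    /* Vertical blend of the two rows, then drop both scale factors. */
    return (uint8_t)((row_ab * (256u - yp) + row_cd * yp) >> 16);
}
```

With xp = 0 and yp = 0 the result is exactly 'a'; with xp = 255 and yp = 0 it is 1/256 of 'a' plus 255/256 of 'b', which matches the "as much as possible" wording above.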

Ah. Thanks. And thanks for the links.

Why not simply do this?

offset = ((tv * tex_width) + tu) * 4;
memcpy(buffer, &texture[offset], 4);

or even

UINT *pData = (UINT*)&texture[offset];
*buffer = *pData;
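
The pointer-cast variant can run into alignment or strict-aliasing trouble on some compilers and targets, so a small `memcpy`-based load is often the safer way to get the same effect. The sketch below is one way to write it; `load_texel` is a made-up name and `uint32_t` stands in for `UINT`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the four bytes of texel (tu, tv) from a byte-addressed texture
   into one 32-bit value, preserving the in-memory byte order. */
static uint32_t load_texel(const unsigned char *texture, size_t tex_width,
                           size_t tu, size_t tv)
{
    uint32_t value;
    memcpy(&value, texture + (tv * tex_width + tu) * 4, sizeof value);
    return value;
}
```

Optimizing compilers commonly reduce a fixed four-byte `memcpy` like this to a single load, though that is worth checking in the generated code for the target in question.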

Or you could maybe use a union:

union {
    unsigned char components[4];
    unsigned long rgba;
} PixelUnion;

// "texture" is of type unsigned long...
offset = (tv * tex_width) + tu;
PixelUnion.rgba = texture[offset];
// PixelUnion.components[0], PixelUnion.components[1], PixelUnion.components[2] and PixelUnion.components[3] are the components
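
One thing to keep in mind with the union approach is that which channel ends up in `components[0]` depends on the machine's byte order. The sketch below uses a typedef'd version of the union with fixed-width types (an assumption, since `unsigned long` may be wider than four bytes) and a made-up helper, `low_byte_is_first`, to check the layout at run time.

```c
#include <stdint.h>

/* Same idea as the union above, with fixed-width members so both
   views cover exactly four bytes. */
typedef union {
    uint8_t  components[4];
    uint32_t rgba;
} PixelUnion;

/* Returns 1 when the least significant byte of rgba is components[0]
   (little-endian layout), 0 otherwise. */
static int low_byte_is_first(void)
{
    PixelUnion p;
    p.rgba = 0x000000FFu;
    return p.components[0] == 0xFF;
}
```

On a little-endian machine `components[0]` matches `(unsigned char)t` from the shift version; on a big-endian machine it matches `(unsigned char)(t >> 24)` instead.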

Quote:
 Original post by Vortez
 Why not simply do this?
 offset = ((tv * tex_width) + tu) * 4;
 memcpy(buffer, &texture[offset], 4);
 or even
 UINT *pData = (UINT*)&texture[offset];
 *buffer = *pData;

That's great if all I'm doing is copying from the texture to the offscreen bitmap. However, if I want to do any blending or filtering of the texture, I need the components.

Quote:
 Original post by BigJim
 Or you could maybe use a union:
 *** Source Snippet Removed ***

Interesting...