MarkS

Byte manipulation question. I *should* know this...

So, I'm trying to optimize texture access. Currently, I'm doing four memory accesses per texel to get the red, green, blue and alpha components. It then occurred to me that I'm fetching four bytes out of what is basically an unsigned long and then recombining them. Seems wasteful. I still need to extract the components for alpha blending and such, but I thought I could do it faster. Anyway, the current (working) code looks like this:
// "texture" is of type unsigned char...
offset = ((tv * tex_width) + tu) * 4;
tr = texture[offset];
tg = texture[offset + 1];
tb = texture[offset + 2];
ta = texture[offset + 3];
.
.
.
//"buffer" is of type unsigned long...
*buffer = (ta << 24) | (tr << 16) | (tg << 8) | tb;




What I thought would work and be faster was this:
// "texture" is of type unsigned long...
offset = (tv * tex_width) + tu;
t_val = texture[offset];
tr = (unsigned char)(t_val & 0x000000FF);
tg = (unsigned char)(t_val & 0x0000FF00);
tb = (unsigned char)(t_val & 0x00FF0000);
ta = (unsigned char)(t_val & 0xFF000000);
.
.
.
//"buffer" is of type unsigned long.
*buffer = (ta << 24) | (tr << 16) | (tg << 8) | tb;




Oddly, this doesn't work. I've played around with the byte masks, but the texture is always rendered in one of the component colors and never a combination. I posted this here, instead of Graphics Programming and Theory, because this is more about byte manipulation than graphics. The byte masks may be accessing the wrong components, but that wouldn't cause the texture to be rendered in one component color. This makes me think that I'm not accessing the data like I want. Byte and bit manipulation has always been a weakness for me (not sure why...). Any ideas? [Edited by - maspeir on March 8, 2010 11:34:15 PM]

Quote:
Original post by maspeir
ta = (unsigned char)(t & 0xFF000000);

You're tossing out all but the top eight bits with that bitwise and... then you're tossing out all but the bottom eight bits with the cast. That equals zero. You need to shift down before you cast.
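
To make the point above concrete, here is a minimal sketch in C. The function names are made up for illustration, and it assumes an 8-bit unsigned char and uses uint32_t in place of the thread's unsigned long so the width is fixed.

```c
#include <stdint.h>

/* Buggy form from the question: the mask keeps only bits 24..31,
   then the cast keeps only bits 0..7, so the result is always 0. */
static unsigned char alpha_mask_only(uint32_t t)
{
    return (unsigned char)(t & 0xFF000000u);
}

/* Fixed form: move the wanted byte down to bits 0..7 before the cast. */
static unsigned char alpha_mask_then_shift(uint32_t t)
{
    return (unsigned char)((t & 0xFF000000u) >> 24);
}
```

For an input such as 0x80402010, the first function returns 0 and the second returns 0x80. Only the lowest-byte mask (0x000000FF) survives the cast unshifted, which would explain why the texture showed up in a single component color.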

You need to shift the values before assigning.

Instead of

tr = (unsigned char)(t & 0x000000FF);
tg = (unsigned char)(t & 0x0000FF00);
tb = (unsigned char)(t & 0x00FF0000);
ta = (unsigned char)(t & 0xFF000000);

you need to do

tr = (unsigned char)(t & 0x000000FF);
tg = (unsigned char)((t & 0x0000FF00)>>8);
tb = (unsigned char)((t & 0x00FF0000)>>16);
ta = (unsigned char)((t & 0xFF000000)>>24);

Thanks! I knew I had to be doing something wrong.

Huh. It works, but I'm seeing a 50% or more DROP in frame rate. Odd... I figured array access would be slower.

Quote:
Original post by maspeir
Thanks! I knew I had to be doing something wrong.

Huh. It works, but I'm seeing a 50% or more DROP in frame rate. Odd... I figured array access would be slower.


Array access is only slow when it forces a cache miss. For what you're doing, you're getting one cache miss and then everything else is on the same cache line. So loads on green, blue, and alpha should be extremely fast.

Quote:
Original post by Drakonite
You need to shift the values before assigning.

Instead of

tr = (unsigned char)(t & 0x000000FF);
tg = (unsigned char)(t & 0x0000FF00);
tb = (unsigned char)(t & 0x00FF0000);
ta = (unsigned char)(t & 0xFF000000);

you need to do

tr = (unsigned char)(t & 0x000000FF);
tg = (unsigned char)((t & 0x0000FF00)>>8);
tb = (unsigned char)((t & 0x00FF0000)>>16);
ta = (unsigned char)((t & 0xFF000000)>>24);

Except that you're better off doing the shifts first to reduce the size of the masking constants.

tr = (unsigned char)((t) & 0xFF);
tg = (unsigned char)((t>>8) & 0xFF);
tb = (unsigned char)((t>>16) & 0xFF);
ta = (unsigned char)((t>>24) & 0xFF);
And in this case you'll then notice that anding with the masks is redundant anyway thanks to the casts. Thus:

tr = (unsigned char)t;
tg = (unsigned char)(t>>8);
tb = (unsigned char)(t>>16);
ta = (unsigned char)(t>>24);
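
As a rough sketch of how the shift-then-cast form fits together with the repacking step from the original post, the helpers below unpack a texel and rebuild the A,R,G,B value written to "buffer". The Texel struct and both function names are hypothetical, uint32_t stands in for unsigned long (which may be 64 bits on some platforms), and red is assumed to sit in the low byte as in the thread.

```c
#include <stdint.h>

/* Hypothetical helper type holding the four separated channels. */
typedef struct { unsigned char r, g, b, a; } Texel;

/* Split a packed texel; the cast alone discards everything above bit 7. */
static Texel unpack_texel(uint32_t t)
{
    Texel p;
    p.r = (unsigned char)t;
    p.g = (unsigned char)(t >> 8);
    p.b = (unsigned char)(t >> 16);
    p.a = (unsigned char)(t >> 24);
    return p;
}

/* Recombine in the order the original post uses for the output buffer.
   Widening to uint32_t first keeps the << 24 out of signed int territory. */
static uint32_t pack_argb(Texel p)
{
    return ((uint32_t)p.a << 24) | ((uint32_t)p.r << 16) |
           ((uint32_t)p.g << 8)  |  (uint32_t)p.b;
}
```

Unpacking 0x80302010 this way gives r = 0x10, g = 0x20, b = 0x30, a = 0x80, and packing those back yields 0x80102030.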

The next thing then is to discover that several operations such as alpha blending can be done without fully separating the 32-bit colours down into their individual channels. I.e. you can blend the red and the green at the same time etc - the double blend trick.
Or if you use MMX/SSE etc then it gets much quicker still.

I've got a ton of optimised pixel manipulation stuff like this on my website in the Useful Classes section.
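
For readers who have not seen the double blend trick mentioned above, here is one common form of it as a sketch. It is not taken from the poster's site. It blends two channels in a single multiply by picking the pair that has a spare byte between them (the channels at bits 0..7 and 16..23), and handles the middle channel separately. The function name, the 0..256 alpha range, and ignoring the top byte are all assumptions made for this example.

```c
#include <stdint.h>

/* Blend src over dst with alpha in 0..256 (256 means fully src).
   The top byte of both inputs is ignored and comes back as 0. */
static uint32_t blend_rgb(uint32_t dst, uint32_t src, uint32_t alpha)
{
    uint32_t inv = 256u - alpha;

    /* Channels at bits 0..7 and 16..23 share one multiply: each product
       is at most 255 * 256, so the low channel never spills into the
       high one and the high one never overflows 32 bits. */
    uint32_t outer = (((src & 0x00FF00FFu) * alpha +
                       (dst & 0x00FF00FFu) * inv) >> 8) & 0x00FF00FFu;

    /* The middle channel (bits 8..15) is blended on its own. */
    uint32_t middle = (((src & 0x0000FF00u) * alpha +
                        (dst & 0x0000FF00u) * inv) >> 8) & 0x0000FF00u;

    return outer | middle;
}
```

With alpha = 0 the result is dst, with alpha = 256 it is src, and blending white over black at alpha = 128 gives 0x7F7F7F. Three multiplies per channel pair replace the six a fully separated blend would need.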

Thanks for the info, iMalk. I do have a question about one function on the stereopsis page. What are the expected values of "xp" and "yp" in the Bilerp32 function? Texel or pixel coordinates or something else?

Quote:
Original post by maspeir
Thanks for the info, iMalk. I do have a question about one function on the stereopsis page. What are the expected values of "xp" and "yp" in the Bilerp32 function? Texel or pixel coordinates or something else?
Those are the off-pixel-centre proportion amounts in the x and y directions. They're both 0 to 255, and control the weighting of the blending of the four texel values. 0 for xp means weight 100% towards 'a' and 'c', and 255 means weight as much as possible towards 'b' and 'd'. yp then determines how to weight between those two.
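
The weighting described above can be sketched for a single 8-bit channel as follows. This is not the Bilerp32 function from the page being discussed, only an illustration of the same idea. It assumes 'a' and 'b' are the top-left and top-right texels and 'c' and 'd' the bottom-left and bottom-right, and the function name is made up.

```c
/* Bilinear weighting of one 8-bit channel.
   xp, yp are 0..255: xp weights the left column (a, c) against the
   right column (b, d); yp then weights the top row result against
   the bottom row result. */
static unsigned bilerp_channel(unsigned a, unsigned b,
                               unsigned c, unsigned d,
                               unsigned xp, unsigned yp)
{
    unsigned top    = a * (256u - xp) + b * xp;   /* scaled by 256 */
    unsigned bottom = c * (256u - xp) + d * xp;   /* scaled by 256 */

    /* Second blend adds another factor of 256, so divide by 65536. */
    return (top * (256u - yp) + bottom * yp) >> 16;
}
```

Because the weights stop at 255 out of 256, xp = 255 leans "as much as possible" towards the right column without reaching it exactly: blending 0 and 255 at xp = 255 gives 254 rather than 255.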

Or you could maybe use a union:


union
{
unsigned char components[4];
unsigned long rgba;
}PixelUnion;

// "texture" is of type unsigned long...
offset = (tv * tex_width) + tu;
PixelUnion.rgba = texture[offset];

// PixelUnion.components[0], PixelUnion.components[1], PixelUnion.components[2] and PixelUnion.components[3] are the components
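
One thing worth checking before relying on the union is which byte lands in components[0], since that depends on the machine's byte order. The sketch below uses a typedef'd variant of the union with uint32_t instead of unsigned long (so both members are four bytes everywhere) and a hypothetical helper name. Reading a different member than the one last written is permitted in C; C++ is stricter about it.

```c
#include <stdint.h>

typedef union {
    unsigned char components[4];
    uint32_t rgba;
} PixelUnion;

/* Returns the byte stored at the lowest address of the packed value.
   On a little-endian machine this is the low byte (bits 0..7);
   on a big-endian machine it is the high byte (bits 24..31). */
static unsigned char first_component(uint32_t packed)
{
    PixelUnion p;
    p.rgba = packed;
    return p.components[0];
}
```

On a typical x86 machine, first_component(0x11223344) returns 0x44, so components[0] would match the shift-based `(unsigned char)t` form there. The shift-and-cast versions give the same answer regardless of byte order.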

Quote:
Original post by Vortez
Why not simply do this?

offset = ((tv * tex_width) + tu) * 4;
memcpy(buffer, &texture[offset], 4);

or even

UINT *pData = (UINT*)&texture[offset];
*buffer = *pData;


That's great if all I'm doing is copying from the texture to the offscreen bitmap. However, if I want to do any blending or filtering of the texture, I need the components.
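
The two ideas can be combined: if the texture stays an unsigned char array, a four-byte memcpy can still fetch the whole texel in one go, after which the components are extracted with shifts as discussed earlier in the thread. The sketch below assumes a hypothetical helper name and uint32_t for the packed value. Unlike the pointer cast in the quoted snippet, memcpy does not depend on the byte offset being suitably aligned, and compilers typically turn a fixed-size copy like this into a single load.

```c
#include <stdint.h>
#include <string.h>

/* Fetch four consecutive bytes of a byte-addressed texture as one
   32-bit value. "offset" is a byte offset, as in the original code. */
static uint32_t load_texel(const unsigned char *texture, size_t offset)
{
    uint32_t t;
    memcpy(&t, texture + offset, sizeof t);
    return t;
}
```

The byte that ends up in the low eight bits of the result depends on the machine's byte order, so it is worth verifying which shift corresponds to which channel on the target platform.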

Quote:
Original post by BigJim
Or you could maybe use a union:

*** Source Snippet Removed ***


Interesting...
