Jiia

Converting 32 bit to 16 bit

There's not much to ramble on about. I'm just trying to find the fastest way to render a 32-bit image to a 16-bit destination. This is the fastest version I've tried so far. The only variables not set in this function are RPixShR, RPixShG, and RPixShB, along with the matching LPixShR, LPixShG, and LPixShB. Each RPixSh value is the number of bits an 8-bit color value (0 to 255) needs to be shifted right to fit its channel in the 16-bit screen format, which dumps the extra low bits. Each LPixSh value then shifts the result left into that channel's position in the 16-bit pixel. XRes and YRes are the screen resolution.
ULONG *Src = Page->Bits; // Custom back buffer
USHORT *Des = (USHORT*) SurfaceDesc.lpSurface; // the real back buffer
LONG ExSpan = (SurfaceDesc.lPitch>>1) - Setup.XRes;
LONG y = Setup.YRes;
LONG x;
INT R16s = RPixShR;
INT G16s = RPixShG + 8;
INT B16s = RPixShB + 16;
while(y--)
{
	x = Setup.XRes;
	while(x--)
	{
		*Des = USHORT(	((((*Src) & 0x000000FF) >> R16s) << LPixShR) |
				((((*Src) & 0x0000FF00) >> G16s) << LPixShG) |
				((((*Src) & 0x00FF0000) >> B16s) << LPixShB));
		Src++;
		Des++;
	}
	Des += ExSpan;	// Any Extra Pitch
}     
I would never have thought this method would be the one that won my timing tests, but it did. It is still not very fast, though. Does anyone know of a better way of doing this? The destination is usually going to be video memory, so I would think the best approach is to touch it as little as possible. [edited by - Jiia on September 1, 2002 12:00:44 AM]
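
A minimal sketch of how the right and left shift values described above could be derived from a channel bit mask, assuming the masks come from something like the `dwRBitMask`, `dwGBitMask`, and `dwBBitMask` fields of the surface's `DDPIXELFORMAT`. The helper names `rightShiftFor` and `leftShiftFor` are hypothetical and do not appear in the original code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of low bits of an 8-bit channel value that must
// be dropped so it fits a channel described by `mask` (the RPixSh* role).
static int rightShiftFor(std::uint32_t mask)
{
    int bits = 0;
    for (std::uint32_t m = mask; m != 0; m >>= 1)
        bits += static_cast<int>(m & 1u);   // count the set bits in the mask
    assert(bits >= 1 && bits <= 8);         // assumes 1 to 8 bits per channel
    return 8 - bits;
}

// Hypothetical helper: position of the channel's lowest bit inside the
// 16-bit pixel (the LPixSh* role).
static int leftShiftFor(std::uint32_t mask)
{
    assert(mask != 0);
    int shift = 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        ++shift;
    }
    return shift;
}
```

With the RGB 565 masks 0xF800, 0x07E0, and 0x001F this gives right shifts of 3, 2, and 3 and left shifts of 11, 5, and 0.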

Well, you'd gain far more speed just by using MMX. MMX was designed for exactly this kind of conversion. And if that's not enough, you can even use the SSE extensions to MMX, assuming you have a P3, P4, or Athlon XP.

"take a look around" - limp bizkit
www.google.com

It's not a great help, but replace the y-- and x-- with prefix decrements, i.e. --y and --x. They avoid the temporary copy that a postfix decrement implies, although an optimizing compiler will usually generate the same code for plain integers. You'll have to increase the initial values by one, of course.

Apart from that: like davepermen says.

EDIT:
You could implement specific versions of the function for known pixel formats, such as RGB 565 and 555, getting rid of half of the bit shifts. Just change the value you're ANDing, i.e. instead of using 0x0000FF00 use 0x0000F800 for RGB 565 and shift, uh, hm, 6 bits to the right. Unless I'm mistaken.

YET ANOTHER EDIT:
To get rid of that typecast, you could extend the idea above to two pixels per loop iteration and use 32-bit uints. Odd widths are pretty rare anyway, but you could have a separate version for odd ones that does two pixels at a time inside the x loop and does the last pixel after the x loop but still inside the y loop.

Oh and post your final version again so I can take it apart further.

- JQ
Full Speed Games. Coming soon.

[edited by - JonnyQuest on September 2, 2002 6:53:32 AM]

"i.e. instead of using 0x0000FF00 use 0x0000F800 for RGB 565"

I'm not sure what you mean by the F8. Is that a shortcut? I come up with 5 bits as being 0x1F, and 6 bits as being 0x3F. I'm not so sure I understand.. This is what I came up with for 565, or in binary RRRRRGGGGGGBBBBB. Unfortunately, it doesn't work. Some vertical lines are being skipped (meaning the same x pixels are being skipped each line). I cannot figure out why..
ULONG *Src = Page.Bits; // custom back buffer
ULONG *Des = (ULONG*) SurfaceDesc.lpSurface; // real back buffer
// not so sure about >>1 being safe below
LONG ExSpan = ((SurfaceDesc.lPitch>>1) - Setup.XRes)>>1;
LONG y = Setup.YRes + 1;
LONG x;
while(--y)
{
	x = Setup.XRes >> 1;
	while(x--)
	{
		*Des = (	(((*Src) & 0x0000001F) << 27) |
				(((*Src) & 0x00003F00) << 13) |
				(((*Src) & 0x001F0000)) |
				(((*(++Src)) & 0x0000001F) << 11) |
				(((*Src) & 0x00003F00) >> 3) |
				(((*Src) & 0x001F0000) >> 5));
		Src++;
		Des++;
	}
	Des += ExSpan; // Any Extra Pitch
}


Ha, don't yell at me if I did something stupid. It won't be the zillionth time. My custom pixels are stored (in hex) as 0x00BBGGRR, and my video card uses the binary format I mentioned above, so I had to shift in different directions to do 2 pixels at once. I hope it's not impossible to read that way. The only reason I didn't use --x is the glitch: I'm trying to keep it simple until I figure out what is wrong. Thanks for all of your help and advice.

[edited by - Jiia on September 2, 2002 1:20:13 PM]

To be blunt: you're way off.

Your RRRRRGGG GGGBBBBB idea is right, but you got it wrong when you converted it to hex.

You have to think of each hex digit as four binary digits (also referred to as one nibble in very old-skool geek speak)

i.e.
0000 means 0x0
0001 means 0x1
0010 means 0x2
0100 means 0x4
1000 means 0x8

and combinations thereof.

SO we write the pixel format like this:

RRRR RGGG GGGB BBBB

To get all Rs, use this:
0xF800
Green:
0x07E0
Blue:
0x001F

(damn I hope I didn't get that wrong)

Well, to get the red, green and blue channels to map to that, you need to use the 5 or 6 most significant bits, not the least significant bits like you're doing right now.

So your lines inside the loop would be:

*Des = ( (((*Src) & 0x000000F8) << 8) |
(((*Src) & 0x0000FC00) >> 5) |
(((*Src) & 0x00F80000) >> 19) |
(((*(++Src)) & 0x000000F8) << 24) |
(((*Src) & 0x0000FC00) << 11) |
(((*Src) & 0x00F80000) >> 3));


I'm not 100% sure, but that should do the trick.
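
A small sketch that checks the masks and shifts above on a single pixel, assuming the source layout is 0x00BBGGRR and the target is RGB 565. The names `pack565` and `checkPack565` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Pack one 0x00BBGGRR pixel into RGB 565 (RRRR RGGG GGGB BBBB).
static std::uint16_t pack565(std::uint32_t src)
{
    return static_cast<std::uint16_t>(
        ((src & 0x000000F8u) << 8) |    // top 5 bits of red   -> bits 11..15
        ((src & 0x0000FC00u) >> 5) |    // top 6 bits of green -> bits 5..10
        ((src & 0x00F80000u) >> 19));   // top 5 bits of blue  -> bits 0..4
}

// Quick self-check: each full-intensity channel should fill exactly its mask.
static void checkPack565()
{
    assert(pack565(0x000000FFu) == 0xF800);  // pure red
    assert(pack565(0x0000FF00u) == 0x07E0);  // pure green
    assert(pack565(0x00FF0000u) == 0x001F);  // pure blue
    assert(pack565(0x00FFFFFFu) == 0xFFFF);  // white
}
```

Only the most significant bits of each channel survive, so a very dark pixel such as 0x00070307 packs to 0.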


- JQ
Full Speed Games. Coming soon.

"you need to use the 5 or 6 most significant bits, not the least significant bits like you're doing right now"

Yep, that was a big problem. I understood the hex relationship, but I was getting the wrong bits out of each color. Unfortunately, the spanning is still messed up, just as it was with mine (except now the colors are right). To fix it, I had to do this:
*Des = (	(((*Src)     & 0x000000F8) << 8)  |
		(((*Src)     & 0x0000FC00) >> 5)  |
		(((*Src)     & 0x00F80000) >> 19) |
		(((*(Src+1)) & 0x000000F8) << 24) |
		(((*(Src+1)) & 0x0000FC00) << 11) |
		(((*(Src+1)) & 0x00F80000) >> 3));

Src += 2;
Des++;
I have no idea why. Also, this gets me 17 fps at 640x480 with DirectDraw, while drawing nothing on the screen but the text that shows me the fps. I was getting about the same with the old version. I don't think any optimizations can speed it up, because something else is slowing it down. And the only thing I can think of is DirectDraw. I can draw millions of things on my own surfaces and have plenty good fps. But as soon as I lock that surface, it slaps me in the face.

Thanks again for the help.
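
One likely explanation for the skipped pixels: an expression that both reads `*Src` and evaluates `*(++Src)` modifies `Src` while other operands are still reading it, and C++ does not define the order in which those operands are evaluated. The sketch below avoids the problem by loading both source pixels into locals before packing. It assumes 0x00BBGGRR source pixels, an RGB 565 destination, and a little-endian machine; `pack565` and `convertTo565` are hypothetical names, and `dstPitch` stands in for `lPitch`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Pack one 0x00BBGGRR pixel into RGB 565. Returns 32 bits so the result can
// be shifted into the upper half of a pixel pair without overflow.
static std::uint32_t pack565(std::uint32_t src)
{
    return ((src & 0x000000F8u) << 8) |
           ((src & 0x0000FC00u) >> 5) |
           ((src & 0x00F80000u) >> 19);
}

// Hypothetical routine: convert a width x height block of tightly packed
// 0x00BBGGRR pixels into an RGB 565 surface whose rows are dstPitch bytes apart.
static void convertTo565(void *dst, std::size_t dstPitch,
                         const std::uint32_t *src, int width, int height)
{
    assert(dstPitch >= static_cast<std::size_t>(width) * 2);
    unsigned char *row = static_cast<unsigned char *>(dst);
    for (int y = 0; y < height; ++y) {
        unsigned char *out = row;
        int x = 0;
        for (; x + 1 < width; x += 2) {
            // Read both source pixels first, then advance the pointer.
            const std::uint32_t first = src[0];
            const std::uint32_t second = src[1];
            // Little-endian assumption: the first pixel goes in the low half.
            const std::uint32_t pair = pack565(first) | (pack565(second) << 16);
            std::memcpy(out, &pair, sizeof pair);   // one 32-bit store
            src += 2;
            out += sizeof pair;
        }
        if (x < width) {                            // odd width: one pixel left
            const std::uint16_t last =
                static_cast<std::uint16_t>(pack565(*src++));
            std::memcpy(out, &last, sizeof last);
        }
        row += dstPitch;                            // skip any extra pitch
    }
}
```

The pitch is kept in bytes, so the row pointer advances correctly even when the padding is not a multiple of four bytes, and the odd-width case is handled after the inner loop but still inside the row loop.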

The thing that is slowing it down is that you're trying to do this in realtime. Convert all your screen elements to the pixel format of the primary surface in advance, and simply copy them over without any conversion stuff.

- JQ
Full Speed Games. Coming soon.

The game is using way too many dramatic effects to run properly in 16 bit. Doing all of the effects in 16 bit would be slower than copying a single screen to 16 bit. And if I were to do that, I wouldn't even need a custom back surface, and would just use DirectDraw functions. But that's way out of the goal for the game. There are perhaps more things transparently blended on screen than what is drawn normally.

Other than that, it is so much fun to work specifically with one bit depth. I can add effects and effects and more effects without even the tiniest thought about compatibility. And with 32 bit, I even have that extra byte to store alpha maps and other cool stuffs.

I've gotten down 32 and 24 bit with no problems. Guess I'll just keep at 16 until something speeds it up.

Thanks again,
-Jiia

"1 and 1 is 1 11" - tool

Hang on... is that the backbuffer you're trying to lock there? If so, create a system memory surface instead of using a backbuffer, and then blit to the primary surface instead of flipping. Should be loads faster.

- JQ
Full Speed Games. Coming soon.

I'll try it. If flipping is instant, it will be adding an extra screen copy just to hand my system->video memory transfer job to DirectDraw. I guess it all depends on how well DD does it. I'm moving around a lot of stuff in my graphics engine, but I will try this as soon as I'm done.

"1 and 1 is 1 11" - tool

A straight copy from sysmem to vidmem is pretty fast, but if you do per-pixel operations on vidmem, it's slow as hell.
What I do for 2D stuff is just do everything in system memory, have a system memory "back" buffer in the same format as the primary surface and just update that, and copy it to the primary surface at the end of the frame. It's bound to be fast, 'cause that's how they did it back in the VGA days.
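
A minimal sketch of that final straight copy, assuming the system memory buffer and the locked destination share a pixel format but may have different pitches. The name `copyRows` and its parameters are hypothetical; in DirectDraw the destination pointer and pitch would come from the locked surface description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy `height` rows of `rowBytes` bytes each between
// two buffers whose rows are srcPitch and dstPitch bytes apart.
static void copyRows(void *dst, std::size_t dstPitch,
                     const void *src, std::size_t srcPitch,
                     std::size_t rowBytes, int height)
{
    assert(rowBytes <= dstPitch && rowBytes <= srcPitch);
    unsigned char *d = static_cast<unsigned char *>(dst);
    const unsigned char *s = static_cast<const unsigned char *>(src);
    for (int y = 0; y < height; ++y) {
        std::memcpy(d, s, rowBytes);   // destination is only written, never read
        d += dstPitch;                 // step over any padding in each buffer
        s += srcPitch;
    }
}
```

The destination is written once per row and never read back, which fits the advice to do all per-pixel work in system memory.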

- JQ
Full Speed Games. Coming soon.
