# Optimization Question

No matter how much I hate to ask... I have a problem with optimization. I have written my own bitmap drawing functions, and when I draw about 4 or 5 enemies on the screen the framerate drops a LOT: from 86 FPS to about 40. I was wondering if anybody thinks there's anything I can do to optimize this function:

    // draws a bitmap with no clipping
    void DrawBitmap(Bitmap* bitmap, const int x, const int y, RECT* srcRect)
    {
        short* picBuffer = bitmap->bits;  // the buffer for the bitmap data
        int srcWidth = srcRect->right - srcRect->left,  // width of the src rectangle
            srcHeight = srcRect->bottom - srcRect->top;  // height of the src rectangle

        // point videoBuffer to the dest coords on the surface
        videoBuffer = ((short*)desc.lpSurface) + (x + (y * pixelLPitch));
        // assign the starting position in the bitmap
        picBuffer += srcRect->left + (srcRect->top * bitmap->width);
        // now the buffers point to the correct memory

        // for every line
        for(int height = 0; height < srcHeight; height++)
        {
            // for every pixel
            for(int width = 0; width < srcWidth; width++)
            {
                // if the pixel isn't transparent, then plot it
                if(picBuffer[width] != transparentColor)
                    videoBuffer[width] = picBuffer[width];
            }
            // advance the pointers to the next line
            videoBuffer += pixelLPitch;
            picBuffer += bitmap->width;
        }
    } // end DrawBitmap

I have a clipping version too, but it's the same with some clipping stuff beforehand... Thanks!

------------------------------
Jonathan Little
invader@hushmail.com
http://www.crosswinds.net/~uselessknowledge

##### Share on other sites
Do-while loops are a little bit faster than for loops. And if any of your enemies are not transparent, then you should make another function for that case.

*** Triality ***

##### Share on other sites
Everything I'm drawing is transparent, so I don't need a non-transparent function. And, instead of code structure stuff (while vs. for loops) I was thinking of something in the "algorithm" itself. I doubt anything is there to be optimized, but I'm just thinking it's weird, because when I turn on compiler optimizations, it goes back up to 86 FPS, so it seems the compiler can find something...

##### Share on other sites
That algorithm is O(n^2), because you have a loop nested within another loop. Very slow.

Usually you'd do something like this:

    DWORD ScreenLine = Top * ScreenPitch;
    DWORD PicLine = 0;
    DWORD PicDelta = (Right - Left) * (ScreenBits >> 3);
    for( Y = Top; Y < Bottom; Y ++ )
    {
        memcpy( &VideoBuffer[ScreenLine + Left], &PicBuffer[PicLine], PicDelta );
        ScreenLine += ScreenPitch;
        PicLine += PicDelta;
    }

That's as fast as DirectDraw's BltFast, but it doesn't allow for transparency. You'd have to write an assembly blitter and optimize it (that's a good idea even for opaque sprites). I'm not sure how the transparency code would be done, though.
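For reference, a self-contained sketch of the row-by-row `memcpy` idea for 16-bit pixels. The function name `BlitOpaque16` and its parameters are hypothetical, and pitches are assumed to be in pixels (like `pixelLPitch` in the original post) rather than bytes. It copies every pixel, so it only suits sprites with no transparent areas.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies a w x h block of 16-bit pixels from src to dst, one row at a time.
// dstPitch and srcPitch are row strides in pixels (an assumption of this sketch).
void BlitOpaque16(std::uint16_t* dst, std::size_t dstPitch,
                  const std::uint16_t* src, std::size_t srcPitch,
                  std::size_t w, std::size_t h)
{
    const std::size_t rowBytes = w * sizeof(std::uint16_t);
    for (std::size_t y = 0; y < h; ++y) {
        std::memcpy(dst, src, rowBytes);  // one contiguous copy per row
        dst += dstPitch;                  // advance both pointers to the next row
        src += srcPitch;
    }
}
```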

~CGameProgrammer( );

##### Share on other sites
CGameProgrammer, it's not n^2 unless you define n to be the average width of an image. It's actually linear in the total size of the bitmap (width*height, which is where the double loop comes from).
Speeding it up: the only thing I could think of is to RLE the transparency data so you can skip large parts.
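A rough sketch of what run-length encoding the transparency could look like, assuming 16-bit pixels and a sprite narrower than 65536 pixels. The layout here (per row: a run count, then `skip`, `count`, and `count` raw pixels for each run) and the names `EncodeRle` and `DrawRle` are made up for illustration; real RLE sprite formats differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Encodes a sprite once, ahead of time. Per row: [runCount] then, for each run,
// [skip][count][count opaque pixels]. Trailing transparent pixels are dropped.
std::vector<std::uint16_t> EncodeRle(const std::uint16_t* src, std::size_t pitch,
                                     std::size_t w, std::size_t h, std::uint16_t key)
{
    std::vector<std::uint16_t> out;
    for (std::size_t y = 0; y < h; ++y) {
        const std::uint16_t* row = src + y * pitch;
        const std::size_t countPos = out.size();
        out.push_back(0);  // placeholder for this row's run count
        std::size_t x = 0;
        while (x < w) {
            std::size_t skip = 0;
            while (x < w && row[x] == key) { ++x; ++skip; }
            const std::size_t start = x;
            while (x < w && row[x] != key) ++x;
            const std::size_t count = x - start;
            if (count == 0) break;  // reached the end of the row on transparent pixels
            out.push_back(static_cast<std::uint16_t>(skip));
            out.push_back(static_cast<std::uint16_t>(count));
            out.insert(out.end(), row + start, row + x);
            ++out[countPos];
        }
    }
    return out;
}

// Draws an encoded sprite: no per-pixel test, just skips and block copies.
void DrawRle(std::uint16_t* dst, std::size_t dstPitch,
             const std::vector<std::uint16_t>& rle, std::size_t h)
{
    const std::uint16_t* p = rle.data();
    for (std::size_t y = 0; y < h; ++y) {
        std::uint16_t* d = dst;
        const std::uint16_t runs = *p++;
        for (std::uint16_t r = 0; r < runs; ++r) {
            d += *p++;                             // skip transparent pixels
            const std::uint16_t count = *p++;
            std::memcpy(d, p, count * sizeof(std::uint16_t));
            d += count;
            p += count;
        }
        dst += dstPitch;
    }
}
```

The per-pixel comparison moves into `EncodeRle`, which runs once at load time; `DrawRle` only touches opaque pixels.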

#pragma DWIM // Do What I Mean!
**I use Software Mode**

##### Share on other sites
Is picBuffer coming from video memory? You could be stalling the video card on the reads, or maybe shootin' too much junk over the bus (from video RAM back to the CPU, when you do "picBuffer[width]").

##### Share on other sites
Honestly, I doubt it has ANYTHING to do with memory. It's the fact that your innermost loop contains an 'if'. Remember, ifs mean branch instructions, which means stalling and flushing the pipeline. Putting one in a loop to be executed that many times is a recipe for slowdown. And if the compiler can unroll the loop and schedule things a little better, it would explain the change in a release build.

As for fixing it... You may want to see if there's some way to avoid the if by using masking operations that will achieve the same end result. It all depends on how your colors are stored.
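One possible shape for such a masking approach, assuming 16-bit pixels and a single color key. The name `BlitMasked16` and its parameters are hypothetical. Note that this variant reads the destination for every pixel, which may be costly if the destination is a video-memory surface, and an optimizing compiler may produce something similar from the plain `if` version anyway.

```cpp
#include <cstddef>
#include <cstdint>

// Color-keyed blit with no 'if' in the inner loop: each source pixel is
// combined with the destination through an all-ones or all-zeros mask.
// Pitches are row strides in pixels (an assumption of this sketch).
void BlitMasked16(std::uint16_t* dst, std::size_t dstPitch,
                  const std::uint16_t* src, std::size_t srcPitch,
                  std::size_t w, std::size_t h, std::uint16_t key)
{
    for (std::size_t y = 0; y < h; ++y) {
        for (std::size_t x = 0; x < w; ++x) {
            const std::uint16_t s = src[x];
            // 0xFFFF when the pixel is opaque, 0x0000 when it equals the key
            const std::uint16_t mask =
                static_cast<std::uint16_t>(-static_cast<int>(s != key));
            // keep the source where mask is set, the old destination elsewhere
            dst[x] = static_cast<std::uint16_t>((s & mask) | (dst[x] & ~mask));
        }
        dst += dstPitch;
        src += srcPitch;
    }
}
```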

-Brian

##### Share on other sites
Use Blt, there's no better way.
Also, surfaces in video memory are very fast with Blt, since it's all accelerated and the CPU does virtually squat.

The_Minister

##### Share on other sites
quote:
Original post by The_Minister

Use Blt, there's no better way.

If by the best way, you mean the fastest, you are wrong. Compiled sprites still whoop the crap out of blits. Basically, what you do is write a program that takes your sprites and turns them into code. Then you just call that function to display them. Technically, it runs in O(1) time. RLE has been said to run faster than blits also, and doesn't take precomputation. But then we are into linear with respect to non-transparent pixels.
However, if by best you mean easiest and laziest way, then you are correct.
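To make the compiled-sprite idea concrete, here is a toy generator that turns a sprite into C source text, one store per opaque pixel. Everything in it (`CompileSprite`, the emitted function signature, the tightly packed source layout) is an illustrative assumption; compiled sprites of that era were more commonly emitted as assembly or machine code. The generated function still executes one store per opaque pixel, but it has no loop, no transparency test, and no reads of sprite data.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Emits C source for a function that draws one specific sprite.
// src is assumed to be tightly packed (row stride == w); key marks transparency.
std::string CompileSprite(const std::string& name, const std::uint16_t* src,
                          std::size_t w, std::size_t h, std::uint16_t key)
{
    std::ostringstream out;
    out << "void " << name << "(unsigned short* dst, int pitch)\n{\n";
    for (std::size_t y = 0; y < h; ++y) {
        for (std::size_t x = 0; x < w; ++x) {
            const std::uint16_t s = src[y * w + x];
            if (s == key) continue;  // transparent pixels generate no code at all
            out << "    dst[" << y << " * pitch + " << x << "] = " << s << ";\n";
        }
    }
    out << "}\n";
    return out.str();
}
```

The output would be written to a source file and compiled into the game, so each sprite gets its own draw function.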

Mike

##### Share on other sites
I somehow doubt that you could get straight CPU-only asm code faster than a hardware supported Alpha blit myself...

Did some fast coding once upon a time, and found it sucked. Using other people''s stuff is much more efficient :-).

So basically: if you have HW support for your transparent blits (or alpha blits) use that; if you don't, use a highly optimised ASM version for the software side of things.

#pragma DWIM // Do What I Mean!
**I use Software Mode**
