BltFast() vs Moving memory yourself

27 comments, last by Hootie 23 years, 9 months ago
I've created 2 little test programs. Test1 uses the DX7 BltFast() and Test2 just does a memcpy-type move. Each moves an 800x600x16 bitmap from a sysmem surface to the backbuffer, then Flip()s with no vsync so the refresh rate won't disguise the blit speeds. I get a 90 fps speed increase in Test2 with a PIII/500 & GeForce II. I wonder what would happen on other cards. If anyone wants to give it a try and post their results and the hardware tested, I zipped up the programs and a test bitmap, which can be d/l'd here: http://gameznet.com/golgotha/downloads/finaltest.zip
Well, I tried it on my PIII 533b w/ a Voodoo3 3000 (16MB AGP) and got 123fps with test1.exe and 159fps with test2.exe. And this is with AGP support turned off (I've got dual monitor debugging going, and it doesn't seem to like AGP w/ my second card? Odd, eh?). So I'm not sure what difference AGP would make.

But heh, you're using the C++ memcpy routines? 'Cause I've got MMX going, should I try a test with that too? Actually I think I will, just to see if there's a substantial increase in speed or not. E-mail me if you want a copy of this test program...
- Ben
__________________________
Mencken's Law: "For every human problem, there is a neat, simple solution; and it's always wrong."
"Computers in the future may weigh no more than 1.5 tons." - Popular Mechanics, forecasting the relentless march of science in 1949
Test2 is optimized with MMX asm code if the program detects that you have MMX. Otherwise it does a straight memcpy().
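
For readers without the zip, the non-MMX path described above can be sketched roughly as follows. This is not the code from Test2: `CopyRows16` and its parameters are hypothetical names, and the sketch assumes both surfaces have already been locked (in DX7, Lock() fills in the surface pointer and an `lPitch` value in bytes). The point it illustrates is that a locked surface's rows may be padded, so a "memcpy-type move" generally has to go row by row unless both pitches equal the row size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not taken from the test programs): copies a
// 16-bit-per-pixel image between two buffers whose rows may be padded.
// Pitches are in bytes, the way a locked surface reports them.
void CopyRows16(std::uint8_t* dst, std::size_t dstPitch,
                const std::uint8_t* src, std::size_t srcPitch,
                std::size_t width, std::size_t height)
{
    const std::size_t rowBytes = width * sizeof(std::uint16_t);
    assert(rowBytes <= dstPitch && rowBytes <= srcPitch);

    if (dstPitch == rowBytes && srcPitch == rowBytes) {
        // No padding on either side: the image is contiguous,
        // so a single memcpy covers every row.
        std::memcpy(dst, src, rowBytes * height);
        return;
    }

    // Otherwise copy one row at a time, stepping each pointer by its own pitch.
    for (std::size_t y = 0; y < height; ++y) {
        std::memcpy(dst + y * dstPitch, src + y * srcPitch, rowBytes);
    }
}
```

For the 800x600x16 case in this thread the call would look like `CopyRows16(backPtr, backPitch, bmpPtr, bmpPitch, 800, 600)`, where the four pointer and pitch arguments are placeholders for whatever the two Lock() calls returned.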
With my PII-450, TNT2 (AGP 2x enabled) running Windows 2000 Professional, I get:

Test1 - 189 fps
Test2 - 190 fps

Both frame rates would drop to 170ish if I moved the mouse (even though there wasn't a visible pointer), making me wonder what was going on in the background.
I don't know, since any mouse movement messages would fall through to the window procedure default (i.e. DefWindowProc). I guess you'd have to ask Microsoft about that. What kind of TNT2 card do you have?

Edited by - Hootie on July 13, 2000 4:07:36 AM

Edited by - WitchLord on July 14, 2000 9:08:31 AM
Not surprising, really, that the numbers are coming out similar. Try putting the bitmap on a video memory surface, and test again. Test1 will be extremely quick, and test2 will be awfully slow.

The graphics hardware makes very little difference, since BltFast is doing pretty much the same as memcpy.

TheTwistedOne
http://www.angrycake.com
I'm well aware of hardware accel in vidmem. This test is to see the difference between BltFast() and plain old memory moves when dealing with sysmem-based source surfaces. In many cases the mem move is quite a bit faster (25 - 90%), especially with high-end video cards on faster machines.

Many would believe BltFast() (supposedly the most highly optimized blit function in DDraw) would be equal to or faster than a straight memory move. The reality, in many cases, is that it's not.


>The graphics hardware makes very little difference, since BltFast is doing pretty much the same as memcpy.

On the contrary, the hardware makes a big difference. Here are the results of a test with a very fast PIII & GeForce 256:

Test1: 157 FPS
Test2: 285 FPS
PIII 800
128MB RAM
ASUS AGP V6800 Deluxe GeForce 256 DDR 32MB
Win 98



Edited by - Hootie on July 13, 2000 6:37:02 AM
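
To put the frame rates in this thread on a common footing, they can be converted into an effective copy rate. The sketch below is only arithmetic on numbers already quoted: an 800x600x16 frame is 800 * 600 * 2 = 960,000 bytes, so 157 fps works out to about 150.7 MB/s and 285 fps to about 273.6 MB/s. The function names are made up for illustration, and the figure is an effective rate only, since each frame also includes the Flip() and whatever else the test loop does.

```cpp
#include <cassert>
#include <cstdint>

// Bytes moved per frame for a full-screen copy at the given mode.
// Assumes whole-byte pixel sizes (8, 16, 24, 32 bpp).
inline std::uint64_t BytesPerFrame(std::uint64_t width,
                                   std::uint64_t height,
                                   std::uint64_t bitsPerPixel)
{
    assert(bitsPerPixel % 8 == 0);
    return width * height * (bitsPerPixel / 8);
}

// Effective transfer rate in megabytes (10^6 bytes) per second.
inline double MegabytesPerSecond(std::uint64_t bytesPerFrame, double fps)
{
    return static_cast<double>(bytesPerFrame) * fps / 1.0e6;
}
```

With these, `MegabytesPerSecond(BytesPerFrame(800, 600, 16), 285.0)` gives 273.6 and the same call with 157.0 gives 150.72.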
You also must take into account the fact that BltFast uses hardware acceleration when supported (but what card doesn't support it nowadays?).

It won't make a difference in your test program, I'm sure, because it sounds like all you're doing is blitting an image, so you can use the CPU. But in a real game you might want that extra CPU power for AI and other shite, so you would want the gfx card to take care of the blitting.

One more point to make: BltFast also does transparency, whereas memcpy does not. I've actually tried all these tests myself. I don't remember the results, but I know I ended up using DX's blitting routines.


------------------------------
fclose(fp)
------------------------------
ByteMe95::~ByteMe95()
My S(h)ite
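
On the transparency point in the post above: a hand-rolled copy can no longer be a plain memcpy once a color key is involved, because every pixel has to be compared against the key. A minimal sketch of what that looks like for 16-bit pixels is below. `CopyRowsColorKey16` is a hypothetical name, the pitches here are in pixels rather than bytes (i.e. the byte pitch divided by 2, which is an assumption made to keep the pointer arithmetic simple), and no claim is made about how BltFast implements its own source color keying.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: copies 16-bit pixels from src to dst, skipping any
// source pixel equal to colorKey so the destination shows through there.
// Pitches are counted in pixels, not bytes (assumption for this sketch).
void CopyRowsColorKey16(std::uint16_t* dst, std::size_t dstPitchPixels,
                        const std::uint16_t* src, std::size_t srcPitchPixels,
                        std::size_t width, std::size_t height,
                        std::uint16_t colorKey)
{
    assert(width <= dstPitchPixels && width <= srcPitchPixels);

    for (std::size_t y = 0; y < height; ++y) {
        std::uint16_t* d = dst + y * dstPitchPixels;
        const std::uint16_t* s = src + y * srcPitchPixels;
        for (std::size_t x = 0; x < width; ++x) {
            if (s[x] != colorKey) {
                d[x] = s[x];  // opaque pixel: overwrite the destination
            }
            // keyed pixel: leave the destination untouched
        }
    }
}
```

The per-pixel branch is the cost being traded against the blitter here, which is one reason the comparison in this thread only covers opaque full-screen copies.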
My TNT2 is a Leadtek Winfast S320 II or something equally unmemorable. It has 32 MB of RAM (hence I tend to assume that everything will be in video memory, in which case BltFast is really hard to beat). I upped the clock speed on it a bit, but it's still nowhere near the speed of a TNT2 Ultra. It's fine for most of what I do.
You know, you do have to use sysmem for a bunch of stuff sometimes (I know I do; I have a 16 MB card that goes fast at 1024x768x16bpp), and sometimes it's life or death whether you do your full-screen blits really, really fast or not. And on my measly P-233 MMX it was over a 10% difference. PS: Could I see the code for that? I'd like to compare it to my routine...

