Aardvajk

DirectDraw7 Flip very slow


Still banging on about this I'm afraid :) Bit of new info. I have old DirectDraw7 code that runs fast on a 98 machine but crawls on XP and 2000. After extensive tests I have discovered that it is the Flip operation that seems to be the bottleneck. Same with Direct3D7. Any ideas why this might be and how to correct it?

It might help if you posted your initialization code and your render/flip code so that we can see how things are set up and what parameters are being used.

1) as has already been asked, how much slower is it on XP/2000?

a. knowing either the milliseconds-per-frame or frames-per-second you're seeing on 98, 2000, and XP would be handy for more suggestions.

b. one suspicion would be that the Flip()'s on the 98 machine aren't synchronised to the monitor refresh rate but those on the 2000 and XP machines are.
If the frames-per-second for the 2000 and XP machines are roughly the same as (or roughly a whole-number fraction of) the monitor refresh rate set for those machines (e.g. 74.8fps & 75Hz, or 30fps & 60Hz), and the frames-per-second on the 98 machine is much higher than the monitor refresh rate (e.g. 400fps & 75Hz), then this is very likely to be the problem.

c. When Flip() is waiting for VSYNC, it will appear to take a long time when profiled - though it's [usually] not actually doing anything other than spinning in a loop checking a flag [a h/w register or s/w flag set by an interrupt handler].

d. if your Flip() is being synchronised/locked to the monitor refresh rate/VSYNC, unfortunately it isn't always entirely under your control: although you can specify flags to DirectDraw to tell it not to wait for VSYNC, some graphics card drivers have user options to override the wishes of your application/Direct3D/DirectDraw. Look deep in all the options of the Display properties for each of your test machines.


2) Which versions of DirectX do all of the machines have installed? And do some have the Debug/SDK versions of DirectX active/installed whilst others have the Retail version?

a. If the Debug versions of DirectDraw or Direct3D are in use on the XP or 2000 machines, and the 98 has the Retail version, then that could account for some performance difference since the debug versions of the runtimes perform a whole lot more validation to make sure the API is being used properly.

b. On a related note, if your code is abusing the DirectDraw/Direct3D API in any way, then while the code may work on all of your test machines, the debug versions of the DirectDraw and D3D runtimes will complain lots whereas the Retail ones won't complain.

c. Another difference is the Debug Output Level sliders for all DirectX components (even seemingly unrelated ones). The 2000/XP machines may be outputting tons of information to the system debugger stream - which is slooow.


3) As well as the 8-bit emulation already mentioned above, 24-bit modes (rather than 32-bit ones) may also be particularly slow on some video cards too (most modern cards don't support 24-bit frame buffers natively)
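
The check described in 1b can be sketched in a few lines. This is only an illustration: VsyncDivisor is a made-up helper name, and the 3% tolerance and the maximum divisor of 4 are arbitrary assumptions, not anything DirectDraw defines.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of DirectDraw): returns n if measuredFps is
// within `tolerance` (relative) of refreshHz / n for some n in 1..maxDivisor,
// or 0 if the measured rate does not look locked to the refresh rate.
int VsyncDivisor(double measuredFps, double refreshHz,
                 double tolerance = 0.03, int maxDivisor = 4)
{
    assert(refreshHz > 0.0);
    if (measuredFps <= 0.0) return 0;

    for (int n = 1; n <= maxDivisor; ++n)
    {
        const double expected = refreshHz / n;   // 60 -> 60, 30, 20, 15
        if (std::fabs(measuredFps - expected) <= tolerance * expected)
            return n;
    }
    return 0;   // e.g. 400fps on a 75Hz mode: not synchronised
}
```

With the numbers from 1b, VsyncDivisor(74.8, 75.0) gives 1 and VsyncDivisor(30.0, 60.0) gives 2, while VsyncDivisor(400.0, 75.0) gives 0, which would suggest the 98 machine is not waiting for VSYNC.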

Initialisation code: (called with 640x480 and 32bit depth)


bool CDirectDraw::Acquire(HWND Hw,int W,int H,int D)
{
    DDSURFACEDESC2 Ds;
    DDSCAPS2 Dc;

    // Create the DirectDraw7 object and switch to exclusive full-screen mode
    if(FAILED(DirectDrawCreateEx(NULL,(void**)&Draw,IID_IDirectDraw7,NULL))) return false;
    if(FAILED(Draw->SetCooperativeLevel(Hw,DDSCL_EXCLUSIVE|DDSCL_FULLSCREEN))) return false;
    if(FAILED(Draw->SetDisplayMode(W,H,D,0,0))) return false;

    // Describe a primary surface with an attached flipping chain
    ZeroMemory(&Ds,sizeof(Ds));
    Ds.dwSize=sizeof(Ds);
    Ds.dwFlags=DDSD_CAPS|DDSD_BACKBUFFERCOUNT;
    Ds.ddsCaps.dwCaps=DDSCAPS_PRIMARYSURFACE|DDSCAPS_FLIP|DDSCAPS_COMPLEX;
    Ds.dwBackBufferCount=2;

    if(FAILED(Draw->CreateSurface(&Ds,&Prim,NULL))) return false;

    // Fetch the back buffer that Flip() will swap with the primary surface
    ZeroMemory(&Dc,sizeof(Dc));
    Dc.dwCaps=DDSCAPS_BACKBUFFER;
    if(FAILED(Prim->GetAttachedSurface(&Dc,&Back))) return false;

    return true;
}


Flip code


void CDirectDraw::Flip()
{
    // Retry until the previous flip/blit has finished
    while(Prim->Flip(NULL,0)==DDERR_WASSTILLDRAWING);
}


Have also tried


void CDirectDraw::Flip()
{
    // Let DirectDraw do the waiting instead of spinning ourselves
    Prim->Flip(NULL,DDFLIP_WAIT);
}


On the newer PCs and OSs, if I blit one 32x32 sprite without even clearing the background it runs really, really slowly, but if I blit loads of surfaces and only flip every other frame it runs twice as fast (although obviously the graphics look jerky), which is why I assume it is the flip that is slowing things down. I have the same problem using Direct3D (the older version where you use DirectDraw surfaces and still call the DirectDraw Flip).

Appreciate all the comments about the refresh rate - sounds very plausibly the problem. Is there any kind of solution if so or is my code above just wrong in some way?

Thanks

Sorry - just noticed I've got the backbuffercount set to 2 in the code above. Just tried it at 1 and doesn't make any difference. Sob.

Looked everywhere through the display properties on XP machine and can't find anything about synching with refresh rates.

I don't THINK I'm abusing the API in any way. Apart from the code above and some very standard initialising of surfaces, the only other DirectDraw code is calls to BltFast with or without SRCCOLORKEY flags for transparency.

My current test program on XP loads one 32x32 surface and blits it onto the back buffer without even clearing the buffer. Runs as slowly as the full blown platform game with hundreds of blits. Still convinced it is the flip. Help!

Don't use Flip unless you really need it. Just blit the back buffer to the primary surface instead. It should improve the speed quite a bit.

twkr - appreciate the good idea but have just tried it and I get a similar effect to when I turn hardware acceleration off. Does get much much faster but the graphics get very blurry. I'm assuming this is because the copy is not synched with the vertical retrace although may be wrong.

Also, for some reason, when I do this the program won't shut down properly until I press Ctrl+Alt+Del. Bit weird.

This has at least confirmed to me that the Flip is what is slowing everything down though so cheers.

You still haven't posted any hard profiling statistics [wink]

Quote:
Original post by EasilyConfused
Appreciate all the comments about the refresh rate - sounds very plausibly the problem. Is there any kind of solution if so or is my code above just wrong in some way?
As Simon mentioned - the synchronization flags are more hints than requirements. It's very common for drivers to allow these to be overridden by users - in which case there's nothing you can do [oh]

Another thought - are you doing frame-based animation/movement or time-based animation/movement?

I've come across older DDraw code before that uses frame-based animation (that is, all sprites move 1 unit each time a new frame comes around), which makes it very easy to get different performance characteristics on different machines. Time-based animation/modelling has its problems, but this particular one shouldn't be an issue.

hth
Jack

What's the easiest way to time your framerate accurately? Happy to post some stats if someone can give me a couple of lines of code to do it. time.h?

However, have just tried passing DDFLIP_NOVSYNC to Flip and guess what - speeds up enormously but graphics go all shuddery again, so at least I now know exactly what the problem is.

Very confused now. How on earth do I get my nice solid-moving graphics back?

Must admit the code does use pixels-per-frame movement but surely if I move sprites based on time instead, the result will be really jerky on the hardware that is causing problems?

I'm going back to writing text adventures at this rate (all together now - probably a good idea based on your understanding of graphics code...)

Quote:
Original post by EasilyConfused
What's the easiest way to time your framerate accurately? Happy to post some stats if someone can give me a couple of lines of code to do it. time.h?


Download a trial version of Fraps: http://www.fraps.com/ [smile]
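
If you'd rather time it in code, here is a minimal sketch, assuming a C++11 compiler for std::chrono. FpsCounter and its members are made-up names for illustration, not part of DirectDraw; the idea is just to call Tick() once per loop iteration (for example right after Flip() returns) and read the average back every so often.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: averages the frame rate over all frames seen so far.
class FpsCounter
{
public:
    using Clock = std::chrono::steady_clock;

    // Call once per loop iteration. The timestamp parameter exists so the
    // counter can also be fed known times when checking it.
    void Tick(Clock::time_point now = Clock::now())
    {
        if (frames_ == 0) first_ = now;
        last_ = now;
        ++frames_;
    }

    // Average frames per second between the first and last Tick();
    // returns 0 until at least two ticks with some time between them exist.
    double Fps() const
    {
        if (frames_ < 2) return 0.0;
        const std::chrono::duration<double> elapsed = last_ - first_;
        assert(elapsed.count() >= 0.0);
        if (elapsed.count() == 0.0) return 0.0;
        return (frames_ - 1) / elapsed.count();
    }

    // Average milliseconds per frame, or 0 if Fps() is not yet available.
    double MsPerFrame() const
    {
        const double fps = Fps();
        return fps > 0.0 ? 1000.0 / fps : 0.0;
    }

private:
    Clock::time_point first_{};
    Clock::time_point last_{};
    long frames_ = 0;
};
```

Comparing Fps() against the refresh rate of the display mode on each machine should show quickly whether the loop is locked to VSYNC.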


Quote:
However, have just tried passing DDFLIP_NOVSYNC to Flip and guess what - speeds up enormously but graphics go all shuddery again, so at least I now know exactly what the problem is.

Very confused now. How on earth do I get my nice solid-moving graphics back?

Must admit the code does use pixels-per-frame movement but surely if I move sprites based on time instead, the result will be really jerky on the hardware that is causing problems?


Synopsis and thoughts:

1) when Flip() waits for VSYNC, it caps the rate of your render loop at the frequency of your monitor's vsync/refresh (or the next lowest whole-number fraction of it). So for example, with a 60Hz display mode, your loop will run at most 60 times a second (and if it can't run at that speed it'll run at 30 times a second, then 20, and so on).


2) when Flip() waits for VSYNC, if the stuff you're drawing isn't stressing the graphics card at all, your loop will always run at that rate for every frame. So if your animation code says something like "position.x += 2", then things will always move at the same rate and so look smooth, because you 'know' the x position will always move by 2 pixels every 1/60th of a second (using the example of 60Hz from above).


3) when Flip() *doesn't* wait for VSYNC, the rate your loop runs is usually determined by external factors such as the exact amount of time the graphics card took to render your scene; what other processes and threads running on the computer were up to at that time; where in your loop your thread was pre-empted by the thread scheduler; which window messages were received by your message pump; etc.

This combination of external things is as good as random over time; because of this, if you aren't locking your loop to VSYNC, the rate at which your loop makes an iteration and the time it takes Flip() to return will vary for each iteration of the loop. So your "position.x += 2" still moves by 2 pixels every loop iteration, but the time between loop iterations varies, thus you get jerky animation...


4) Another artifact you see when you don't wait for VSYNC is tearing - where the monitor/TV hasn't finished updating the display when you display the new frame - you get a tear at the location of the raster (aka scanline), where the part of the display above the tear (already scanned out) contains the previous frame and the part below it contains the current frame. That's kinda what waiting for VSYNC is intended to prevent - if you make the new frame visible when the raster is off screen (the vertical blanking interval/gap), you don't see any tearing: http://en.wikipedia.org/wiki/VBI


5) Since you say the animation on the 98 machine is smooth, just faster, I'd be interested in knowing what frame rate you get on each of the machines (possibly using Fraps - or simply timing how long it takes between iterations of your loop).

It could be the case that *all* of the machines are synchronised to VSYNC, but all set to different refresh rates (e.g. 85Hz for the 98 machine and 60Hz for the 2000 and XP machines).


6) I'd definitely 100% advise using time based animation rather than assuming your loop will always iterate at the same rate, that way you should get smooth motion regardless of the frame rate - even if it isn't synchronised to VSYNC.


7) I thought of something entirely unrelated for us to look at: Your window message handling loop (PeekMessage/GetMessage, etc), the window procedure where you react to those messages, and how they fit in with iterations of your loop.

There are subtle differences in the way the message queues work on 9x platforms (95/98/ME) compared to NT platforms (NT/2000/XP).

Try specifying DDFLIP_NOVSYNC on the 98 machine too - if that runs more smoothly than on the 2000/XP machines, then it could be an indication of your render loop not being called from the most appropriate place or using a blocking call to fetch messages.
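
To make point 6 concrete, here is a rough sketch of time-based movement. The Sprite struct, the 120 pixels-per-second speed (2 pixels per frame at 60Hz) and the Simulate helper are all assumptions made up for illustration, not taken from the code posted above.

```cpp
#include <cassert>

// Hypothetical sprite state; names and values are illustrative only.
struct Sprite
{
    double x = 0.0;        // position in pixels
    double speed = 120.0;  // pixels per second (2 px/frame at 60Hz)
};

// Advance the sprite by the time elapsed since the previous frame,
// instead of a fixed "position.x += 2" per loop iteration.
void Update(Sprite& s, double dtSeconds)
{
    assert(dtSeconds >= 0.0);
    s.x += s.speed * dtSeconds;
}

// Simulate `seconds` of play at a constant frame rate and return the
// final x position, to compare different refresh rates.
double Simulate(double fps, double seconds)
{
    assert(fps > 0.0 && seconds >= 0.0);
    Sprite s;
    const int frames = static_cast<int>(fps * seconds + 0.5);
    for (int i = 0; i < frames; ++i)
        Update(s, 1.0 / fps);
    return s.x;
}
```

Simulate(60.0, 2.0) and Simulate(75.0, 2.0) both end up at about 240 pixels, whereas a fixed 2 pixels per frame would cover 240 pixels at 60Hz but 300 at 75Hz in the same two seconds.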


Quote:
I'm going back to writing text adventures at this rate (all together now - probably a good idea based on your understanding of graphics code...)


Nah, stick at it - nobody was born with knowledge of graphics programming - we all had to learn this stuff at some point or other in our programming histories.

SC1A - thanks for your useful (and last very kind) comments. Will have a look at fraps when I get a chance.

WinMain message loop is as follows (Engine.OnCycle is where the rendering and flip is called)


WinMain()
{
    ...
    /* Initialisation */
    ...
    while(true)
    {
        // Handle at most one pending message per iteration, without blocking
        if(PeekMessage(&Msg,NULL,0,0,PM_REMOVE))
        {
            if(Msg.message==WM_QUIT) break;

            TranslateMessage(&Msg);
            DispatchMessage(&Msg);
        }

        Engine.OnCycle();   // render and flip
    }
    ...
    /* Cleanup etc */
    ...
}


Messages are then handled as follows (using windowsx.h crackers)


void OnDestroy(HWND Hw)
{
    PostQuitMessage(0);
}

void OnKey(HWND Hw,UINT Code,bool Down,int Repeat,UINT Flags)
{
    Key[Code]=Down;
}

LRESULT CALLBACK WndProc(HWND Hw,UINT Msg,WPARAM wParam,LPARAM lParam)
{
    switch(Msg)
    {
        HANDLE_MSG(Hw,WM_DESTROY,OnDestroy);

        HANDLE_MSG(Hw,WM_KEYDOWN,OnKey);
        HANDLE_MSG(Hw,WM_KEYUP,OnKey);

        default: return DefWindowProc(Hw,Msg,wParam,lParam);
    }

    return 0;
}


As an aside, the 98 and XP machines run the game stupidly fast with DDFLIP_NOVSYNC but as you have all suggested I get the tearing effect when I do this. Also for some reason my gamma fade doesn't seem to work when I do this so I assume there must be a better solution. I tried upping the refresh rate on the XP machine yesterday (ducking under the desk as I did so in case it exploded) all the way to like 100 and something but didn't speed anything up.

Thanks for all your help so far though. At least I now know what is causing the problem.

Fraps reckons 60 fps when the monitor refresh is 60 (predictably) and 75 fps when I put the monitor up to 120. Still running slower than on my 98 machine.

I fully understand what the problem is now but still don't get how commercial games get around this problem. Or is it just that this XP machine is rubbish for games?

Sorry - being a bit thick. Just tried upping the animation rate and moving the sprites faster and looks lovely with hardware on and locking to vsync.

Obviously need to have a total rewrite to time-based animation and movement like you all suggested.

Guess I need to look into millisecond timers. Anybody got a good place to start?
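
One possible starting point, sketched under the assumption of a C++11 compiler: a small timer that reports the milliseconds since it was last asked. FrameTimer is a made-up name, and the 100ms clamp is an arbitrary choice (so a long stall, such as an Alt+Tab, doesn't turn into one huge movement step), not something the thread prescribes.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: measures the time between successive frames.
class FrameTimer
{
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameTimer(Clock::time_point start = Clock::now())
        : last_(start) {}

    // Returns the milliseconds elapsed since the previous call (or since
    // construction), clamped to maxMs. The timestamp parameter exists so
    // the timer can be checked with known times.
    double ElapsedMs(Clock::time_point now = Clock::now(),
                     double maxMs = 100.0)
    {
        assert(maxMs > 0.0);
        const std::chrono::duration<double, std::milli> dt = now - last_;
        last_ = now;
        if (dt.count() < 0.0) return 0.0;
        return dt.count() < maxMs ? dt.count() : maxMs;
    }

private:
    Clock::time_point last_;
};
```

Calling ElapsedMs() once at the top of each cycle and scaling all movement by the result is the usual pattern. On Windows, timeGetTime() and QueryPerformanceCounter() are native alternatives if std::chrono isn't available.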
