Wicked Ewok

Best way to Vsync on older cards


Hi there,

I've been tuning my game on various cards, including Intel's Extreme Graphics, a GeForce3 and a GeForce 6800 GT. I recently reverted to DX8, and what I have found is that the quality of vsync differs between DX8 and DX9. That's just the start of a bigger problem, so let me explain a little more. I'm mainly interested in tuning the quality of my vsync and preventing tearing in applications that run on older cards.

The main thing is that using D3DSWAPEFFECT_COPY_VSYNC (windowed mode) really chops up the game, and the other solution, testing for the vblank myself, doesn't work very well on older cards because of the latency between checking for the vblank and actually presenting.

DX9 seems to use a higher-resolution timer to swap screens, as described in the DX9 documentation on the present-parameter changes over DX8. You notice this in windowed mode, where D3DSWAPEFFECT_COPY_VSYNC was the only way in DX8 to let the API vsync your application for you. When using D3DSWAPEFFECT_COPY_VSYNC in DX8, my game runs at 60 FPS, except it feels chunky, whereas using the DX9 vsync it doesn't feel choppy at the same 60 FPS.

To fix this, I use the GetRasterStatus function and test for the vertical blank myself before calling Present(). On my 6800 GT with DX8 this works perfectly: the 60 FPS I get does not feel choppy as it did when I let the API use D3DSWAPEFFECT_COPY_VSYNC.

Next I tested this on a GeForce3. To my surprise, I get tearing in the upper portion of the screen. I tested a little more, and it seems that this tearing is a result of the latency between catching a vblank and actually presenting to the screen. When I instead test for maxscanline - 50 before presenting, the tearing moves upward until it disappears, wrapping around to the task bar.

My question is: how do you get smooth vertical sync in windowed mode using DX8 on older cards? D3DSWAPEFFECT_COPY_VSYNC is horrible. Testing for the scanline myself gives a smooth framerate but a useless vsync, since there's tearing.

I've been trying to figure this out for a week now. Any help is greatly appreciated. Thanks a bunch!

Best,
Marvin
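
To make the latency effect concrete, here is a rough Python model (not Direct3D code) of where the beam is when the copy finally lands. Everything in it is an assumption for illustration: a hypothetical 625-line frame (600 visible lines plus 25 blanking lines) at 60 Hz, a made-up 2 ms delay between seeing the target scanline and the copy taking place, and function names invented for this sketch.

```python
def beam_line_at_copy(trigger_line, latency_ms, total_lines=625, refresh_hz=60.0):
    """Scanline the beam has reached once the delayed copy actually happens."""
    lines_per_ms = total_lines * refresh_hz / 1000.0   # 37.5 with the defaults
    return int(trigger_line + latency_ms * lines_per_ms) % total_lines


def tear_is_visible(line, visible_lines=600):
    """A copy that lands on a visible scanline shows up as a tear at that line."""
    return line < visible_lines


# Present requested at the start of vblank (line 600), copy lands 2 ms later:
print(beam_line_at_copy(600, 2.0))   # 50  -> tear near the top of the screen
# Requesting 50 lines earlier moves the tear up to the very top edge:
print(beam_line_at_copy(550, 2.0))   # 0
# Requesting 70 lines earlier lands the copy inside the blanking interval:
print(beam_line_at_copy(530, 2.0))   # 605 -> not visible
```

In this model the lead that hides the tear depends entirely on the latency, which would explain why an offset that works on one card still tears on another.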

I would have thought one wouldn't need to mess with GetRasterStatus to get vsync to work, regardless of the version of DirectX, or the age of the card (within reason).

I can only assume you are confirming the actual framerates you are getting by displaying your FPS. You say:
Quote:

The main thing is that using D3DSWAPEFFECT_COPY_VSYNC (windowed mode) really chops up the game... ...When using D3DSWAPEFFECT_COPY_VSYNC in DX8, my game runs at 60 FPS, except it feels chunky, whereas using the DX9 vsync it doesn't feel choppy at the same 60 FPS.

It could be that D3DSWAPEFFECT_COPY_VSYNC is significantly slower, so much so that your rendering isn't finished before each vertical refresh and has to wait for the next one. You could be going from 60 right down to 30 because of this.
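
As a rough illustration of that drop, here is a small Python sketch. It assumes plain double buffering, where a frame that misses a refresh simply waits for the next one; the function name and the frame times are hypothetical.

```python
import math


def vsynced_fps(frame_ms, refresh_hz=60.0):
    """Frame rate when every frame has to wait for the next vertical refresh."""
    refresh_ms = 1000.0 / refresh_hz
    # Number of whole refresh intervals one frame occupies (at least one).
    intervals = max(1, math.ceil(frame_ms / refresh_ms))
    return refresh_hz / intervals


print(vsynced_fps(10.0))   # 60.0: finished well inside one refresh
print(vsynced_fps(17.0))   # 30.0: just missed, so it waits a whole extra refresh
print(vsynced_fps(34.0))   # 20.0
```

A frame that takes only slightly longer than one refresh interval (about 16.7 ms at 60 Hz) halves the rate in this model, so a small extra cost in the copy could be enough to notice.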

I notice that the DX9 docs make no mention of D3DSWAPEFFECT_COPY_VSYNC. Have you considered using a combination of D3DSWAPEFFECT_FLIP and D3DPRESENT_INTERVAL_ONE in any event? (I know the docs mention that it is implemented as a copy in windowed mode.)

Edit: Even better, see below...

Actually, even better would be D3DSWAPEFFECT_DISCARD. From the Dx8.1 docs on MSDN:

Quote:

When a swap chain is created with a swap effect of D3DSWAPEFFECT_FLIP, D3DSWAPEFFECT_COPY or D3DSWAPEFFECT_COPY_VSYNC, the runtime will guarantee that a IDirect3DDevice8::Present operation will not affect the content of any of the back buffers. Unfortunately, meeting this guarantee can involve substantial video memory or processing overheads, especially when implementing flip semantics for a windowed swap chain or...

In other words, D3DSWAPEFFECT_FLIP is probably the worst for windowed mode.

Quote:

D3DSWAPEFFECT_DISCARD swap effect to avoid these overheads and to enable the display driver to select the most efficient presentation technique for the swap chain... ...An application that uses this swap effect cannot make any assumptions about the contents of a discarded back buffer and should therefore update an entire back buffer before invoking a Present operation that would display it...


Use in combination with D3DPRESENT_INTERVAL_ONE for vsync, or D3DPRESENT_INTERVAL_IMMEDIATE for no vsync.

Are you waiting for the actual window's final scanline or for the end of the screen? If it's the end of the screen, waiting for the window's last scanline instead is a quick fix that should improve matters, unless the window is maximized.

Still, I can't see any guaranteed way of determining when the actual back buffer copy happens. Perhaps it's more likely to occur near the end of the Present call than towards the beginning. If so, you could try progressively adjusting how many scanlines early you call Present, by checking whether the current scanline has passed the window's upper edge when Present returns.

It probably won't work but it may be worth a shot..
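
A rough Python model of that feedback idea, not Direct3D code: start by requesting Present when the beam reaches the window's bottom edge, then request it a little earlier each frame for as long as the beam is found inside the window when Present returns. The 800-line frame, the window edges, the fixed latency (expressed in scanlines) and all function names are assumptions made up for this sketch; on real hardware the latency would vary from frame to frame.

```python
def line_at_return(trigger_line, latency_lines, total_lines):
    """Scanline the beam has reached when the (modeled) Present call returns."""
    return (trigger_line + latency_lines) % total_lines


def adjust_lead(lead, returned_line, window_top, window_bottom, step=8):
    """Request Present earlier if it returned with the beam inside the window."""
    if window_top <= returned_line < window_bottom:
        return lead + step
    return lead


def settle_lead(latency_lines, window_top, window_bottom, total_lines, frames=100):
    """Run the feedback loop for a number of frames and return the final lead."""
    lead = 0
    for _ in range(frames):
        trigger = (window_bottom - lead) % total_lines
        returned = line_at_return(trigger, latency_lines, total_lines)
        lead = adjust_lead(lead, returned, window_top, window_bottom)
    return lead


# Window occupying lines 84..683 of a hypothetical 800-line frame.
print(settle_lead(250, 84, 684, 800))   # 56: a slow card has to start early
print(settle_lead(10, 84, 684, 800))    # 0: a fast card needs no lead at all
```

The loop only ever grows the lead, so it stops at the first value where Present returns with the beam above the window. A real version would also need a way to shrink the lead again and to cope with jitter.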

Hi guys, thanks for your input.

Hi Ro_Akira: D3DPRESENT_INTERVAL_ONE has no effect in windowed mode in DirectX 8. The one and only way to vsync in DX8 is to use D3DSWAPEFFECT_COPY_VSYNC, and with it on I sometimes get framerates over 60 and sometimes below. It seems to miss the scanline a lot, or something like that. An old "DX9 changes over DX8" document from Microsoft mentioned removing D3DSWAPEFFECT_COPY_VSYNC and improving the timer resolution of vsyncing.

Hey Doynax: you are correct on all marks, testing for a scanline just before the vblank will cause the latency to time correctly right at the Vblank. But this works differently for each card. My GF6800 almost has no latency between testing for VBlank and finishing the copy to front buffer using present. My GF3 however, experiences a lot of latency. It is also true that it might never be possible to have zero latency, because Windows might not give the thread enough time slices for Present() to finish quickly enough to hit the vertical blank. Unless this is done in hardware (via the API), it is never going to be a guarantee :(

I looked around more on the Web, and there seems to be a consensus that D3DSWAPEFFECT_COPY_VSYNC gives bad vsync quality and that there is no solution for DX8 in windowed mode. Right now I'm creating my 800x600 window in the center of the screen, so the latency does not show up as tearing inside the window, provided the user's resolution is higher than 800x600 and the latency isn't so bad that the beam covers more than half the screen before the copy actually occurs. I'm still fishing for a real solution, but I'm not sure I'll get one. Thanks a lot.

Best,
Marv
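
For a sense of how much slack the centered window buys, here is a back-of-the-envelope Python calculation. It assumes Present is requested right at the start of vblank and that the copy is safe until the beam reaches the window's top edge; the total line counts (800 and 625) are hypothetical round numbers rather than real display timings, and the function name is invented for this sketch.

```python
def latency_budget_ms(screen_h, window_h, total_lines, refresh_hz=60.0):
    """Time from the start of vblank until the beam reaches a centered window."""
    top_margin = (screen_h - window_h) // 2     # desktop lines above the window
    blanking = total_lines - screen_h           # lines the beam spends in vblank
    lines_per_ms = total_lines * refresh_hz / 1000.0
    return (blanking + top_margin) / lines_per_ms


# 800x600 window centered on a 768-line desktop (800 total lines assumed):
print(round(latency_budget_ms(768, 600, 800), 2))   # 2.42
# Same window on a 600-line desktop (625 total lines assumed): no margin left.
print(round(latency_budget_ms(600, 600, 625), 2))   # 0.67
```

Under these assumptions the margin above the window only adds a couple of milliseconds, so a card with more latency than that would still tear inside the window.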

Quote:
Original post by Wicked Ewok
Hey Doynax: you are correct on all marks, testing for a scanline just before the vblank will cause the latency to time correctly right at the Vblank. But this works differently for each card. My GF6800 almost has no latency between testing for VBlank and finishing the copy to front buffer using present. My GF3 however, experiences a lot of latency.

Hm.. I think you may have misunderstood my second remark.
Anyway, my point was that *maybe* the actual copying is more likely to occur towards the end of the Present call than towards the beginning. And thus maybe you should try synchronizing the call's return to the end of vblank, rather than its entry to the beginning of vblank.

Just an unconventional idea..

Choppy framerates can come from a few places.

Drawing without VSYNC, and outpacing the GPU. In this case, you quickly buffer 3 frames, then rendering stalls until the queue empties a bit. You draw 2 frames, stall again, and repeat. Here a few frames are calculated for almost the same point in time, then there's a gap while the GPU catches up, then you submit another couple of frames, again calculated for almost the same time.

One way to avoid this is to lock something the GPU is using... the backbuffer. Draw everything, take advantage of parallelism by doing all your new frame calculations while drawing occurs, then lock the backbuffer to ensure the drawing commands from the last frame are all flushed, present, and repeat. You may even be able to push the lock after the present. This should ensure you can get 1 frame ahead in drawing, but not two.

I think both nVidia and ATI mentioned this technique a few years back in their whitepapers. Hope it's useful.
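
The stall pattern and the effect of limiting the queue can be reproduced with a toy Python simulation. This is only a sketch under stated assumptions: the CPU needs 1 ms per frame, the GPU needs 10 ms, and a full queue blocks the CPU until it has drained to `resume_at` frames (a guess at driver behaviour chosen to match the description above, not something the thread confirms). All names are made up for the example.

```python
def sample_times(n_frames, cpu_ms, gpu_ms, max_queued, resume_at):
    """Times at which the CPU samples the game clock for each frame."""
    done = []        # GPU completion time of every submitted frame (ascending)
    gpu_free = 0.0   # time at which the GPU finishes its current backlog
    now = 0.0
    times = []
    for _ in range(n_frames):
        times.append(now)            # game state for this frame is sampled here
        now += cpu_ms                # CPU builds the frame
        pending = [t for t in done if t > now]
        if len(pending) >= max_queued:
            # Queue is full: stall until only `resume_at` frames remain queued.
            now = pending[len(pending) - resume_at - 1]
        gpu_free = max(now, gpu_free) + gpu_ms
        done.append(gpu_free)
    return times


# Up to 3 frames queued, CPU resumes once 1 is left: samples arrive in clumps.
print(sample_times(9, 1.0, 10.0, 3, 1))
# [0.0, 1.0, 2.0, 3.0, 21.0, 22.0, 41.0, 42.0, 61.0]

# At most 1 frame ahead (the lock-the-backbuffer idea): evenly spaced samples.
print(sample_times(6, 1.0, 10.0, 1, 0))
# [0.0, 1.0, 11.0, 21.0, 31.0, 41.0]
```

In both runs the GPU finishes a frame every 10 ms, but in the first one the displayed frames were computed for times 1 ms apart followed by a gap of about 19 ms, which is the kind of unevenness that would feel choppy at a steady frame rate.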
