Emulate FAST triple buffering in Direct3D9

6 comments, last by SoldierOfLight 6 years, 6 months ago

Hi,

How can I emulate triple buffering in Direct3D9, so that I can have vsync with high fps?

DWM does it in windowed mode; why has nobody done it in fullscreen mode?


There's no reason you can't have vsync and high fps.

Can you explain your current setup (swap chain configuration, present mode, etc.), what you expect to happen, and what is actually happening?

Triple buffering doesn't improve framerates in general; it just smooths out a jittery framerate and absorbs occasional frame-time spikes.

Do you know what your current GPU time per frame is, and CPU time per frame not including Present?

Can you also explain what you mean by emulating triple buffering? Either your rendering occasionally halts to wait for the previous frame to be presented in full before reusing its buffer, or it doesn't wait because it writes to a different buffer. There's no middle ground.

Omae Wa Mou Shindeiru

Present is blocking 10x longer than it does in immediate mode. The D3DPRESENT_DONOTWAIT flag doesn't work either.

It isn't jittery; the problem is just the latency from Present blocking.

Default render target and swap chain.

If your GPU time is normally 1.66ms and you turn on vsync with a 60Hz display, your GPU frame time gets rounded up to 16.6ms. The CPU then starts getting too far ahead of the GPU, so D3D blocks the CPU inside Present for about 15ms more than it used to (roughly 10x longer).
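To make that arithmetic concrete, here is a minimal sketch in plain C++ (no D3D calls). The function names are made up for illustration, and it assumes the simplest model: every frame is held until the next vblank, so the frame time is rounded up to a whole number of refresh intervals.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: effective frame period when each frame must wait for
// the next vblank. The GPU time is rounded up to a whole number of refresh
// intervals (16.67 ms at 60 Hz).
double vsyncedFrameMs(double gpuMs, double refreshHz) {
    assert(gpuMs > 0.0 && refreshHz > 0.0);
    const double intervalMs = 1000.0 / refreshHz;
    return std::ceil(gpuMs / intervalMs) * intervalMs;
}

// Hypothetical helper: extra time per frame that has to be absorbed
// somewhere, which in practice shows up as the CPU blocking inside Present
// once the frame queue is full.
double extraBlockMs(double gpuMs, double refreshHz) {
    return vsyncedFrameMs(gpuMs, refreshHz) - gpuMs;
}
```

With gpuMs = 1.66 and refreshHz = 60, the frame period comes out at roughly 16.67ms and the extra wait at roughly 15ms, which matches the numbers above.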

What you're describing could be normal/expected. 

Can you post some timings, your swap chain configuration, and what you expect to happen? 

I expect double-buffering from vsync, where it just blocks. The code acts exactly as it's supposed to. That's why I want a solution to the normal broken design.

I want true triple-buffering vsync, where it doesn't block but copies to another buffer, so I retain high fps. For example, Windows 10 via DWM borderless fullscreen has triple-buffering vsync with high fps, but it's still not as fast as true triple buffering would be in exclusive mode.

The solution you're looking for is difficult to build. What you want is to decide, every VSync, which frame to scan out, based on which frame was most recently completed. The way that DWM accomplishes this is that every VSync, they wake up, look at what's most recently completed, and then copy/compose it into another surface and schedule it to be scanned out on the next VSync. This adds an extra copy and an extra frame of latency.
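As a rough illustration of that "pick the most recently completed frame at each VSync" step, here is a small simulation in plain C++. It is only a sketch: the function name and the timing model are assumptions for the example, not how DWM is actually implemented, and the copy/compose step itself is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simulation: for each vsync k = 1..vsyncCount (at time
// k * periodMs), return the index of the most recently completed frame, or -1
// if nothing has completed yet. doneMs holds frame completion times and must
// be sorted ascending. In the scheme described above, the picked frame would
// then be copied and scanned out on the following vsync.
std::vector<int> pickLatestCompleted(const std::vector<double>& doneMs,
                                     double periodMs, int vsyncCount) {
    assert(periodMs > 0.0);
    std::vector<int> picks;
    std::size_t next = 0;  // first frame not yet known to be complete
    for (int k = 1; k <= vsyncCount; ++k) {
        const double now = k * periodMs;
        while (next < doneMs.size() && doneMs[next] <= now) ++next;
        picks.push_back(static_cast<int>(next) - 1);
    }
    return picks;
}
```

With frames completing every 5ms and a 16ms VSync period, the first VSync picks frame index 2 (completed at 15ms) and simply skips frames 0 and 1; the renderer never had to wait for them to be shown.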

Removing the extra frame of latency is possible if you wake up *before* the VSync instead of after, with enough time buffered to schedule the copy and have it complete right before the VSync. As it turns out, this is pretty difficult. Now that we've published some implementation details of Windows Mixed Reality via PresentMon, I can tell you that this is pretty much how it works, and it's very complicated.

Trying to remove the copy is also very difficult, because now not only do you need to decide what to flip based on what's completed, but now you need to decide what to render to based on when previous rendering completed, which means that you can't get any CPU/GPU parallelism or frame queueing. If you just render to the resources in-order, eventually rendering will block because you'll be trying to render to the on-screen surface. Using a copy here prevents this.
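For reference, the copy-free bookkeeping looks roughly like the sketch below. This is plain C++ with made-up names, not D3D9 API, and it assumes you get a reliable "rendering finished" notification for the back buffer before you pick the next render target, which is exactly the loss of frame queueing described above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical index bookkeeping for copy-free triple buffering.
struct TripleBuffer {
    int front = 0;       // buffer currently being scanned out
    int pending = 1;     // most recently completed frame, waiting for a vsync
    int back = 2;        // buffer currently being rendered to
    bool fresh = false;  // true if pending holds a frame newer than front

    // Call only once rendering into `back` has fully completed on the GPU.
    // The finished frame becomes pending; any older pending frame is dropped
    // and its buffer is reused as the next render target.
    void onRenderComplete() {
        std::swap(back, pending);
        fresh = true;
    }

    // Call at each vsync. Flips to the pending frame if it is newer, and
    // returns the index of the buffer to scan out.
    int onVsync() {
        if (fresh) {
            std::swap(front, pending);
            fresh = false;
        }
        assert(front != back);  // never render to the on-screen buffer
        return front;
    }
};
```

The renderer never blocks because `back` is never the on-screen buffer, but the choice of `back` depends on completion events, so the CPU can't queue frames ahead of the GPU.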

Note that I think NVIDIA does have an implementation of this, called Fast Sync, that they've implemented in hardware and their driver. I don't really have any technical details on how they made it work, but I have to imagine it's pretty complicated as well.
