
How come changing DXGI_SWAP_CHAIN_DESC.BufferCount has no effect?


So I've been trying to implement triple-buffering in my application by changing the BufferCount* parameter of DXGI_SWAP_CHAIN_DESC, but regardless of what I set it to, there is no detectable change in the performance or latency of my application. Let me elaborate...

 

I would expect that increasing the number of swap chain buffers would lead to an increase in latency. So I started experimenting: First, I added a 50ms sleep to every frame so as to artificially limit the FPS to about 20. Then I tried setting BufferCount to 1, 2, 4, 8, and 16 (the highest it would go without crashing) and tested latency by moving my game's camera. With a BufferCount of 1 and an FPS of ~19, my game was choppy but otherwise had low latency. Now, with a BufferCount of 16 I would expect 16 frames of latency, which at ~19 FPS is almost a whole second of lag. Certainly this should be noticeable just moving the game camera, but there was no more latency than there was with a BufferCount of 1. (And none of the other values I tried had any effect either.)
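For reference, here's roughly the kind of setup I'm describing. This is a simplified sketch rather than my actual code; hwnd, device, and factory are placeholders, and most fields and error checks are omitted:

// Simplified sketch of the swap chain setup I've been experimenting with.
DXGI_SWAP_CHAIN_DESC desc = {};
desc.BufferCount = 16;                              // the value I've been varying: 1, 2, 4, 8, 16
desc.BufferDesc.Width = 1920;                       // placeholder resolution
desc.BufferDesc.Height = 1080;
desc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc.BufferDesc.RefreshRate.Numerator = 60;
desc.BufferDesc.RefreshRate.Denominator = 1;
desc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
desc.OutputWindow = hwnd;
desc.SampleDesc.Count = 1;
desc.SwapEffect = DXGI_SWAP_EFFECT_DISCARD;         // also tried SEQUENTIAL, FLIP_SEQUENTIAL, FLIP_DISCARD
desc.Windowed = FALSE;                              // (exclusive?) fullscreen

IDXGISwapChain* swapChain = nullptr;
factory->CreateSwapChain(device, &desc, &swapChain);

// Per frame:
Sleep(50);                  // artificial delay to hold the FPS around 20
// ... render the frame ...
swapChain->Present(1, 0);   // SyncInterval = 1 (vsync); I've also tested 0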

 

Another possibly-related thing that's confusing me: I read that with vsync on (and no triple-buffering), the FPS should be locked to an integer divisor of your monitor's refresh rate (e.g., 60, 30, 20, 15, etc.), since any frame that takes longer than a vertical blank needs to wait until the next one before being presented. And indeed, when I give Present a SyncInterval of 1, my FPS is capped at 60. But my FPS does *not* drop to 30 once a frame takes longer than 1/60 of a second, as I would expect; if I get about 48 FPS with vsync off then I still get about 48 FPS with vsync on. (And no, this isn't an artifact of averaging frame times. I'm recording individual frame times and they're all very stable at around 1/48 second. I've also checked my GPU settings for any kind of adaptive vsync but couldn't find any.)

 

More details:

I'm testing this in (I think exclusive) fullscreen, though I've tested in windowed mode as well. (I've fixed all the DXGI runtime warnings about fullscreen performance issues, so I'm pretty sure I have my swap chain configured correctly.)

If it matters, I'm using DXGI_SWAP_EFFECT_DISCARD (but have tested SEQUENTIAL, FLIP_SEQUENTIAL, and FLIP_DISCARD with no apparent effect).

I've tried calling Present with a SyncInterval of both 0 (no vsync) and 1 (vsync every vertical blank). Using 1 adds small but noticeable latency as one would expect, but increasing BufferCount doesn't add to it.

I've tested on three computers: one with a GTX 970, one with a mobile Radeon R9 M370X, and one virtual machine running on VirtualBox. All exhibit the same behavior (or lack thereof).

 

So can anyone explain why I'm not seeing any change in latency or locking to 60/30/20/... FPS with vsync on? Am I doing something wrong? Am I not understanding how swap chains work? Is the graphics driver being too clever?

 

Thanks for your help!

 

*(As an aside, does anyone know for sure what I *should* be setting BufferCount to for double- and triple-buffering? In some places I've read that it should be set to 1 and 2 respectively, but other places say 2 and 3.)


I could be wrong here, as I'm far from an expert, but depending on your scene complexity the GPU may very well have enough time to draw and upload those buffers to the display. I would try limiting the FPS through scene complexity first, with a standard double-buffered swap chain, then try it with higher buffer counts. As far as buffer counts go, for double buffering you should be setting DXGI_SWAP_CHAIN_DESC::BufferCount = 2.


Jesse, this is by far the clearest and most informative explanation I've read on the internet or in a book on how BufferCount and MaxFrameLatency work. (And the video was useful too.) Thank you so much! If I could upvote you a thousand times, I would! I hope lots of other people find your explanation as useful as I have.

 

There are still a few things that I'm puzzled by:

 

1. As a test, I removed the Sleep(50) call every frame and instead looped part of my rendering code 30 times (a part that uses very little CPU but draws lots of pixels), which brings my FPS down to about 20. (90% of my frame time is now spent in Present, so I'm pretty sure I'm now GPU-limited.) Setting BufferCount to 16 had no noticeable effect, which now makes sense given your explanation (since this is GPU-limited, not vsync-limited). I also tried setting MaxFrameLatency to 16 (how I'm setting it is sketched below, after question 3), which, if I understand correctly, should introduce 16 frames of latency since my CPU can execute so much faster than my GPU? But again, I'm seeing no added latency, which should be quite obvious at ~20 FPS, correct? Am I misunderstanding something? (I also tried PresentMon, which reports ~130ms of latency regardless of how I set MaxFrameLatency.)

 

2. I've been using a BufferCount of 1 in full-screen with no obvious ill-effect. Will the driver automatically increase it to 2 if I specify 1 in full-screen mode? Or maybe I'm not actually running in exclusive full-screen? (Is there any way to check that? PresentMon's CSV says "Hardware: Legacy Flip" if that's at all relevant.)

 

3. Now that I have my game GPU-limited, I am seeing my FPS locked to 60/30/20/15/etc when vsync is on. Why don't I see the same behavior when my game is CPU-limited? (And yeah, I've set MaxFrameLatency to 1.)
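(Referring back to question 1: this is roughly how I'm setting the frame latency — a simplified sketch with error checking omitted, where device stands in for my ID3D11Device.)

// Sketch of how I'm calling SetMaximumFrameLatency through the DXGI device interface.
IDXGIDevice1* dxgiDevice = nullptr;
device->QueryInterface(__uuidof(IDXGIDevice1), reinterpret_cast<void**>(&dxgiDevice));
dxgiDevice->SetMaximumFrameLatency(16);   // I've also tried 1
dxgiDevice->Release();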

 

Thanks again!


1. What swap effect are you using? Are you windowed or fullscreen? I would expect SetMaximumFrameLatency(16) to allow 16 frames of latency, though it's possible that the driver might be intervening. You should double-check your driver's control panel to make sure that relevant settings are indeed controlled by the app. You might also try using D3D11_CREATE_DEVICE_PREVENT_INTERNAL_THREADING_OPTIMIZATIONS on your device. I've confirmed that with a simple app that renders at 60fps with VSync on, I can hit 300ms of latency (according to PresentMon).

 

2. I think the DISCARD swap effect might get auto-upgraded to 2 in fullscreen. The 1-buffer fullscreen SEQUENTIAL swapchain ends up using a copy for presentation, and shows up in PresentMon as "Hardware: Legacy Copy to front buffer."

 

3. I think I need to explain this one by example. Let's start by assuming that the CPU queue is always full and you're GPU-bound.

  • If your frames take 15ms on the GPU, you can produce one frame per VSync. You render a frame, it gets queued, and flips on the next VSync. You never build up presents to become VSync-bound. For two frames, you get 15ms of work, 1ms idle, 15ms work, 1ms idle.
  • If your frames take 17ms on the GPU, it takes a little longer than a VSync to produce a frame. You render a frame, and it gets queued to flip on the next VSync. Meanwhile, the GPU is now sitting there idle, waiting for that flip to happen, because it needs to start writing to the texture that's on-screen. So for two frames you have 17ms of work, 15ms idle, 17ms of work, 15ms idle.

If you add up the work/idle time per frame, the first scenario is 16ms (60hz), the second is 33ms (30hz).

 

But now what if you can do the first 10ms of work without touching the back buffer? Those two scenarios now become:

  • Your frames take 15ms on the GPU. No change.
  • Your frames take 17ms on the GPU. Your times for a few frames are now: 17ms work frame 0, 10ms work frame 1, 5ms idle, 7ms work frame 1, 10ms work frame 2, 10ms idle, 7ms work frame 2, 10ms work frame 3...

Now scenario 2 ends up averaging 27ms per frame, or 37hz.

 

So the quantization all comes down to how long the GPU is idle waiting for a VSync, and what that does to your overall frame time. When you're CPU-bound, any time the GPU spends waiting for a buffer to come off-screen isn't affecting your overall framerate (if it was, you'd end up GPU-bound).
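If it helps, here's the simple case reduced to a formula — a rough sketch that assumes every millisecond of GPU work needs the back buffer and the present queue stays full (i.e., it ignores the partial-overlap case above):

// Rough model of the quantization: frame time rounds up to a whole number of VSync intervals.
#include <cmath>
#include <cstdio>
#include <initializer_list>

int main()
{
    const double vsyncMs = 1000.0 / 60.0;   // ~16.7ms at 60Hz
    for (double gpuMs : { 15.0, 17.0, 21.0 })
    {
        double frameMs = std::ceil(gpuMs / vsyncMs) * vsyncMs;
        std::printf("%.0fms of GPU work -> %.1fms per frame (%.0f fps)\n",
                    gpuMs, frameMs, 1000.0 / frameMs);
    }
}
// Prints: 15ms -> 16.7ms (60 fps), 17ms -> 33.3ms (30 fps), 21ms -> 33.3ms (30 fps)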


Again, a very clear explanation that really helps me understand what's going on. Thanks!

 

1. I'm using DXGI_SWAP_EFFECT_DISCARD in full-screen with vsync on and have experimented with both 2 and 3 buffers. No significant difference in average latency between SetMaximumFrameLatency(1) and SetMaximumFrameLatency(16) according to PresentMon. D3D11_CREATE_DEVICE_PREVENT_INTERNAL_THREADING_OPTIMIZATIONS had no effect either. So maybe my driver is overriding this setting? There aren't many options to tweak in my AMD control panel. I can try on my other computer (a GTX 970) later.
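In case I'm passing the flag incorrectly, this is roughly how I'm creating the device — a simplified sketch with error handling omitted:

// Sketch of my device creation with the suggested flag.
UINT flags = D3D11_CREATE_DEVICE_PREVENT_INTERNAL_THREADING_OPTIMIZATIONS;
D3D_FEATURE_LEVEL featureLevel;
ID3D11Device* device = nullptr;
ID3D11DeviceContext* context = nullptr;
D3D11CreateDevice(nullptr,                    // default adapter
                  D3D_DRIVER_TYPE_HARDWARE,
                  nullptr,                    // no software rasterizer module
                  flags,
                  nullptr, 0,                 // default feature levels
                  D3D11_SDK_VERSION,
                  &device, &featureLevel, &context);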


FYI, I have confirmed that AMD inserts a wait in their driver to enforce a maximum frame latency of 3. This is unexpected and I'll be following up with them to see what's going on. You can experiment with getting a deeper queue using another GPU or WARP. It might be easier with WARP since the "GPU" is so much slower.
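Something like this should get you a WARP device to test with — a minimal sketch; just swap it in where you normally create your hardware device:

// Create a WARP (software rasterizer) device instead of going through the vendor driver.
ID3D11Device* device = nullptr;
ID3D11DeviceContext* context = nullptr;
D3D11CreateDevice(nullptr,
                  D3D_DRIVER_TYPE_WARP,   // software device; no GPU driver in the loop
                  nullptr, 0,
                  nullptr, 0,
                  D3D11_SDK_VERSION,
                  &device, nullptr, &context);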


I finally got around to testing on my GTX 970, and I can confirm that SetMaximumFrameLatency(16) does indeed create the expected latency, unlike on my laptop's M370X. I'm certainly curious why AMD limits it to 3.

 

Thanks again for your help!
