[DX12] Fences and swap chain Present.

5 comments, last by Dingleberry 8 years ago

Hi,

I am looking at Microsoft's most basic DX12 samples, and I was wondering about the way they use fences to force a wait after the call to "Present".

Isn't "Present" already doing that for us?

I mean, if not, why do we have to pass in a command queue when we create the swap chain?


// ...

// Present the frame.
ThrowIfFailed(m_swapChain->Present(0, 0));

MoveToNextFrame();

// ...

void D3D12Fullscreen::MoveToNextFrame()
{
	// Schedule a Signal command in the queue.
	const UINT64 currentFenceValue = m_fenceValues[m_frameIndex];
	ThrowIfFailed(m_commandQueue->Signal(m_fence.Get(), currentFenceValue));

	// Update the frame index.
	m_frameIndex = m_swapChain->GetCurrentBackBufferIndex();

	// If the next frame is not ready to be rendered yet, wait until it is ready.
	if (m_fence->GetCompletedValue() < m_fenceValues[m_frameIndex])
	{
		ThrowIfFailed(m_fence->SetEventOnCompletion(m_fenceValues[m_frameIndex], m_fenceEvent));
		WaitForSingleObjectEx(m_fenceEvent, INFINITE, FALSE);
	}

	// Set the fence value for the next frame.
	m_fenceValues[m_frameIndex] = currentFenceValue + 1;
}

Maybe I missed something, but I placed a breakpoint inside the "if (m_fence->GetCompletedValue() < m_fenceValues[m_frameIndex])" block and never hit it, presumably because "Present" had already blocked until a buffer was ready.
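A toy model can show why that branch is rarely taken. The sketch below is not D3D12 code: ToyQueue and FramePacer are hypothetical stand-ins for the fence/queue pair and for the sample's MoveToNextFrame bookkeeping, and the round-robin frame index is an assumption used in place of GetCurrentBackBufferIndex().

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for ID3D12Fence plus the queue's outstanding Signal commands.
struct ToyQueue {
    std::uint64_t completed = 0;        // what GetCompletedValue() would report
    std::deque<std::uint64_t> pending;  // Signal values the "GPU" has not reached yet

    void Signal(std::uint64_t value) { pending.push_back(value); }

    // Pretend the GPU finished the oldest outstanding submission.
    void GpuFinishOne() {
        if (!pending.empty()) {
            completed = pending.front();
            pending.pop_front();
        }
    }
};

// Mirrors the per-frame fence bookkeeping of the sample's MoveToNextFrame().
struct FramePacer {
    ToyQueue queue;
    std::vector<std::uint64_t> fenceValues;
    unsigned frameIndex = 0;
    unsigned cpuWaits = 0;  // how often the CPU actually had to block

    explicit FramePacer(unsigned frameCount) : fenceValues(frameCount, 0) {
        assert(frameCount > 0);
        fenceValues[0] = 1;  // first value to be signaled
    }

    void MoveToNextFrame() {
        // Schedule a Signal for the frame that was just submitted.
        const std::uint64_t current = fenceValues[frameIndex];
        queue.Signal(current);

        // Assumption: back buffers are handed out round-robin.
        frameIndex = static_cast<unsigned>((frameIndex + 1) % fenceValues.size());

        // Block only if the GPU has not finished the previous use of this frame.
        if (queue.completed < fenceValues[frameIndex]) {
            ++cpuWaits;
            while (queue.completed < fenceValues[frameIndex]) {
                queue.GpuFinishOne();  // stands in for waiting on the fence event
            }
        }

        fenceValues[frameIndex] = current + 1;
    }
};
```

In this model, if GpuFinishOne() is called after every MoveToNextFrame() (the GPU keeps up, for example because Present is already throttling the CPU), cpuWaits stays at 0. If the GPU never advances on its own, the second call with two frames has to wait.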

Thanks & Cheers,

Shnoutz

OK, I was wrong, the fence is required. If I remove it, I get a ton of error messages.

But my question still stands: why do we need the fence AND have to pass a command queue to the swap chain?


ThrowIfFailed(factory->CreateSwapChainForHwnd(
	m_commandQueue.Get(),		// Swap chain needs the queue so that it can force a flush on it.
	Win32Application::GetHwnd(),
	&swapChainDesc,
	nullptr,
	nullptr,
	&swapChain
	));

The fence is for your command lists, or more precisely for the command allocators that back them -- you can't reset/reuse an allocator while its commands are still sitting in a command queue, because that would screw them up by the time the GPU gets to them.

Your swap chain's command queue, which consumes your command lists, is being throttled by presentations. Your command lists are presumably writing to a swap chain texture, or else you're not going to see anything, and that texture might still be in use, or already written to and waiting to be used.

Therefore, the fences aren't really for the swap chain, they're for your command lists, which just happen to be synchronized by the swap chain if they're using that command queue because they need access to the swap chain's texture resource.

If you're using a different queue that is not associated with a swap chain, that one wouldn't get synchronized by presents. You'd still need a fence of some sort, though, assuming you're reusing command lists.

Note that you can avoid the usage of a fence with a waitable swap chain because if a previous frame's texture is accessible, it should stand to reason that the command list operating on it is also finished.

Thanks for the answer :)

I am mostly curious about that last point you mentioned:

Note that you can avoid the usage of a fence with a waitable swap chain because if a previous frame's texture is accessible, it should stand to reason that the command list operating on it is also finished.

Why does the swap chain need to be waitable for this to work?

They're actually really cool. https://software.intel.com/en-us/articles/sample-application-for-direct3d-12-flip-model-swap-chains

Conceptually, the waitable object can be thought of as a semaphore which is initialized to the Maximum Frame Latency, and signaled whenever a present is removed from the Present Queue. If an application waits for the semaphore to be signalled before rendering then the present queue is not full (so Present will not block), and the latency is eliminated.
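The semaphore analogy in that quote can be sketched in a few lines. This is only a model: LatencyWaitable and its methods are hypothetical names, not DXGI API. In a real application the handle would come from IDXGISwapChain2::GetFrameLatencyWaitableObject on a swap chain created with DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT, and the wait would be a WaitForSingleObjectEx call.

```cpp
#include <cassert>

// Hypothetical model of the frame-latency waitable object as a counting semaphore.
class LatencyWaitable {
public:
    // The count starts at the maximum frame latency, as the quote describes.
    explicit LatencyWaitable(int maxFrameLatency)
        : max_(maxFrameLatency), count_(maxFrameLatency) {
        assert(maxFrameLatency > 0);
    }

    // Non-blocking stand-in for waiting on the handle:
    // returns true if a slot was available and has now been taken.
    bool TryWait() {
        if (count_ == 0) {
            return false;  // present queue is full; a real wait would block here
        }
        --count_;
        return true;
    }

    // Called when a present is removed from the present queue.
    void OnPresentRetired() {
        if (count_ < max_) {
            ++count_;
        }
    }

    int Available() const { return count_; }

private:
    int max_;
    int count_;
};
```

With a maximum frame latency of 2, two TryWait() calls succeed and the third fails until OnPresentRetired() is called, which is the point at which a real wait on the handle would return.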

You're sacrificing throughput, because you could theoretically be writing to a command list while you're instead waiting for the next present to be ready. On the other hand, you're not generating a super early frame from player input that's going to sit around for a while. But if you know that the next present has no chance of blocking, the command list writing to its texture must be finished.

Instead of blocking you could alternatively do some compute task that's not dependent on user input and doesn't need to write to the swap chain.

If you use a waitable swap chain in place of fences, you'll be limited to half the screen refresh rate. Unless you uncap the FPS (which I believe would give you tearing; I haven't done it myself, so I can't speak from experience), Present waits for the screen to refresh before displaying the next frame. Once you finish waiting for the present, the screen starts refreshing again, but Present has no new frame to flip to, since you are still building it. So on one refresh cycle you would display the frame, and on the next you would be filling in the next frame. If you don't mind the FPS being half the refresh rate (FPS would max out at 30 on a 60 Hz monitor), that's not a problem, but you are wasting a lot of time that you could be using to fill in the next frame (I guess it would only be wasted if you had a lot of GPU work to do to build the next frame).

I just actually read all of what Dingleberry said and realized he already mentioned what I just said above. I'll leave this here, though, because it explains it in a little more detail.

You won't necessarily be halving your frame rate; you just have less wiggle room to absorb spikes.

https://software.intel.com/sites/default/files/managed/cd/fb/1_gamemode.png

The yellow blocks are the part where you're blocking on WaitForSingleObject, or alternatively doing some other work that isn't related to using the buffer at the top of the same column. But there's still a perfect stream of presented frames, because in the above example there are three swap chain buffers.
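To put rough numbers on the "half the refresh rate" question, here is a deliberately simplified simulation. Everything in it is an assumption for illustration: FramesShown is a hypothetical helper, time is counted in integer ticks, rendering is treated as one fixed-cost step, and buffer counts and CPU/GPU overlap are ignored. It only models the waitable's slot count and a present queue that is drained by one frame per vsync.

```cpp
#include <cassert>

// Counts how many frames reach the screen during `refreshes` vsync intervals.
// Hypothetical toy model; all times are arbitrary integer ticks.
int FramesShown(int refreshPeriod, int renderTime, int maxLatency, int refreshes) {
    assert(refreshPeriod > 0 && renderTime > 0 && maxLatency > 0);

    int slots = maxLatency;  // semaphore count of the waitable object
    int queued = 0;          // presents sitting in the present queue
    int finishAt = -1;       // tick at which the in-flight frame is done; -1 = app idle
    int shown = 0;

    for (int t = 0; t <= refreshPeriod * refreshes; ++t) {
        // 1. The app finishes its frame and calls Present.
        if (finishAt == t) {
            ++queued;
            finishAt = -1;
        }
        // 2. At each vsync the display takes one queued present and frees a slot.
        if (t > 0 && t % refreshPeriod == 0 && queued > 0) {
            --queued;
            ++shown;
            ++slots;
        }
        // 3. If the app is idle and the waitable is signaled, start the next frame.
        if (finishAt < 0 && slots > 0) {
            --slots;
            finishAt = t + renderTime;
        }
    }
    return shown;
}
```

Under these assumptions, with a 16-tick refresh and a latency of 1, a 10-tick frame yields 10 frames in 10 refreshes, so the rate is not halved. A 20-tick frame yields 5 frames with a latency of 1 and 8 with a latency of 2, which is the "less wiggle room" effect: the rate only drops when a frame overruns the refresh interval and there is no queued present to cover for it.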

