nbertoa

[D3D12] Swapchain::present() Glitches


Hi community

 

I am experiencing some artifacts when presenting swap chain buffers. They appear when I move the camera and then disappear; they look like horizontal lines (apparently a VSync problem). I am pretty sure this is related to the swap chain.

 

I created this video so the flickering is more noticeable.

 

I create the swap chain in the following way:

void D3dData::CreateSwapChain(const HWND hwnd, ID3D12CommandQueue& cmdQueue) noexcept {
	IDXGISwapChain* swapChain{ nullptr };

	DXGI_SWAP_CHAIN_DESC sd = {};
	sd.BufferDesc.Width = Settings::sWindowWidth;
	sd.BufferDesc.Height = Settings::sWindowHeight;
	sd.BufferDesc.RefreshRate.Numerator = 60U;
	sd.BufferDesc.RefreshRate.Denominator = 1U;
	sd.BufferDesc.Format = Settings::sBackBufferFormat;
	sd.BufferDesc.ScanlineOrdering = DXGI_MODE_SCANLINE_ORDER_UNSPECIFIED;
	sd.BufferDesc.Scaling = DXGI_MODE_SCALING_UNSPECIFIED;
	sd.SampleDesc.Count = 1U;
	sd.SampleDesc.Quality = 0U;
	sd.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
	sd.BufferCount = Settings::sSwapChainBufferCount;
	sd.OutputWindow = hwnd;
	sd.Windowed = true;
	sd.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD;
	sd.Flags = 0;

	// Note: Swap chain uses queue to perform flush.
	CHECK_HR(D3dData::mDxgiFactory->CreateSwapChain(&cmdQueue, &sd, &swapChain));

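	CHECK_HR(swapChain->QueryInterface(IID_PPV_ARGS(D3dData::mSwapChain.GetAddressOf())));
	// mSwapChain now holds its own reference; release the temporary one to avoid a leak.
	swapChain->Release();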

	// Set sRGB color space
	D3dData::mSwapChain->SetColorSpace1(DXGI_COLOR_SPACE_RGB_FULL_G22_NONE_P709);
	D3dData::mSwapChain->SetFullscreenState(Settings::sFullscreen, nullptr);

	// Resize the swap chain.
	CHECK_HR(D3dData::mSwapChain->ResizeBuffers(Settings::sSwapChainBufferCount, Settings::sWindowWidth, Settings::sWindowHeight, Settings::sBackBufferFormat, 0U));
}

The number of queued frames is equal to Settings::sSwapChainBufferCount - 1. 

Settings::sFullscreen is true.

As you can see, I run in fullscreen mode, with VSync and the FLIP_DISCARD swap effect.

 

If the swap chain buffer count is 2 and I present with a sync interval of 0 (Present(0, 0)), I see these artifacts (like a VSync problem). With a sync interval of 1 (Present(1, 0)), the artifacts no longer appear.

 

If I increase the number of swap chain buffers, I see those artifacts unless I use a sync interval of Settings::sSwapChainBufferCount - 1, but then the camera feels slow to respond when I move it. You can see it in this video.

 

I tried changing to FLIP_SEQUENTIAL, windowed mode, etc., but nothing worked.

 

What could be the error? I want to make sure I fully understand this, because maybe I am missing something about VSync, the swap chain, Present, etc.

 

Thanks! 

 


Are you using fences to make sure that the GPU has finished each frame before you reuse that frame's resources?


Yes, I think I am using fences correctly. To be sure, I replaced my fence mechanism with FlushCommandQueue() (which basically does not begin rendering the next frame until the GPU has completely executed the current one) and I had the same problem.


Update:

 

I was reading the following articles about swap chain, latency, frames, etc

 

https://software.intel.com/en-us/articles/sample-application-for-direct3d-12-flip-model-swap-chains#

https://msdn.microsoft.com/en-us/library/windows/apps/ms687036.aspx

https://msdn.microsoft.com/en-us/library/windows/desktop/hh404557(v=vs.85).aspx

https://developer.nvidia.com/dx12-dos-and-donts#swapchains

https://code.msdn.microsoft.com/windowsapps/DirectXLatency-sample-a2e2c9c3/sourcecode?fileId=86652&pathId=155120175

 

and I modified my code:

 

 

Swap chain creation

	void CreateSwapChain(const HWND hwnd, ID3D12CommandQueue& cmdQueue, Microsoft::WRL::ComPtr<IDXGISwapChain3>& swapChain3) noexcept {
		IDXGISwapChain1* baseSwapChain{ nullptr };

		DXGI_SWAP_CHAIN_DESC1 sd = {};
		sd.AlphaMode = DXGI_ALPHA_MODE_UNSPECIFIED;
		sd.BufferCount = Settings::sSwapChainBufferCount;
		sd.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
		sd.Flags = DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT;
		sd.Format = MasterRender::BackBufferFormat();
		sd.SampleDesc.Count = 1U;
		sd.Scaling = DXGI_SCALING_NONE;
		sd.Stereo = false;
		sd.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD;

		CHECK_HR(D3dData::Factory().CreateSwapChainForHwnd(&cmdQueue, hwnd, &sd, nullptr, nullptr, &baseSwapChain));
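		CHECK_HR(baseSwapChain->QueryInterface(IID_PPV_ARGS(swapChain3.GetAddressOf())));
		// swapChain3 now holds its own reference; release the temporary one to avoid a leak.
		baseSwapChain->Release();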

		CHECK_HR(swapChain3->ResizeBuffers(Settings::sSwapChainBufferCount, Settings::sWindowWidth, Settings::sWindowHeight, MasterRender::BackBufferFormat(), sd.Flags));

		// Set sRGB color space
		swapChain3->SetColorSpace1(DXGI_COLOR_SPACE_RGB_FULL_G22_NONE_P709);

		// Make window association
		CHECK_HR(D3dData::Factory().MakeWindowAssociation(hwnd, DXGI_MWA_NO_WINDOW_CHANGES | DXGI_MWA_NO_ALT_ENTER | DXGI_MWA_NO_PRINT_SCREEN));

		CHECK_HR(swapChain3->SetMaximumFrameLatency(Settings::sQueuedFrameCount));
	}

and SignalFenceAndPresent()

void MasterRender::SignalFenceAndPresent() noexcept {
	ASSERT(mSwapChain != nullptr);
	static const HANDLE frameLatencyWaitableObj(mSwapChain->GetFrameLatencyWaitableObject());
	const DWORD result(WaitForSingleObjectEx(frameLatencyWaitableObj, 0U, true));
	if (result == WAIT_TIMEOUT) {
 		return;
	}

	CHECK_HR(mSwapChain->Present(0U, 0U));

	// Add an instruction to the command queue to set a new fence point.  Because we 
	// are on the GPU time line, the new fence point won't be set until the GPU finishes
	// processing all the commands prior to this Signal().
	mFenceByQueuedFrameIndex[mCurrQueuedFrameIndex] = ++mCurrentFence;
	CHECK_HR(mCmdQueue->Signal(mFence, mCurrentFence));
	mCurrQueuedFrameIndex = (mCurrQueuedFrameIndex + 1U) % Settings::sQueuedFrameCount;	

	// If command lists were executed for all queued frames, wait for at least
	// one of them to complete before generating more command lists.
	const std::uint64_t oldestFence{ mFenceByQueuedFrameIndex[mCurrQueuedFrameIndex] };
	if (mFence->GetCompletedValue() < oldestFence) {
		// Unnamed auto-reset event, initially non-signaled.
		const HANDLE eventHandle{ CreateEventEx(nullptr, nullptr, 0U, EVENT_ALL_ACCESS) };
		ASSERT(eventHandle);

		// Fire the event when the GPU reaches the oldest queued fence value.
		CHECK_HR(mFence->SetEventOnCompletion(oldestFence, eventHandle));

		// Block until the event is fired.
		WaitForSingleObject(eventHandle, INFINITE);
		CloseHandle(eventHandle);
	}
}

which is called after the current frame's rendering has ended:

tbb::task* MasterRender::execute() {
	while (!mTerminate) {
		mTimer.Tick();

		UpdateCamera(mCamera, mView, mProj, mTimer.DeltaTime());
		ASSERT(mCmdListProcessor->IsIdle());

		BeginFrameTask();
		MiddleFrameTask();
		EndFrameTask();

		SignalFenceAndPresent();
	}

	mCmdListProcessor->Terminate();
	FlushCommandQueue();
	return nullptr;
}

But I have the same problem. I recorded another video.

Any ideas? I cannot understand what could be happening.

Edited by nicolas.bertoa


What happens if you change:

const DWORD result(WaitForSingleObjectEx(frameLatencyWaitableObj, 0U, true));
if (result == WAIT_TIMEOUT) {
	return;
} 

to:

WaitForSingleObject(frameLatencyWaitableObj, INFINITE); 

?

 

Specifically, it looks like you're trying to render as fast as possible and only Present once per VBlank, which was one recommendation in my video. But if your render loop doesn't use fences correctly, the CPU could be stomping on memory while the GPU is still reading it.

 

Also, you mentioned you're using VSync, but you're presenting with the first parameter as 0, which means you're not.

Edited by Jesse Natalie


Hi Jesse

 

I also tried what you recommended before.

WaitForSingleObject(frameLatencyWaitableObj, INFINITE);

With/without VSync (Present(0, 0), Present(1, 0) and Present(Settings::sQueuedFrameCount, 0))

Settings::sQueuedFrameCount in my case is the swap chain buffer count - 1

Which VSync should I use? Is it correct to use the number of queued frames, or simply 1? I did not fully understand this parameter from MSDN documentation.

 

I am getting the same flickering problem with your recommendations, but I just found the following: if I flush the command queue instead of queuing frames (and using fences to synchronize), the flickering problem disappears.

void MasterRender::FlushCommandQueue() noexcept {
	++mCurrentFence;

	CHECK_HR(mCmdQueue->Signal(mFence, mCurrentFence));

	// Wait until the GPU has completed commands up to this fence point.
	if (mFence->GetCompletedValue() < mCurrentFence) {
		// Unnamed auto-reset event, initially non-signaled.
		const HANDLE eventHandle{ CreateEventEx(nullptr, nullptr, 0U, EVENT_ALL_ACCESS) };
		ASSERT(eventHandle);

		// Fire the event when the GPU reaches the current fence value.
		CHECK_HR(mFence->SetEventOnCompletion(mCurrentFence, eventHandle));

		// Block until the event is fired.
		WaitForSingleObject(eventHandle, INFINITE);
		CloseHandle(eventHandle);
	}
}
void MasterRender::SignalFenceAndPresent() noexcept {
	ASSERT(mSwapChain != nullptr);
	static const HANDLE frameLatencyWaitableObj(mSwapChain->GetFrameLatencyWaitableObject());
	const DWORD result(WaitForSingleObjectEx(frameLatencyWaitableObj, 0U, true));
	if (result == WAIT_TIMEOUT) {
 		return;
	}

	CHECK_HR(mSwapChain->Present(0U, 0U));

	FlushCommandQueue();
}

But, of course, the FPS is lower now. Could this indicate that I am using fences incorrectly? In Debug mode, I am not getting any errors from the D3D12 debug layer.


The first parameter to Present() is how many VSync periods the frame should stay on screen. So 1 means 1 VSync, 2 means 2 VSyncs, etc. On a 60 Hz monitor, 1 implies 60 fps, 2 implies 30 fps, 4 implies 15 fps, etc. 0 means don't wait for VSync.
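
As a quick check of that arithmetic, here is a minimal sketch. MaxPresentRate is a hypothetical helper, not part of DXGI; it only divides the refresh rate by the sync interval and reports 0 for the uncapped case (sync interval 0).

```cpp
#include <cstdint>

// Hypothetical helper (not a DXGI function): upper bound on presented frames
// per second for a given display refresh rate and Present() sync interval.
// A sync interval of 0 means "do not wait for VSync", so there is no cap;
// that case is reported as 0 here.
constexpr std::uint32_t MaxPresentRate(std::uint32_t refreshHz,
                                       std::uint32_t syncInterval) {
    return syncInterval == 0U ? 0U : refreshHz / syncInterval;
}

// The figures quoted for a 60 Hz monitor.
static_assert(MaxPresentRate(60U, 1U) == 60U, "1 VSync per frame");
static_assert(MaxPresentRate(60U, 2U) == 30U, "2 VSyncs per frame");
static_assert(MaxPresentRate(60U, 4U) == 15U, "4 VSyncs per frame");
```

This also suggests why a sync interval of Settings::sSwapChainBufferCount - 1 feels sluggish with more buffers: each frame is held on screen for several VSync periods.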

 

Yes, it looks like you have a CPU/GPU synchronization issue, since forcing serialization of the CPU and GPU (via your FlushCommandQueue) solves it, so it does seem like you're not using fences correctly. Are you waiting on or polling a fence at all while rendering your scene?


After recording and executing command lists, I call the SignalFenceAndPresent() method.

void MasterRender::SignalFenceAndPresent() noexcept {
	ASSERT(mSwapChain != nullptr);
	static const HANDLE frameLatencyWaitableObj(mSwapChain->GetFrameLatencyWaitableObject());
	const DWORD result(WaitForSingleObjectEx(frameLatencyWaitableObj, 0U, true));
	if (result == WAIT_TIMEOUT) {
 		return;
	}

	CHECK_HR(mSwapChain->Present(1U, 0U));

	// Add an instruction to the command queue to set a new fence point.  Because we 
	// are on the GPU time line, the new fence point won't be set until the GPU finishes
	// processing all the commands prior to this Signal().
	mFenceByQueuedFrameIndex[mCurrQueuedFrameIndex] = ++mCurrentFence;
	CHECK_HR(mCmdQueue->Signal(mFence, mCurrentFence));
	mCurrQueuedFrameIndex = (mCurrQueuedFrameIndex + 1U) % Settings::sQueuedFrameCount;	

	// If command lists were executed for all queued frames, wait for at least
	// one of them to complete before generating more command lists.
	const std::uint64_t oldestFence{ mFenceByQueuedFrameIndex[mCurrQueuedFrameIndex] };
	if (mFence->GetCompletedValue() < oldestFence) {
		// Unnamed auto-reset event, initially non-signaled.
		const HANDLE eventHandle{ CreateEventEx(nullptr, nullptr, 0U, EVENT_ALL_ACCESS) };
		ASSERT(eventHandle);

		// Fire the event when the GPU reaches the oldest queued fence value.
		CHECK_HR(mFence->SetEventOnCompletion(oldestFence, eventHandle));

		// Block until the event is fired.
		WaitForSingleObject(eventHandle, INFINITE);
		CloseHandle(eventHandle);
	}
}

Basically, I signal the fence for the current frame and, if the next frame in my frame queue is available, I continue with rendering the next frame.

Otherwise, I wait until mFence->GetCompletedValue() is greater than or equal to the oldest fence value I signaled.
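
The bookkeeping described above can be modelled without any D3D12 calls. This is only a sketch: FrameRing and its members are hypothetical names, the fence is a plain counter, and the actual blocking wait is left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable model of the queued-frame ring; no D3D12 objects are involved.
template <std::size_t QueuedFrameCount>
class FrameRing {
public:
    // Records the fence value signaled for the frame just submitted, advances
    // to the next slot, and returns the fence value that must be complete
    // before that slot may be reused (0 if the slot was never used).
    std::uint64_t Submit() noexcept {
        mFenceBySlot[mSlot] = ++mLastSignaled;
        mSlot = (mSlot + 1U) % QueuedFrameCount;
        return mFenceBySlot[mSlot];
    }

    // True when the CPU has to block: the GPU has not yet reached the fence
    // value guarding the slot that is about to be reused.
    static bool MustWait(std::uint64_t completed, std::uint64_t required) noexcept {
        return completed < required;
    }

    std::size_t CurrentSlot() const noexcept { return mSlot; }

private:
    std::array<std::uint64_t, QueuedFrameCount> mFenceBySlot{};
    std::uint64_t mLastSignaled{ 0U };
    std::size_t mSlot{ 0U };
};

// Example with two queued frames: the first submission needs no wait, the
// second one must wait for fence 1 before slot 0 is reused.
inline void FrameRingExample() {
    FrameRing<2> ring;
    assert(ring.Submit() == 0U);  // frame 1 -> fence 1; slot 1 never used
    assert(ring.Submit() == 1U);  // frame 2 -> fence 2; slot 0 guarded by fence 1
    assert(FrameRing<2>::MustWait(0U, 1U));   // GPU has completed nothing yet
    assert(!FrameRing<2>::MustWait(1U, 1U));  // fence 1 reached: slot 0 is free
}
```

Note that the ring only protects resources that are actually indexed by the slot; anything shared across slots is not covered by this wait.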

Edited by nicolas.bertoa


Sure, but you only signal if you actually end up presenting, which may not happen with the current code, since you still have a timeout waiting for a frame to be ready. If you hit that timeout and render again, you'll (probably) overwrite the data that the GPU is reading.

 

When the CPU writes data to an upload buffer (e.g. camera matrices), how do you know that the upload buffer isn't being used by the GPU? Do you N-buffer them with mCurrQueuedFrameIndex? Do you pool them and use a fence to see if it's ready? Or do you just have one and overwrite the data every frame? My guess is on the last one, which is not valid, since it causes a CPU/GPU race.
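
For the first option (N-buffering by frame index), a minimal portable sketch follows. PerFrameConstants and PerFrameRing are hypothetical names, a plain array stands in for the mapped upload buffer, and the caller is assumed to have already waited on the fence for the slot it writes to.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for the per-frame constant buffer contents (two 4x4 matrices).
struct PerFrameConstants {
    float view[16];
    float proj[16];
};

// Portable model of an N-buffered upload buffer: one slot per queued frame.
template <std::size_t QueuedFrameCount>
class PerFrameRing {
public:
    // CPU side: write the constants for frameIndex into that frame's slot.
    void Write(std::size_t frameIndex, const PerFrameConstants& c) noexcept {
        mSlots[frameIndex % QueuedFrameCount] = c;
    }

    // GPU side (modelled): read the constants recorded for frameIndex.
    const PerFrameConstants& Read(std::size_t frameIndex) const noexcept {
        return mSlots[frameIndex % QueuedFrameCount];
    }

private:
    std::array<PerFrameConstants, QueuedFrameCount> mSlots{};
};

// Example: the CPU writes frame 1 while the GPU is still reading frame 0.
inline void PerFrameRingExample() {
    PerFrameConstants frame0{};
    frame0.view[0] = 1.0f;
    PerFrameConstants frame1{};
    frame1.view[0] = 2.0f;

    PerFrameRing<2> buffered;  // one slot per queued frame
    buffered.Write(0U, frame0);
    buffered.Write(1U, frame1);
    assert(buffered.Read(0U).view[0] == 1.0f);  // frame 0 still sees its data

    PerFrameRing<1> shared;    // a single slot shared by every frame
    shared.Write(0U, frame0);
    shared.Write(1U, frame1);
    assert(shared.Read(0U).view[0] == 2.0f);    // frame 0's data was overwritten
}
```

The single-slot case is the race in question: by the time the modelled GPU reads frame 0, it gets frame 1's matrices.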


Mmm I see...

 

In my application, most constant buffers are fixed (world matrices and materials) because my geometry does not change its world position or materials at runtime. So I have 1 of them shared between all the frames.  This is safe because they are only used for reading.

 

But the constant buffer that holds the view and projection matrices (the per-frame cbuffer) is updated every frame based on camera changes (user input), and that constant buffer is shared between all my frames. That may be why I see this flickering only when I move the camera, and only in geometry position (not color, shape, etc.): while the GPU is reading the view and projection matrices from that buffer, the CPU could be overwriting its contents for another frame.

That makes sense!

 

In this case, where the buffer holds only two float4x4 matrices, is the best solution to have N per-frame buffers, where N is the number of queued frames? I already do this for my command list allocators.

 

Are these the best resources to learn about "resource synchronization" and all that stuff (because I definitely will need to improve my knowledge about this D3D12 topic)?

 

Thanks Jesse, you found a very difficult problem I had. I will change the code and report back in this thread on how it went.
