Weird corrupt buffers when created concurrently

7 comments, last by Tispe 9 years ago

For some unknown reason, I get corrupted vertex/index/texture buffers (leading to glitchy rendering) in some rare cases when I concurrently create those resources while rendering in a different thread.

Here's a breakdown:

I have a render thread that renders some UI (e.g. a loading progress bar) while a loader thread loads data off the disk. The loader thread calls ID3D11Device Create functions with initialization data (pSysMem of D3D11_SUBRESOURCE_DATA is valid). For some hair-pulling reason, some buffers initialized this way end up corrupted at some point. I am 100% sure that the data being passed in is correct (I verify it before calling Create). When I copy the resource to a staging buffer and read it back on the CPU (after all loading has completed), the buffer contents are indeed not the data I initialized them with. The buffer objects themselves are perfectly fine (GetDesc returns the correct creation parameters).
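Roughly, the creation call on the loader thread looks like this (a simplified sketch rather than my actual code; device, vertices and vertexBytes are placeholder names):

// Loader thread: create an immutable vertex buffer with initial data.
// 'device' is the shared ID3D11Device; 'vertices'/'vertexBytes' stand in
// for data just loaded from disk and verified before this call.
D3D11_BUFFER_DESC desc = {};
desc.ByteWidth = vertexBytes;
desc.Usage = D3D11_USAGE_IMMUTABLE;
desc.BindFlags = D3D11_BIND_VERTEX_BUFFER;

D3D11_SUBRESOURCE_DATA init = {};
init.pSysMem = vertices;

ID3D11Buffer* vertexBuffer = nullptr;
HRESULT hr = device->CreateBuffer(&desc, &init, &vertexBuffer);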

The buffers that get corrupted are completely random. If I reload the same scene over and over, a different buffer gets corrupted each time, which points to some sort of race condition. The code never crashes, so CPU-side stacks/heaps are probably not being corrupted.

I've verified that I never access the immediate device context (owned by the render thread) from the loader thread (nor any DXGI objects). I've commented out all Map/UpdateSubresource calls to rule out accidentally writing past buffer ends. The device is created with the correct flags (i.e. the SINGLETHREADED flag is not set).
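For completeness, device creation is roughly the following (sketch only; feature levels, the swap chain and error handling are omitted):

UINT flags = 0;
#ifdef _DEBUG
flags |= D3D11_CREATE_DEVICE_DEBUG;   // debug layer reports API misuse
#endif
// D3D11_CREATE_DEVICE_SINGLETHREADED is deliberately NOT set, so the
// device's Create* methods may be called from multiple threads.
ID3D11Device* device = nullptr;
ID3D11DeviceContext* immediateContext = nullptr;
HRESULT hr = D3D11CreateDevice(
    nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, flags,
    nullptr, 0, D3D11_SDK_VERSION,
    &device, nullptr, &immediateContext);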

This problem does not occur if I load synchronously (i.e. block rendering while loading).

A workaround is to create a deferred device context, initialize the buffers through it with UpdateSubresource, and then execute the recorded command list on the main render thread once the loader completes. I lose the ability to declare the buffers IMMUTABLE, and possibly some concurrent driver cleverness in getting the data to the GPU, since the copies only really happen when the deferred command list is executed on the main render thread.
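The workaround looks roughly like this (sketch; error handling omitted and names are placeholders):

// Loader thread: the buffer must now be DEFAULT usage (not IMMUTABLE),
// because its data is supplied after creation via the deferred context.
ID3D11DeviceContext* deferred = nullptr;
device->CreateDeferredContext(0, &deferred);

D3D11_BUFFER_DESC desc = {};
desc.ByteWidth = vertexBytes;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_VERTEX_BUFFER;

ID3D11Buffer* buffer = nullptr;
device->CreateBuffer(&desc, nullptr, &buffer);

deferred->UpdateSubresource(buffer, 0, nullptr, vertices, 0, 0);

ID3D11CommandList* commandList = nullptr;
deferred->FinishCommandList(FALSE, &commandList);

// Render thread, once the loader signals completion:
immediateContext->ExecuteCommandList(commandList, FALSE);
commandList->Release();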

I've been able to reproduce this problem on different devices (GTX 970 and 680) albeit on the same system.

Has anyone encountered this problem before, or have any insights? The documentation makes clear that the ID3D11Device Create functions are supposed to be thread-safe. What could possibly corrupt buffer data on the device?

Many thanks in advance.


How are you guaranteeing that all the data is actually finished copying before using it for drawing?

What kind of thread safety do you have to separate your drawing from your updates?

You should have a queue of scenes, and only pass completed scenes to the draw thread, so that they can't be changed whilst being drawn. Completed scenes are disposed of along with any resources no longer needed.

Please let me know if this helps.

How are you guaranteeing that all the data is actually finished copying before using it for drawing?

I'm assuming the driver needs to guarantee this somehow in order to support Creates (with supplied data) asynchronously. That is, when Create is called with init data, the driver takes care of getting it to the device. If the resource is used before the driver has copied the data to the GPU, I would expect the driver to block or defer somehow, but unfortunately I don't know exactly how drivers implement this.

What kind of thread safety do you have to separate your drawing from your updates?

You should have a queue of scenes, and only pass completed scenes to the draw thread, so that they can't be changed whilst being drawn. Completed scenes are disposed of along with any resources no longer needed.

Please let me know if this helps.

I effectively poll an atomic value shared between the render and loader threads until the loader thread completes and sets the value to indicate completion. Until then, none of the data loaded by the loader is accessed by the render thread.
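The handshake is essentially this (simplified sketch with a single flag; loadingComplete is a placeholder name):

#include <atomic>

std::atomic<bool> loadingComplete{false};

// Loader thread, after the last Create* call has returned:
loadingComplete.store(true, std::memory_order_release);

// Render thread, checked each frame before touching any loaded resource:
if (loadingComplete.load(std::memory_order_acquire))
{
    // safe to start drawing with the newly created buffers/textures
}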

Thanks for the help folks, it's appreciated.

Quick update: I was unable to reproduce this on a different system. OK, a sample size of 2 is probably not going to prove it's a driver problem yet, but I really have no other explanation for corrupted data on the device.

Perhaps your synchronization method isn't quite correct? Have you tried using something heavier-weight than an atomic? It might also be interesting to add a sleep after your resource creation but before you set your atomic, to see if a time delay helps the situation.
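Something along these lines, as a rough sketch (CreateAllResources and loadingComplete are placeholder names standing in for your existing loader code):

// Experiment: give the driver extra time between the last Create* call
// and the completion signal, to see whether timing is a factor.
CreateAllResources();                                   // loader's Create* calls
Sleep(2000);                                            // arbitrary delay in ms (Win32 Sleep)
loadingComplete.store(true, std::memory_order_release); // then signal completion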

So, the system where this problem manifests has two video cards (GTX 980 + 680). When I pulled one card, I was unable to reproduce it. So I'm thinking this might be a driver issue with multiple non-SLI'd video cards being used at the same time (one monitor attached to each card for an extended desktop, but the program only actually renders on one card). Either that, or removing the second card changed the timing of things and the problem is now merely hidden from my test cases...

Anyway I hope this helps anyone encountering similar problems. I'll post again if it re-manifests.

Does your render thread also create the ID3D11Device and ID3D11DeviceContext?

