How do you multithread in DirectX 11?

5 comments, last by Brian Klamik 11 years, 8 months ago
I looked around the net and the forums, but I couldn't find basic instructions on how to multithread in DirectX 11. From what I understood, you have to create the device and the context with some different flags? Is there any tutorial or post that explains how to pull it off? What are the actual benefits of multithreading in DX11? I have a deferred renderer that works fine, but from what I heard, to really make the performance acceptable, you have to implement multithreading? I also saw something like that in the Frostbite 2 technology presentation PDF.
from what I heard, to really make the performance acceptable, you have to implement multithreading?
How many milliseconds does your game currently use to perform all of its D3D calls on the CPU? Multi-threading your D3D calls is an optimisation, and the first step in optimizing is always to take measurements.
Keep in mind, the bulk of rendering takes place on the GPU, with the CPU only managing resources and submitting commands - so you'll only need to multi-thread your CPU code if you need to submit a LOT of commands, or if you do a lot of resource management at the same time as rendering.
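As a rough sketch of the "measure first" step, the CPU-side cost of submission can be timed with std::chrono. The names cpuMilliseconds, submitFrame and averageSubmitMs are hypothetical and not from this thread; submitFrame is a placeholder for whatever code issues your D3D calls for one frame.

```cpp
#include <chrono>

// Returns the wall-clock time, in milliseconds, that `work` takes
// on the calling thread.
template <typename F>
double cpuMilliseconds(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical stand-in for the code that issues one frame's D3D calls
// (state changes, resource updates, draws). Replace with your own code.
void submitFrame() {
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < 100000; ++i) sink = sink + i;
}

// Averages over several frames, since a single sample is too noisy.
double averageSubmitMs(int frames) {
    double total = 0.0;
    for (int i = 0; i < frames; ++i) total += cpuMilliseconds(submitFrame);
    return total / frames;
}
```

This only captures CPU time spent in the submission code. GPU time has to be measured separately, for example with D3D11 timestamp queries or a graphics profiler. If the CPU number is small compared to the frame time, threading the submission is unlikely to help.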
It takes 7 milliseconds to render a large building with 10 large directional lights and about 11 milliseconds for 60 large directional lights. I'm not very happy about the performance right now and I'm gonna optimize some more and implement instancing, but from what I understood, modern engines like Frostbite 2 have both instancing and multi-threaded rendering. The thing is, I have no idea how to implement multithreading. What changes do I have to make to my device and context creation? Currently I'm just using D3D11CreateDeviceAndSwapChain(NULL, D3D_DRIVER_TYPE_HARDWARE, NULL, 0, &featureLevel, 1, D3D11_SDK_VERSION, &swapChainDesc, &swapChain, &DEVICE, NULL, &CONTEXT);

You aren't listening to what Hodgman is saying here. Do you know how much of that time is spent queuing up draw calls on the CPU? What he's getting at is that you may actually be GPU-limited; that is, your CPU is mostly farting around waiting for the GPU to do the work assigned to it. You'd ultimately end up making the CPU fart around even more for no actual performance gain, and in fact you stand to make it worse if you handle threading poorly. Many professionals still can't get this right, although that's probably more the result of mediocre teaching than any inherent difficulty.

For what it's worth, though, this and this (and even more specifically, these two methods) should help get you started.
When/if you do determine that your single CPU thread is the bottleneck, you can create deferred contexts for each additional thread by using ID3D11Device::CreateDeferredContext.

Each thread that handles rendering tasks has its own deferred context. Once the secondary threads have done their tasks for the frame (which usually is the CPU-side heavy lifting), you play back their finished command lists on the primary context to actually submit the state changes and draw commands to the device.
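The record-then-play-back flow described above can be modelled in portable C++. This is only a sketch of the pattern, not D3D11 code: DeferredContext, ImmediateContext, CommandList and renderFrame are hypothetical stand-ins. In real D3D11 the corresponding calls are ID3D11Device::CreateDeferredContext, ID3D11DeviceContext::FinishCommandList and ID3D11DeviceContext::ExecuteCommandList.

```cpp
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Stand-in for ID3D11CommandList: a recorded sequence of commands.
using CommandList = std::vector<std::function<void(std::vector<std::string>&)>>;

// Stand-in for a deferred context: it records commands but runs nothing.
class DeferredContext {
public:
    void draw(const std::string& what) {
        recorded_.push_back([what](std::vector<std::string>& log) {
            log.push_back("draw " + what);
        });
    }
    // Plays the role of FinishCommandList: hand over the recording and reset.
    CommandList finish() { return std::exchange(recorded_, {}); }
private:
    CommandList recorded_;
};

// Stand-in for the immediate context: the only place commands really run.
class ImmediateContext {
public:
    // Plays the role of ExecuteCommandList.
    void execute(const CommandList& list) {
        for (const auto& cmd : list) cmd(submitted_);
    }
    const std::vector<std::string>& submitted() const { return submitted_; }
private:
    std::vector<std::string> submitted_;
};

std::vector<std::string> renderFrame(int workerCount, int drawsPerWorker) {
    std::vector<DeferredContext> contexts(workerCount);  // one per thread
    std::vector<CommandList> lists(workerCount);
    std::vector<std::thread> workers;
    for (int w = 0; w < workerCount; ++w) {
        workers.emplace_back([&, w] {
            // Each worker touches only its own context and its own slot.
            for (int d = 0; d < drawsPerWorker; ++d)
                contexts[w].draw("w" + std::to_string(w) + "/d" + std::to_string(d));
            lists[w] = contexts[w].finish();
        });
    }
    for (auto& t : workers) t.join();  // recording is done for this frame

    // Playback happens on one thread, in a fixed order.
    ImmediateContext immediate;
    for (const auto& list : lists) immediate.execute(list);
    return immediate.submitted();
}
```

The point of the design is that no deferred context is ever shared between threads, and the immediate context is only used from the primary thread, so no locking is needed around the recording itself.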

The SDK has a programming guide article about this: "Immediate and Deferred Rendering".

Niko Suni

The deferred context requires a lot of care getting things right (in a multithreaded manner). For most of my applications, I don't bother. Instead, I create the device multithreaded and use it for resource creation and destruction. Done right, you can get the resource creation occurring in one CPU thread while another CPU thread is busy with the previous resources. When the second thread is ready to process new data, that data is (hopefully) already available in GPU memory. Always make sure you profile. For some of my applications it is faster to create/destroy resources each frame rather than map/unmap an already existing resource.
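A minimal sketch of that overlap, using std::async in portable C++. Resource, createResource, useResource and runFrames are hypothetical stand-ins; in a real application createResource would call something like ID3D11Device::CreateBuffer on the background thread. As far as I know, ID3D11Device methods are safe to call from multiple threads unless the device was created with D3D11_CREATE_DEVICE_SINGLETHREADED, and whether the driver actually creates resources concurrently can be queried through CheckFeatureSupport with D3D11_FEATURE_THREADING.

```cpp
#include <future>
#include <numeric>
#include <vector>

using Resource = std::vector<int>;  // stand-in for a GPU resource

// Stand-in for resource creation on a worker thread.
Resource createResource(int frame) {
    Resource r(4);
    std::iota(r.begin(), r.end(), frame * 4);
    return r;
}

// Stand-in for the rendering work that consumes a finished resource.
int useResource(const Resource& r) {
    return std::accumulate(r.begin(), r.end(), 0);
}

std::vector<int> runFrames(int frameCount) {
    std::vector<int> results;
    if (frameCount <= 0) return results;

    // Start creating the first resource before the loop begins.
    std::future<Resource> next =
        std::async(std::launch::async, createResource, 0);

    for (int frame = 0; frame < frameCount; ++frame) {
        Resource current = next.get();  // waits only if creation is behind
        if (frame + 1 < frameCount) {
            // Kick off creation for the next frame, then keep working.
            next = std::async(std::launch::async, createResource, frame + 1);
        }
        results.push_back(useResource(current));  // overlaps with creation
    }
    return results;
}
```

Whether this beats mapping and updating an existing resource depends on the workload, which is why the advice to profile applies here too.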
Did you try MSDN?

'Introduction to Multithreading in Direct3D11' and all the pages it references?
