Multithreaded Rendering

20 comments, last by akhin 12 years, 11 months ago
Hi all,

For my MSc dissertation project, I want to work on multithreaded rendering, which would also involve working with multiple cores, either by setting thread affinity or by using OpenMP.

My research targets the game industry rather than the graphics industry. I want to find out how beneficial MT rendering is and when it is beneficial (considering CPU overheads).

My scope is graphics rendering only, so there is no physics, AI or sound.

My questions are:

1. Data structures: I am confused about which data structures need to be built to achieve this. I first thought of scene graphs, but I think what I am after is so-called "display lists".

1a) As far as I can see, "display lists" seem more suitable for this topic than scene graphs. What do you think?

1b) Are "command buffers" and "display lists" the same thing from a data-structure point of view?

2. Multithreading model: The simplest MT rendering model I can think of: I render DisplayList1 while "updating" DisplayList2, then I render DisplayList2 while updating DisplayList1, and so on.

What do you think of this as a simple flow?

Also, would it be possible to render different vertex buffers with their associated shaders at the same time (in different threads)?

3. Workload (update phases): What would I have for the update phases? I can currently think of creating reflection maps and shadow maps, postprocessing like HDR, updating animated meshes, CPU-based particle systems, and world matrix updates of static objects in the scene, on a separate thread or threads.

Would those be enough of a workload to run on a separate thread or threads?

Also, would it be possible to execute occlusion culling and spatial partitioning on separate threads?

4. D3D version: As my existing 3D graphics engine is for D3D9, I plan to stay with it, and I need to create display lists or other appropriate data structures. As far as I know, you can't create "resources" and "lock" and "unlock" them in separate threads in D3D9 and D3D10. But for the rest of the things I mentioned in the previous workload question, would D3D9 be OK for this job?

5. GPUs: I don't know enough about the capabilities and limits of current GPUs. Is there anything I could do for the "MT rendering" topic specifically with GPUs?

Also, are there any bottlenecks that GPUs would cause for this topic?

Many thanks in advance to anyone who answers.
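
For illustration only (this is not from the thread), the ping-pong flow in question 2 can be sketched in standard C++ with no D3D calls at all. Everything named here (`DrawCommand`, `DisplayListPair`, `runFrames`) is hypothetical, and "rendering" is reduced to counting commands. The update thread builds the next list while the render thread walks the current one, and the two lists swap roles once per frame.

```cpp
#include <array>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical draw record; a real one would carry buffers, shaders and state.
struct DrawCommand {
    int meshId;
    int frame;
};
using DisplayList = std::vector<DrawCommand>;

// Two lists: the render thread reads the front one, the update thread fills the back one.
class DisplayListPair {
public:
    // Update thread: store a finished list in the back slot and mark it ready.
    void publish(DisplayList list) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !ready_; });  // previous list not yet taken
        lists_[back_] = std::move(list);
        ready_ = true;
        cv_.notify_all();
    }

    // Render thread: wait for a ready list, flip front/back, return the new front.
    // The reference stays valid until the next call to acquire().
    const DisplayList& acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return ready_; });
        std::swap(front_, back_);
        ready_ = false;
        cv_.notify_all();
        return lists_[front_];
    }

private:
    std::array<DisplayList, 2> lists_;
    int front_ = 0;
    int back_ = 1;
    bool ready_ = false;
    std::mutex m_;
    std::condition_variable cv_;
};

// Runs `frames` frames and returns how many commands the render loop consumed.
std::size_t runFrames(int frames, int drawsPerFrame) {
    DisplayListPair pair;
    std::size_t consumed = 0;

    std::thread updater([&] {
        for (int f = 0; f < frames; ++f) {
            DisplayList next;  // built outside the lock, in parallel with rendering
            for (int i = 0; i < drawsPerFrame; ++i) next.push_back({i, f});
            pair.publish(std::move(next));
        }
    });

    for (int f = 0; f < frames; ++f) {
        const DisplayList& list = pair.acquire();
        consumed += list.size();  // stand-in for issuing the actual draw calls
    }
    updater.join();
    return consumed;
}
```

With this scheme the update thread can run at most one frame ahead of the render thread, which is the usual trade-off of a two-list design: more overlap would need more lists and adds latency.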
On DX9 and DX10 you're stuck with one device for issuing rendering and state commands. If you create the device with the proper flags you can call methods on the device from multiple threads, but doing this will just use a coarse lock which will effectively serialize any threads attempting to concurrently access the device. So it's not really useful for the purpose of "multithreaded rendering", unless your goal is to simply have one worker "rendering" thread that issues all of the D3D-level rendering commands. For true multithreaded rendering where multiple concurrent threads issue render commands, you need DX11. No GPUs actually support multithreaded command submission at the driver level, but the D3D11 runtime can emulate it and you can still get a performance boost out of it (around 20-40% is a reasonable expectation). Without DX11, the best you could do is prepare your own data structures containing rendering/state data that's optimized for use with D3D, and then submit all of that data to D3D on one thread.
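
A minimal sketch of that last approach, not taken from the thread: each worker thread records into its own plain data structure, and a single thread then replays everything in a fixed order. `RenderCommand`, `recordRange` and `recordAndSubmit` are hypothetical names, the state assignment is arbitrary, and the `submitted` vector stands in for the real device calls, which are deliberately left out.

```cpp
#include <algorithm>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical command record; real code would store state blocks, buffers, etc.
struct RenderCommand {
    int stateId;
    int meshId;
};
using CommandList = std::vector<RenderCommand>;

// Worker: records commands for objects [begin, end) into its own list.
// No list is shared between workers, so no locking is needed here.
void recordRange(int begin, int end, CommandList& out) {
    for (int id = begin; id < end; ++id) {
        out.push_back({id % 4, id});  // arbitrary state choice, for illustration
    }
}

// Records on `workerCount` threads (assumed >= 1), then "submits" on the
// calling thread only. Returns the mesh ids in submission order.
std::vector<int> recordAndSubmit(int objectCount, int workerCount) {
    std::vector<CommandList> lists(workerCount);
    std::vector<std::thread> workers;
    const int chunk = (objectCount + workerCount - 1) / workerCount;

    for (int w = 0; w < workerCount; ++w) {
        const int begin = std::min(w * chunk, objectCount);
        const int end = std::min(begin + chunk, objectCount);
        workers.emplace_back(recordRange, begin, end, std::ref(lists[w]));
    }
    for (std::thread& t : workers) t.join();

    // Single-threaded replay: this loop is where the device calls would go.
    std::vector<int> submitted;
    for (const CommandList& list : lists) {
        for (const RenderCommand& cmd : list) {
            submitted.push_back(cmd.meshId);
        }
    }
    return submitted;
}
```

Because the lists are replayed in worker order, the submission order is deterministic regardless of which worker finishes first.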

No GPUs actually support multithreaded command submission at the driver level, but the D3D11 runtime can emulate it and you can still get a performance boost out of it (around 20-40% is a reasonable expectation).


The most recent NV drivers do now support this at the driver level; they did for a few revisions before but only for Civ5.


Thank you for your answer,

1. If I go with D3D9, it seems I can't produce shadow maps, HDR and anything else that involves rendering on a separate thread. The threaded work would then just be updating the world matrices of static objects, updating the bones and matrices of animated meshes, particle systems, and culling (view frustum, spatial partitioning, occlusion culling, etc.).

Do you think these would bring a significant performance improvement, or do you think it is worth trying with D3D9?

2. I currently have a simple 3D graphics engine of my own, written for D3D9 and SM2/SM3 shaders. (Actually it is more like a framework: a collection of wrappers and shaders.)

How much effort would it generally take to wrap things and implement the existing rendering in D3D11 and SM5?
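
As a rough sketch of the kind of CPU-side work listed in point 1 (not from the thread), here is sphere-versus-plane culling split across threads in standard C++. `Sphere`, `Plane`, `isVisible` and `cullParallel` are hypothetical names, and the plane convention (a point is inside when `nx*x + ny*y + nz*z + d >= 0`) is an assumption of this example. Each thread writes only to its own slice of the result, and the flags would then be read by the single thread that talks to D3D9.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

struct Sphere { float x, y, z, radius; };
struct Plane  { float nx, ny, nz, d; };  // inside when nx*x + ny*y + nz*z + d >= 0

// A sphere is culled if it lies entirely behind any one plane.
bool isVisible(const Sphere& s, const std::vector<Plane>& frustum) {
    for (const Plane& p : frustum) {
        const float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius) return false;
    }
    return true;
}

// Splits the spheres into contiguous chunks, one per thread (threadCount >= 1).
// std::vector<char> is used instead of std::vector<bool> so that neighbouring
// elements can be written from different threads without a data race.
std::vector<char> cullParallel(const std::vector<Sphere>& spheres,
                               const std::vector<Plane>& frustum,
                               unsigned threadCount) {
    std::vector<char> visible(spheres.size(), 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (spheres.size() + threadCount - 1) / threadCount;

    for (unsigned t = 0; t < threadCount; ++t) {
        const std::size_t begin = std::min<std::size_t>(t * chunk, spheres.size());
        const std::size_t end = std::min<std::size_t>(begin + chunk, spheres.size());
        workers.emplace_back([&spheres, &frustum, &visible, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                visible[i] = isVisible(spheres[i], frustum) ? 1 : 0;
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return visible;
}
```

Whether this pays off depends on the object count: for a few hundred objects the cost of starting and joining threads can easily exceed the culling work itself, so a persistent thread pool would be the more realistic design.
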
If you are programming for multithreading then you should at least move to DX10, if not DX11. Most users who have multiple cores will also have at least a DX10 card. There is a BIG jump between DX9 and DX10, but the change from DX10 to DX11 is a very small and easy one to make.
It would be beneficial to start on either DX10 or DX11 right now, before you invest too much in DX9.
Wisdom is knowing when to shut up, so try it.
--Game Development http://nolimitsdesigns.com: Reliable UDP library, Threading library, Math Library, UI Library. Take a look, its all free.




If I go with DX11, I would write a simple CPU-expensive demo and try to render it on multiple threads. That way I would not spend time writing wrappers or porting my engine; I would just use DXUT with DX11.

In that case, to come up with a good MT rendering demo, what kind of scene would you suggest rendering? Perhaps an ocean water simulation with FFT for the wave displacements?

Or just rendering a huge model?
I would like to offer suggestions, but I don't understand what you are trying to accomplish exactly. Are you trying to take a rendering system, add some threading to it and see the results? I am being simplistic here, but is that what you are aiming to do? If so, it will be very difficult to get speed-ups from threading because most of the work is done on the GPU, not the CPU. It is very difficult to add enough CPU-intensive stuff into an engine in a short amount of time (unless you decide to go crazy with draw calls). I have an engine that does a lot and I still run under 6% CPU load on a single core because there just isn't enough CPU work to do. Perhaps you could do something like a multithreaded CPU ray tracing engine? I dunno . . . .

It seems very difficult to quantify the benefit of multithreading on an engine because there are so many factors that contribute to the engine. Since the GPU is faster at doing most graphics work than the CPU (because it is specialized), it would be a mistake to offload work from the GPU to the CPU for the benefit of threading, especially since the CPU is generally used for other things: networking, sound, AI, user interface, scene management, etc. Also, in correctly written engines, draw calls are small, as are state changes, so a gain from threading would only come from a naive implementation of an engine where there is a lot of unnecessary CPU-to-GPU interaction. Back in the day, the CPU and GPU worked together; now the trend is moving to very little GPU-to-CPU interaction.

Anyway, fire away !!!!
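
One hedged way to put a number on this argument (not something stated in the thread) is Amdahl's law. Assume a fraction p of the frame time is CPU work that can be spread evenly over n threads, and the rest (GPU time, serial submission) is unaffected. The overall speed-up is then:

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}}
```

Taking p = 0.06 purely as an illustration (the 6% figure above is CPU load on one core, which is not the same thing as a share of frame time) and n = 4 gives S = 1 / (0.94 + 0.015) ≈ 1.05, i.e. about a 5% gain. A frame where half the time is parallelizable CPU work (p = 0.5) would give S = 1 / (0.5 + 0.125) = 1.6 on the same four threads.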

[quote name='MJP' timestamp='1305145988' post='4809540']
No GPUs actually support multithreaded command submission at the driver level, but the D3D11 runtime can emulate it and you can still get a performance boost out of it (around 20-40% is a reasonable expectation).


The most recent NV drivers do now support this at the driver level; they did for a few revisions before but only for Civ5.
[/quote]

Sweet, I didn't know that. Too bad I have an AMD at home. :(




If I add physics calculations, do you believe it would then make sense to investigate the benefits of multithreading in a game engine?

[quote name='phantom' timestamp='1305149376' post='4809565']
[quote name='MJP' timestamp='1305145988' post='4809540']
No GPUs actually support multithreaded command submission at the driver level, but the D3D11 runtime can emulate it and you can still get a performance boost out of it (around 20-40% is a reasonable expectation).


The most recent NV drivers do now support this at the driver level; they did for a few revisions before but only for Civ5.
[/quote]

Sweet, I didn't know that. Too bad I have an AMD at home. :(
[/quote]

Aye, same here.. I've only seen it working at work which is where I was looking closely at this.. NV might well be my next card at home now... <_<

