Massive performance increase after migrating engine from DX10 to DX11.

Started by Hyu · 7 comments, last by BrioCyrain 13 years, 3 months ago
... but... why?
I'm asking this mostly out of curiosity, why would this happen?

I've written my own little deferred renderer using SlimDX with DX10 some time ago, and I was having a good time with it.
It was working well, and faster than I anticipated (mostly thanks to all the wonderful performance suggestions on these forums, especially from MJP).
I recently decided to migrate my engine to DX11.
It wasn't as hard as I imagined; it was pretty much a one-day job (at least to get the basics working; I disabled some things I don't need immediately).
I set it to use feature level 10_0 (I have DX10 GPUs).

After the first successful compile, I was surprised to say the least.
My game was running so much faster, even though, besides switching to DX11, I changed nothing whatsoever.
The first thing I noticed is that the dx11 version uses > 90% of my gpu, the dx10 version uses ~75%. This would explain the increase in speed.
The really interesting aspect is that the DX11 version scales SO MUCH better when adding dynamic lights, and I have no idea why.

Example with 500 small dynamic point lights in small test scene:
DX10:
- GPU usage 35%
- FPS 52

DX11:
- GPU usage 92%
- FPS 220

Basically what happens is that the DX10 version suffers from using many dynamic light sources.
There's a bottleneck somewhere which causes the GPU usage to go down.
I've never been able to track it down unfortunately.
The DX11 version doesn't seem to have this bottleneck at all.

Same thing with SLI:
If I use 2 way AFR SLI the DX10 GPU usage drops from 75% to 55%.
With DX11, GPU usage only drops from ~92% to ~90% (it basically stays the same), and frame times nearly halve.

Now don't get me wrong, I'm loving this, but I'm still curious what could be causing this?
I mean... this is quite extreme: the DX11 version runs over 2000 dynamic point lights and still stays over 100 fps easily, whereas the DX10 one begins struggling at "only" 300 point lights.
The scenes are exactly identical, and so are the shaders and the code (I only changed everything DX10-related to DX11).

The only thing I can think of is that DX11 uses a more efficient way of sending data to the GPU?

Cheers,
Hyu
I don't really know, but my guess is that SlimDX was not making full use of DirectX 10.

But it made me think about migrating to DirectX 11.
I've never heard anyone making this type of claim before, but perhaps your driver has better support for DX11? From the numbers you have mentioned, it clearly shows that the CPU is less of a bottleneck, which allows more GPU to be used. This would seem to indicate that the submission process is better supported in DX11.

DX11 does introduce multithreading, but you need to intentionally use it to gain the advantages from it. However, perhaps the required changes in driver allowed for some other optimizations at the same time??? Who knows, but congrats on the performance increase!
Driver optimizations are certainly a possibility, I hadn't thought of that.
Ah well, I was just curious if it was something obvious.
I'm glad I got rid of my bottleneck, even if it was in a bit of a weird fashion.
It would be interesting to see the code before and after the migration.

It's quite a lot of code unfortunately, I've been playing with this for nearly a year.
I basically just changed the using SlimDX.Direct3D10; statement to using SlimDX.Direct3D11;
I changed all device calls to use device.ImmediateContext and modified shader compilation to use effect profile fx_5_0 instead of fx_4_0.
The device and swap chain are created with FeatureLevel.Level_10_0.
Apart from that I just modified anything the compiler complained about (mostly renamed methods etc).
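For anyone curious, the migration boils down to something like this. This is a minimal sketch, not my actual code: it assumes SlimDX's Direct3D11 Device.CreateWithSwapChain overload that accepts a feature-level list, and the swap chain description, render target view, and render loop are omitted.

```csharp
// Before: using SlimDX.Direct3D10;
using SlimDX.Direct3D11;
using SlimDX.DXGI;

// Create a D3D11 device + swap chain, but restrict it to the 10_0
// feature set so it still runs on existing DX10 hardware.
Device device;
SwapChain swapChain;
Device.CreateWithSwapChain(
    DriverType.Hardware,
    DeviceCreationFlags.None,
    new[] { FeatureLevel.Level_10_0 },
    swapChainDescription,          // same SwapChainDescription as the DX10 version
    out device,
    out swapChain);

// In D3D10, state setting and draw calls live on the device itself.
// In D3D11 they move to a device context:
DeviceContext context = device.ImmediateContext;
context.ClearRenderTargetView(renderTargetView, new Color4(0, 0, 0));
context.Draw(vertexCount, 0);
```

Mechanically it's mostly find-and-replace: every place the DX10 code called a method on the device, the DX11 code calls the same method on the immediate context instead.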
Blimey, no wonder everyone is all over DX11 like a parasite...it actually works!

Glad to hear the massive increase in performance.
I think I finally know what happened, and it's something quite stupid really.
After doing a Build -> Clean on the DX10 project, it runs nearly as fast as the DX11 one...

I've changed the point light rendering routine recently so they are rendered using instanced bounding volumes, instead of using a draw call for each light.
I assume that, even though the code was set up otherwise, the DX10 project was still using the drawing routine without instancing.
The Build -> Clean forced a complete recompile and fixed it. :/
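For reference, the difference between the two routines is one draw call per light versus a single instanced call. Roughly, in SlimDX-style pseudocode (UpdateLightConstants, the buffer names, and InstanceStride are hypothetical placeholders, not my real code):

```csharp
// Old path: one draw call per point light, each preceded by its own
// constant buffer update. The CPU/driver overhead scales linearly with
// light count, which is exactly the bottleneck I was seeing.
foreach (var light in pointLights)
{
    UpdateLightConstants(light);                  // hypothetical helper
    context.DrawIndexed(sphereIndexCount, 0, 0);  // light bounding sphere
}

// New path: upload all per-light data (position, radius, colour) into a
// second vertex buffer bound as per-instance data, then draw every
// bounding sphere in one call.
context.InputAssembler.SetVertexBuffers(1,
    new VertexBufferBinding(instanceBuffer, InstanceStride, 0));
context.DrawIndexedInstanced(sphereIndexCount, pointLights.Count, 0, 0, 0);
```

With the stale build, the DX10 binary was still running the top loop, which is why it fell apart past a few hundred lights while the freshly compiled DX11 build didn't.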

Note to self: Do a full rebuild every now and then.
Glad you found the cause, pretty cool nonetheless though.
