Migi0027

DX11 - Extremely low performance

Hi guys.

 

So, my engine isn't the best or the fastest, but it gives some cool results. A scene with nothing but a plane (with parallax occlusion mapping) and a cube in the middle runs at around 30-36 fps, and without the POM at maybe 36-47 fps, which is terrible. To that POM scene I can add, as an example, a tropical plant of around 320 polygons, instanced 100-200 times within a short radius, for a loss of only 3-7 fps. So my scenes always seem to sit at ~30 fps when almost empty. In my actual game (real numbers), with 895 instances of different types of plants, a few other meshes, and a tessellated and displaced terrain, the frame rate varies between 17 and 23 fps, which is really slow!

 

So the real problem may not be in actually rendering the meshes, terrains, etc., but in something that happens before that, and I'm really not sure what. I've tried profiling, but it didn't really give me anything (maybe from a lack of experience; I used the profiler in Visual Studio). I thought it might be at the GPU stage, since I do have some heavy operations in the pixel shaders that maybe shouldn't be there. But as my fps is already as low as 30 before rendering more than 2 meshes, the problem 'area' should lie before the rendering.

 

I'm not sure what to give you, if anything, in terms of code, as I'm not sure where it happens, and I can't just supply you with ~(15000 - 35000) lines of code.

 

Any ideas what this might be? I do realize that there is almost nothing to go on, but I'm not really sure what I should supply you with. If you need anything, please say so.

 

Confession: I tend not to focus much on the performance of the code I write, and I guess it took its revenge.

PS: I'll be back within 12 hours.

 

Thanks, as always.

-MIGI0027

Edited by Migi0027

Profiling shows you which operations take the most time.

 

Generally:

  • Map/Unmap during a frame can slow you down if you don't use some kind of double- (or n-) buffering on the resources you map. If this is the source of the slowness, it is because the device (or your software) has to wait for access to data that is in use elsewhere at the same time.
  • The fewer API calls you make, the better, because the most critical calls (like most of the stuff on the immediate device context) cross the user/kernel boundary and the OS (and the processor) performs rigorous access checks on that boundary. This eats CPU time.
  • Excessive overdraw can eat your fillrate quickly; try to render the foreground first (to fill the depth buffer as early as you can), so that subsequent drawing can exit early when it is "behind" the existing depth.
  • If you use textures (especially large ones), use mipmaps as well for a huge difference in cache performance.
  • Some budget cards are simply slow; this will generally affect fillrate a lot, especially at high resolutions. To test this, lower the render target resolution to see if it makes a difference in the frame rate.
  • "Simple" pixel shaders can be internally very complex (if you use dynamic conditionals and stuff like that); be sure to inspect and understand the shader compiler output to check that there are no obviously slow parts in it (unless you actually need them).

This is not an exhaustive list, just some of the issues I've had to deal with in the past.
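
As a rough illustration of the overdraw point in the list above, here is a minimal sketch of ordering opaque draws front to back before submitting them. The DrawItem struct, its fields, and sortFrontToBack are hypothetical names made up for this example; they are not taken from the engine discussed in this thread, and a real renderer would probably also group by shader and state.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical draw record; not taken from the engine discussed in this thread.
struct DrawItem {
    int   meshId;     // illustrative handle for whatever gets drawn
    float viewDepth;  // distance from the camera along the view direction
};

// Order opaque items nearest-first so that the depth buffer is filled early
// and later, farther fragments can be rejected by the depth test.
inline void sortFrontToBack(std::vector<DrawItem>& items) {
    const auto nearer = [](const DrawItem& a, const DrawItem& b) {
        return a.viewDepth < b.viewDepth;
    };
    std::sort(items.begin(), items.end(), nearer);
    // Debug-only sanity check of the resulting order.
    assert(std::is_sorted(items.begin(), items.end(), nearer));
}
```

This only applies to opaque geometry; blended (transparent) geometry is usually drawn in the opposite order, back to front, after the opaque pass.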

 

FPS is a poor way to measure performance; milliseconds per frame is much more meaningful (because you want to know how much time a given operation takes). Time per frame is the reciprocal of the FPS.
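
To make the reciprocal relationship concrete, here is a small sketch that converts frame rates to frame times. The function names frameTimeMs and costMs are made up for this example.

```cpp
#include <cassert>

// Convert a frame rate (frames per second) to a frame time in milliseconds.
inline double frameTimeMs(double fps) {
    assert(fps > 0.0);  // a frame rate of zero has no finite frame time
    return 1000.0 / fps;
}

// Per-frame cost, in milliseconds, of whatever made the frame rate
// drop from fpsBefore to fpsAfter.
inline double costMs(double fpsBefore, double fpsAfter) {
    return frameTimeMs(fpsAfter) - frameTimeMs(fpsBefore);
}
```

With the numbers from the first post, dropping from 36 fps to 30 fps by enabling POM corresponds to roughly 5.6 ms per frame, while the nearly empty scene at 34 fps already takes about 29.4 ms per frame. Seen that way, the fixed per-frame cost is far larger than the cost of the POM.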

Edited by Nik02

Recently I found something that worries me, though I'm not sure why it is there. And what does the 'time' symbol next to the commands mean? (Time consuming?)

 

So when rendering the frame, it really looks like it's re-creating loads of objects, which I don't really understand. The CreateDomainShader..CreateHullShader calls fall within my highlighted area, but it's not really clear why they are reported. (Is it something that happened in the past?)

 

[Screenshot: 11jyons.png]

Yes, it's something that happened in the past. It's there so you can check it out even if it didn't happen in the frame that you are debugging.

 

PS: if you want to profile performance I suggest you try Intel GPA.

Edited by Mona2000

What frame rate do you get with no meshes in the frame?  It sounds like you have vsync enabled to start with - that might be the first place to make a change.  Can you describe your rendering setup a bit more?  For example, do you use deferred shading, generate gbuffers, or something else expensive?  How about MSAA?

In a completely empty scene I get 34-35 fps.

 

VSync is disabled and it's windowed. I'm following a deferred pipeline; the engine goes through these steps each frame:

  • Update PhysX and Lua
  • Update All Actors (Should be 0 in the empty scene)
  • if Shadows, render them
  • if GI, render it (Irrelevant here)
  • (Skipping some steps)
  • Render Scene (Outputs all the data to a gbuffer)
  • if SSAO,...
  • Render Glow map
  • Render Particles
  • Do the special full screen post processing (Bloom, Volumetric Effects ... Lighting)

I do believe that MSAA is disabled. I think that the CPU is actually ahead of the GPU, as it has to 'wait' for the GPU to complete its commands (the CPU waits on the ->Present), but that doesn't really help with the empty scene.
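
One way to narrow down which of the per-frame steps listed above costs the time is to wrap each step in a scoped timer and accumulate milliseconds per stage. The following is a minimal sketch, not the engine's actual code: StageProfiler, ScopedStageTimer, and the stage names in the comment are hypothetical. It measures CPU wall-clock time only. Because D3D11 commands are queued, a draw call returning quickly says little about its GPU cost, and a long wait in Present usually points at the GPU. Per-pass GPU times would need something like D3D11 timestamp queries (D3D11_QUERY_TIMESTAMP together with D3D11_QUERY_TIMESTAMP_DISJOINT) or a GPU profiler.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

using Clock = std::chrono::steady_clock;

// Hypothetical accumulator: stage name -> total CPU milliseconds spent.
struct StageProfiler {
    std::map<std::string, double> totalsMs;
};

// RAII timer: starts on construction, adds the elapsed time to the
// profiler entry for its stage on destruction.
class ScopedStageTimer {
public:
    ScopedStageTimer(StageProfiler& profiler, std::string stage)
        : profiler_(profiler), stage_(std::move(stage)), start_(Clock::now()) {}

    ~ScopedStageTimer() {
        const std::chrono::duration<double, std::milli> elapsed =
            Clock::now() - start_;
        assert(elapsed.count() >= 0.0);  // steady_clock never goes backwards
        profiler_.totalsMs[stage_] += elapsed.count();
    }

    ScopedStageTimer(const ScopedStageTimer&) = delete;
    ScopedStageTimer& operator=(const ScopedStageTimer&) = delete;

private:
    StageProfiler&    profiler_;
    std::string       stage_;
    Clock::time_point start_;
};

// Intended use inside a frame function (stage names are illustrative):
//   { ScopedStageTimer t(profiler, "UpdatePhysXAndLua"); /* update step */ }
//   { ScopedStageTimer t(profiler, "RenderScene");       /* gbuffer pass */ }
//   { ScopedStageTimer t(profiler, "Present");           /* Present call */ }
```

Dividing each total by the number of frames gives an average per-stage cost in milliseconds, which can be compared directly against the roughly 29 ms the empty scene takes.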

It would be nice to know your hardware, as performance metrics don't mean anything by themselves.

I barely hit 30 FPS on 1st-gen Atom systems. I'm good with it.


"So when rendering the frame, it really looks like it's re-creating loads of objects, which I don't really understand"
I'm sorry to say it, but if you've lost the conceptual integrity of your code, you'd better pick up an axe and start chopping.

 

I think your engine looks very much like mine did several years ago. It could do a lot of advanced things, but in the end it didn't have a robust, trustworthy core. I had to chop away all the features that did not fit. Years later, the current code base is still not functionally equivalent to the old one, although it has developed other traits which are far more desirable in my opinion.

Thanks.

 

I'm on a GTX 560M, able to run BF4 on Medium-High.

 

The re-creation of objects was a false alarm: it didn't actually happen. I just misinterpreted the debug information, so I guess I'm still good on that side.

Are you running the debug version of D3D?

 

Cheers!

 

Nope, the debug version slows it down even further.

 

Is there any kind of information that I can supply you with? I feel like you have nothing to go on...

 

-MIGI0027
