keinmann

DX11 Engine design questions...


Hey guys,

Porting my old code base over to SlimDX (using exclusively DX11) has been going quite well, but I'm not liking the way some of my old, naive code looks. So many things changed going from DX9/XNA to DX11 that it makes more sense to throw the old junk away and write completely new code. That has left me with a few questions I'd like some peer feedback on.

FYI, I'm trying my best to abstract away a lot of the pain involved with writing straight DX applications with this engine, but not so much that those who WANT lots of nit-picky features can't get to them (not an easy feat, and sometimes not possible).

1) Graphics Settings :

Any engine worth its salt has a fairly intuitive way of changing graphics settings on the fly, often via configuration files, which lets programmers create nice little graphics settings/options menus for their games. I'm brand new to the DX11/SlimDX API, so I'm probably missing a lot of stuff and just not seeing it. I'm from the days of "PresentationParameters" and all that jazz. :) So I'm wondering, to begin with, what range of graphics settings/options will be vital to a SlimDX/DX11-based engine? What things do I need to look up and learn about to make an effective graphics settings system? Any particular classes in the API that can help me not reinvent the wheel (as I often do when I'm ignorant of an API's features)? What things are/may be rarely used and not very important to include?

One other question that's been bugging me is about the ModeDescription property for SwapChainDescriptions, concerning their refresh rates. Obviously, when windowed, your application can effectively use any resolution which will fit in the display bounds and match the window size. But what about the refresh rate? Do we have to perfectly match the display mode's refresh rate in windowed mode, or is it only a requirement in full-screen? I remember the DX11 doc said it's vital to get the refresh rate correct in full-screen.

Also, are there any examples on the proper way to switch graphics (Device/SwapChain) settings at runtime? I'm not really sure if the way I switch from windowed to fullscreen is a "good" way of doing things lol. The limited samples and incomplete docs have left me guessing on a lot of things. :P

2) "Main Loop" and timing:

I've already implemented a new, internal clock/timer class which is very similar to the one buried in the XNA framework, although it's lighter and cleaner imho. I've tested it and it's as accurate as it gets on my system. In XNA, the "Game" class is fixed-timestep by default, and has a target update interval of 1/60th of a second (roughly 16.67 ms). Question is, what is the best way to ensure you stay within a reasonable tolerance of your update interval? What if the loop completes super fast and you have only a tiny bit of elapsed time between updates? Do you intentionally wait a bit and try to adjust the speed? Or should you just rock out the Update(...) calls as fast as they can go?

I'm also planning to add the ability to have parallel update and rendering loops on separate threads; each keeping their own time and handling their own problems. So I'm wondering if anyone has some advice for me on this front as well.

Wow, big post. Any input/tips/advice will be greatly appreciated. I'm just trying to figure out where I'm going with my ideas and what things I should be aware of before I make big, time-consuming mistakes!

Quote:
Original post by keinmann
What things do I need to look up and learn about to make an effective graphics settings system?
I'm not particularly familiar with the API, but DX11 has "feature levels" that describe what the graphics card is capable of. If you're using new features added in DX10/11 but want to have fallbacks for older cards, make sure you check the card's feature level.
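Something along these lines should work for querying it up front. This is an untested sketch; I'm assuming SlimDX exposes a static Device.GetSupportedFeatureLevel() and a FeatureLevel enum in SlimDX.Direct3D11, so double-check the names against your SlimDX version:

using SlimDX.Direct3D11;

static class FeatureLevelCheck
{
    // Query the highest feature level the default adapter supports before
    // creating the real device, then pick a render path from it.
    public static void ChooseRenderPath()
    {
        FeatureLevel highest = Device.GetSupportedFeatureLevel();

        if (highest >= FeatureLevel.Level_11_0)
        {
            // Full DX11 path: tessellation, compute shaders, etc.
        }
        else if (highest >= FeatureLevel.Level_10_0)
        {
            // DX10-class fallback: no hull/domain shaders, limited compute.
        }
        else
        {
            // Feature level 9_x hardware: most restrictive path.
        }
    }
}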
Quote:
Question is, what is the best way to ensure you stay within a reasonable tolerance of your update interval? What if the loop completes super fast and you have only a tiny bit of elapsed time between updates? Do you intentionally wait a bit and try to adjust the speed? Or should you just rock out the Update(...) calls as fast as they can go?
A lot of games use a fixed time-step, where Update is only called once 16.67 ms have elapsed. If 33.3 ms have elapsed, you call Update twice, and so on.
If less than 16.67 ms have elapsed, you hand your time-slice back to the scheduler.
On Windows there are functions like YieldProcessor, SwitchToThread, Sleep, etc. for this.
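For illustration, here's a rough managed-code sketch of the "wait when you're ahead of schedule" part only (the accumulator pattern further down handles the leftover time properly). Update() and Render() are hypothetical placeholders for your own engine calls, and note that Thread.Sleep(1) has coarse granularity on Windows, so treat the timing as approximate:

using System.Diagnostics;
using System.Threading;

class GameLoop
{
    const double TargetMs = 1000.0 / 60.0;            // ~16.67 ms target interval
    readonly Stopwatch frameTimer = Stopwatch.StartNew();
    bool running = true;

    // Placeholders; substitute your engine's own update/render calls.
    void Update() { }
    void Render() { }

    public void Run()
    {
        while (running)
        {
            if (frameTimer.Elapsed.TotalMilliseconds < TargetMs)
            {
                // Ahead of schedule: hand the time-slice back instead of
                // spinning at 100% CPU. Thread.Sleep(0) or SwitchToThread
                // are lighter-weight alternatives to Sleep(1).
                Thread.Sleep(1);
                continue;
            }

            frameTimer.Restart();
            Update();
            Render();
        }
    }
}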
Quote:
I'm also planning to add the ability to have parallel update and rendering loops on separate threads; each keeping their own time and handling their own problems. So I'm wondering if anyone has some advice for me on this front as well.
If you go down this path, I'd keep them separated and communicate via double-buffered state. To do this, you have two 'communication' structures. At any one time, the update thread is writing to one and the render thread is reading from the other. At the end of each frame, the threads sync up and swap structures with each other.
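A minimal sketch of that hand-off, assuming a hypothetical FrameState type holding whatever the renderer consumes (transforms, visibility lists, draw commands, ...) and using .NET 4's Barrier for the per-frame sync point:

using System.Threading;

class FrameState { /* per-frame data handed from update to render */ }

class DoubleBufferedState
{
    FrameState updateSide = new FrameState();   // written by the update thread
    FrameState renderSide = new FrameState();   // read by the render thread
    readonly Barrier frameEnd;

    public DoubleBufferedState()
    {
        // Both threads signal the barrier at the end of their frame; the
        // post-phase action runs exactly once, swapping the two buffers.
        frameEnd = new Barrier(2, _ =>
        {
            var tmp = updateSide;
            updateSide = renderSide;
            renderSide = tmp;
        });
    }

    public FrameState UpdateSide { get { return updateSide; } }
    public FrameState RenderSide { get { return renderSide; } }

    // Each thread calls this when its frame is done; it blocks until the
    // other thread arrives, then both continue with swapped buffers.
    public void EndOfFrame() { frameEnd.SignalAndWait(); }
}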

However, this isn't the most scalable approach to threading -- with one update thread and one render thread you're effectively targeting a dual-core processor. Quad-core is standard on PCs now, and consoles have had ~6 hardware threads for years.
Another approach is the task-pool model, where you make one worker thread per core (or per hardware thread), and then break up your update and render code into lots of isolated tasks that communicate via asynchronous message passing.
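The skeleton of such a pool is simple enough. A rough sketch using BlockingCollection (a real engine layers task dependencies, per-frame sync points and message passing on top of this):

using System;
using System.Collections.Concurrent;
using System.Threading;

class TaskPool : IDisposable
{
    readonly BlockingCollection<Action> tasks = new BlockingCollection<Action>();
    readonly Thread[] workers;

    public TaskPool()
    {
        // One worker per hardware thread, each pulling work off a shared queue.
        workers = new Thread[Environment.ProcessorCount];
        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = new Thread(() =>
            {
                // Blocks until a task is available or the pool is shut down.
                foreach (var task in tasks.GetConsumingEnumerable())
                    task();
            }) { IsBackground = true };
            workers[i].Start();
        }
    }

    public void Enqueue(Action task) { tasks.Add(task); }

    public void Dispose()
    {
        tasks.CompleteAdding();               // lets workers drain and exit
        foreach (var w in workers) w.Join();
    }
}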

Quote:
Original post by keinmann
Question is, what is the best way to ensure you stay within a reasonable tolerance of your update interval? What if the loop completes super fast and you have only a tiny bit of elapsed time between updates? Do you intentionally wait a bit and try to adjust the speed? Or should you just rock out the Update(...) calls as fast as they can go?


A common way is to use an accumulator.

float accumulator;
const float TimeStep = 0.2f;   // fixed step in seconds (use 1f / 60f for the 60 Hz rate discussed above)

public void Frame(float dt)
{
    accumulator += dt;

    // Run as many fixed-size updates as the elapsed time allows;
    // any remainder stays in the accumulator for the next frame.
    while (accumulator >= TimeStep)
    {
        accumulator -= TimeStep;
        Update(TimeStep);   // fixed timestep update
    }
}
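You call Frame(dt) once per rendered frame with the measured delta; whatever time doesn't fit a whole step carries over to the next frame, so the simulation advances at a fixed rate while rendering runs as fast as it likes.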

Thanks for the answers, everyone. And I like the article, Dranith. I've also used a variable time-step and just let it run as fast as possible, but now I know exactly how to implement a reliable fixed-timestep system. I think it will be important given the accuracy I demand of the physics, especially the aerodynamics.

