
SoldierOfLight

Member
  • Content Count

    320
Community Reputation

2278 Excellent

1 Follower

About SoldierOfLight

  • Rank
    Member

Personal Information

  • Role
    Programmer
  • Interests
    Programming

  1. SoldierOfLight

    Drawing multiple objects

    Do you really set your vertex/index buffers *after* you issue the draw call?
  2. I'm afraid I have no idea what SharpDX does differently in debug or release configurations. The only ways I know of to programmatically change this behavior are: 1. Changing your application EXE name. 2. Exporting the vendor-specific globals that change this behavior.
  3. There's nothing you can do. Fullscreen exclusive mode only supports monitors connected to the GPU where you're rendering. The one exception is in hybrid laptops, where the fullscreen mode has a cross-adapter copy before it hits the screen, but this scenario is only available if the adapter enumeration says that the monitor is connected to the discrete GPU, which generally happens based on IHV control panel settings or specifically-named exports from your EXE. Your only option is a borderless windowed approach if you want to explicitly target a different render adapter from the display.
  4. SoldierOfLight

    Silly Input Layout Problem

    You're passing a hardcoded 3 as the number of elements. Maybe try _countof(ied) or ARRAYSIZE(ied).
  5. Regarding the initial wait, the documentation for the waitable object does clearly call it out:

    "Step 4: Wait before rendering each frame. Your rendering loop should wait for the swap chain to signal via the waitable object before it begins rendering every frame. This includes the first frame rendered with the swap chain."

    As for your questions:

    1) Correct, "Hardware [Composed]: Independent Flip" means that the application buffer is directly scanned out.

    2) The only alternative is the legacy fullscreen flip. Both of these indicate no copies, with your buffer going straight to screen.

    3) This is probably the amount of time after you called the Present API, while your GPU work is getting processed by the driver and hardware, and then waiting for the VBlank to actually flip it, and then the graphics scheduler acknowledging that the flip occurred.

    4) PresentMon definitely monitors swapchains, not windows. So it's not that it suddenly switched to monitoring a standard GDI window. Unfortunately, independent flip state is something that's managed by the desktop compositor, so I don't really have any additional insight into why you'd exit it.
  6. SoldierOfLight

    Why we have a lot of 3d graphic APIs?

    OpenGL should really transition into this! I'll also take this opportunity to plug Direct3D11On12. While it's not (yet) open-source, it is otherwise basically what you asked for, a stable higher-level API which translates to a lower level API which is also accessible.
  7. So, in direct response to your questions:

    1) Depending on your driver version, SetFullscreenState may do one of two things: it may simply maximize your window and try to get independent flip, or it may actually take exclusive ownership of the display. The semaphore used to implement the waitable object is only present in the former case, not the latter. Note that DX12 always implements fullscreen as the former, and therefore the restriction of this API is not present there.

    2) No, you don't. The performance characteristics of an app which is in independent flip vs exclusive fullscreen should be identical. And as you noticed, on newer drivers, SetFullscreenState may not even do anything extra compared to a borderless window with FLIP_DISCARD swap effect.

    3) Not sure, I'd need to see a trace of that scenario to tell you why it's higher latency.

    4) Independent flip (or legacy fullscreen flip, i.e. exclusive fullscreen) is the best you can hope for.

    Regarding your later tests, that should be able to get you down below 16ms. Have you measured the amount of GPU work that you're submitting? If you sleep for 15ms, but take longer than 1ms from the point of starting your rendering on the CPU to completing it on the GPU, you'll miss your VBlank and you'll end up with your frame queued for nearly another whole VBlank. You'd also end up at ~30fps, because your wait for the waitable object would block until the flip actually happened.

    Alternatively, are you waiting for the waitable object before the first frame? If you're not, you'll always have an extra frame of latency inherent in the system.
  8. SoldierOfLight

    Issues with D3D12CreateDevice

    According to https://en.wikipedia.org/wiki/Feature_levels_in_Direct3D#Direct3D_12, AMD doesn't support the HD 5800 series for D3D12; it's a GPU architecture that's too old to support the GPU virtual addressing functionality required by D3D12 and WDDM2.0 in general.
  9. SoldierOfLight

    CopyResource with BC7 texture

    You might want to read through https://docs.microsoft.com/en-us/windows/desktop/direct3d10/d3d10-graphics-programming-guide-resources-block-compression.
  10. SoldierOfLight

    Questions about D3D12_INPUT_ELEMENT_DESC

    The semantic names of non-system-value inputs to the vertex shader don't have any meaning. It's a holdover from DX9, where the names did actually matter, that a lot of apps use POSITION or TEXCOORD.

    AlignedByteOffset of 1 probably won't work, unless your data is a single byte. The 'aligned' in the name means that the value needs to be aligned to the size of the element you're trying to read. So for a 4-byte value, a value of 0, 4, or 8 is valid, but 1 is not. Unless you're intentionally leaving space for data that a different input layout will reference (e.g. a shadow pass only needs one value, but the same vertex buffer will also be used for real rendering with more data), you should just use the APPEND_ALIGNED_ELEMENT value.

    I'll let someone else comment on best practices.
  11. SoldierOfLight

    CBVs in Descriptor Heap?

    Positive. A root parameter of type CBV is a root CBV. A root parameter of type descriptor table contains ranges, which may include CBVs in those ranges.
  12. SoldierOfLight

    CBVs in Descriptor Heap?

    That says that you made root parameter 0 a root CBV, not a descriptor table of CBVs.
  13. SoldierOfLight

    How to create large upload buffers?

    Sounds like the problem is coming from the driver. What driver(s) have you tried this on?
  14. Depends on the sizes we're talking about here. Generally, using Map(WRITE_DISCARD) on a CB results in re-allocating the entire buffer. If you're talking about a few bytes per texture, then it's probably more efficient to cram them all together, as individual CBs would waste a lot of memory in padding. If you're talking about a few hundred bytes per texture, it probably makes more sense to only re-allocate/bind the one that's relevant at the moment. The actual hardware implications of "binding" CBs and textures vary across different hardware, even from the same vendor, but generally binding a CB is very cheap, while binding textures is less cheap. If you're interested in not needing to change texture bindings as often, you can take a look at DX12 with its bindless (read: bind fewer times, not never) options. But I'd recommend profiling before deciding that changing your bindings is the bottleneck.
  15. You're probably looking for ID3D11Fence. Treat your D3D11 device as if it was another D3D12 command queue.
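
A quick illustration of the "Silly Input Layout Problem" reply (item 4): deriving the element count from the array means it can't drift out of sync with a hardcoded number. This is only a sketch. `InputElementDesc` is a hypothetical stand-in for the real descriptor struct so the snippet compiles without the Windows SDK, and the element list is made up. `std::size` is the standard C++17 counterpart of `_countof` / `ARRAYSIZE`.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical stand-in for the D3D input element descriptor,
// reduced to two fields so this compiles on any platform.
struct InputElementDesc {
    const char* semanticName;
    unsigned    alignedByteOffset;
};

// Four elements: a hardcoded count of 3 would silently drop the last one.
static const InputElementDesc ied[] = {
    {"POSITION",  0},
    {"NORMAL",   12},
    {"TEXCOORD", 24},
    {"COLOR",    32},
};

// Let the compiler count the array instead of typing a number by hand.
constexpr std::size_t kNumElements = std::size(ied);
```

In real code, the computed count would be passed wherever the element count is expected, in place of the literal.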
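
For the latency answer in item 7, the "sleep 15ms, then render" arithmetic can be sketched as below. It assumes a simplified model that is not from the original post: the waitable object signals exactly at a VBlank (time 0), and the frame flips at the first VBlank at or after the GPU finishes. `EstimateFlip` and `FrameEstimate` are hypothetical names used only for illustration.

```cpp
#include <cmath>

struct FrameEstimate {
    int    vblanksUntilFlip;  // 1 means the frame is shown on the very next VBlank
    double queuedMs;          // time the finished frame waits for its VBlank
};

// refreshMs: time between VBlanks (about 16.67 ms at 60 Hz)
// sleepMs:   how long the app sleeps after the waitable object signals
// workMs:    time from starting CPU rendering to GPU completion
inline FrameEstimate EstimateFlip(double refreshMs, double sleepMs, double workMs) {
    const double doneAt = sleepMs + workMs;
    // First VBlank at or after the moment the frame is finished.
    int vblanks = static_cast<int>(std::ceil(doneAt / refreshMs));
    if (vblanks < 1) vblanks = 1;
    return {vblanks, vblanks * refreshMs - doneAt};
}
```

Under this model, at 60 Hz with a 15ms sleep, 1ms of work makes the next VBlank, while 2ms of work misses it and leaves the frame queued for roughly another full interval, which is the ~30fps case described in the post.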
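
The alignment rule from the D3D12_INPUT_ELEMENT_DESC answer (item 10) can be written as a small check. This follows the rule exactly as the post states it (offset must be a multiple of the size of the value being read). Whether the runtime applies it per component or per whole element is not confirmed here. Both helper names are hypothetical, and `AppendAligned` only illustrates the idea of an "append at the next aligned offset" sentinel, not the runtime's actual implementation.

```cpp
#include <cstdint>

// True when byteOffset is a multiple of elementSize, per the rule in the post.
constexpr bool IsAlignedOffset(std::uint32_t byteOffset, std::uint32_t elementSize) {
    return elementSize != 0 && byteOffset % elementSize == 0;
}

// Rounds the end of the previous element up to the next multiple of `alignment`.
// Conceptually what an append-aligned sentinel asks the runtime to do for you.
constexpr std::uint32_t AppendAligned(std::uint32_t previousEnd, std::uint32_t alignment) {
    return alignment == 0
        ? previousEnd
        : (previousEnd + alignment - 1) / alignment * alignment;
}

// The post's example: for a 4-byte value, 0, 4, and 8 are valid but 1 is not.
static_assert(IsAlignedOffset(0, 4) && IsAlignedOffset(4, 4) && IsAlignedOffset(8, 4), "aligned");
static_assert(!IsAlignedOffset(1, 4), "offset 1 is misaligned for a 4-byte value");
```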