Community Reputation

1686 Excellent

1 Follower

About Klutzershy

  1. In Vulkan, you can query each physical device (adapter) for the queue types it supports and the number of queues of each type.  What is the equivalent in D3D12?  I see lots of apparently conflicting information on MSDN and the rest of the web.  Sometimes, it seems D3D12 supports one queue of each type, and other times, it seems it supports more, particularly for compute and copy queues.  Are you just supposed to call ID3D12Device::CreateCommandQueue until it fails, like how you enumerate adapters by calling IDXGIFactory1::EnumAdapters1?  Do you create any number of logical queues of each type and let the driver handle scheduling them to the actual physical queues?  It's all very confusing.
  2. PIMPL (Pointer to IMPLementation, or "opaque pointer") is an idiom for when you need "super" encapsulation of a class's members: the class definition doesn't have to declare privates, or suffer the #include bloat or forward-declaration boilerplate they entail. It can also save you some recompilations, and it's useful for dynamic linkage because it doesn't impose a hidden ABI on the client, only the one that is also part of the API.

     Typical exhibitionist class:

         // Foo.hpp
         #include <...> // header that defines Dongle

         class Foo {
         public:
             Foo(int);

         private:
             // how embarrassing!
             Dongle dongle;
         };

         // Foo.cpp
         Foo::Foo(int bar) : dongle(bar) {}

     Now, it's developed a bad case of PIMPLs and decides to cover up:

         // Foo_PIMPL.hpp
         class Foo {
         public:
             // API stays the same...
             Foo(int);

             // with some unfortunate additions...
             ~Foo();
             Foo(Foo const &);
             Foo &operator =(Foo const &);

         private:
             // but privates are nicely tucked away!
             struct Impl;
             Impl *impl;
         };

         // Foo_PIMPL.cpp
         #include <...> // header that defines Dongle

         struct Foo::Impl {
             Dongle dongle;
         };

         Foo::Foo(int bar) {
             impl = new Impl{Dongle{bar}}; // hmm...
         }

         Foo::~Foo() {
             delete impl; // hmm...
         }

         Foo::Foo(Foo const &other) {
             // oh no
         }

         Foo &Foo::operator =(Foo const &other) {
             // I hate everything
         }

     PIMPL has a couple of big caveats, of course: you need dynamic memory allocation and suffer a level of pointer indirection, plus you have to write a whole bunch of boilerplate! In this article I will propose something similar to PIMPL that does not require this sacrifice, and has (probably) no run-time overhead compared to using standard private members.

     [subheading]Pop that PIMPL![/subheading]

     So, what can we do about it? Let's start by understanding why we need to put private fields in the header in the first place. In C++, every class can be a value type, i.e. allocated on the stack. In order to do this, we need to know its size, so that we can shift the stack pointer by the right amount. Every allocation, not just on the stack, also needs to be aware of possible alignment restrictions.

     Using an opaque pointer with dynamic allocation solves this problem, because the size and alignment needs of a pointer are well-defined, and only the implementation has to know about the size and alignment needs of the encapsulated fields. It just so happens that C++ already has a very useful feature to help us out: std::aligned_storage in the standard <type_traits> header. It takes two template parameters - a size and an alignment - and hands you back an unspecified structure that satisfies those requirements. What does this mean for us? Instead of having to dynamically allocate memory for our privates, we can simply alias them with a field of this structure, as long as the size and alignment are compatible!

     [subheading]Implementation[/subheading]

     To that end, let's design a straightforward structure to handle all of this somewhat automagically. I initially modeled it to be used as a base class, but couldn't get the inheritance of the opaque Impl type to play well. So I'll stick to a compositional approach; the code ended up being cleaner anyways. First of all, we'll template it over the opaque type, a size and an alignment. The size and alignment will be forwarded directly to an aligned_storage.

         #include <cstddef>
         #include <type_traits>

         template <typename Impl, std::size_t Size, std::size_t Align>
         struct Pimpl {
             typename std::aligned_storage<Size, Align>::type mem;
         };

     For convenience, we'll overload the dereference operators to make it look almost like we're directly using the Impl structure.

         Impl &operator *() { return reinterpret_cast<Impl &>(mem); }
         Impl *operator ->() { return reinterpret_cast<Impl *>(&mem); }
         // be sure to add const versions as well!

     The last piece of the puzzle is to ensure that the user of the class actually provides a valid size and alignment, which ends up being quite trivial:

         Pimpl() { static_assert(sizeof(Impl)
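To see the in-place storage idea from the post above end to end, here is a minimal sketch under a few assumptions. The names InplacePimpl and Widget, and the 16/8 size and alignment guesses, are hypothetical and not from the article. It also uses an alignas byte buffer in place of std::aligned_storage (which C++23 deprecates) and plain reinterpret_cast as the article does; C++17 code may prefer to wrap the cast in std::launder. Copying is deliberately left out.

```cpp
#include <cstddef>
#include <new>
#include <utility>

// Raw, suitably aligned bytes that hold an Impl constructed in place.
template <typename Impl, std::size_t Size, std::size_t Align>
class InplacePimpl {
public:
    template <typename... Args>
    explicit InplacePimpl(Args&&... args) {
        // These checks fire where the constructor is instantiated,
        // i.e. in the .cpp where Impl is a complete type.
        static_assert(sizeof(Impl) <= Size, "Size is too small for Impl");
        static_assert(Align % alignof(Impl) == 0, "Align is too weak for Impl");
        ::new (static_cast<void*>(mem_)) Impl(std::forward<Args>(args)...);
    }
    ~InplacePimpl() { get()->~Impl(); }

    // Copy support is omitted from this sketch.
    InplacePimpl(const InplacePimpl&) = delete;
    InplacePimpl& operator=(const InplacePimpl&) = delete;

    Impl* operator->() { return get(); }
    const Impl* operator->() const { return get(); }
    Impl& operator*() { return *get(); }
    const Impl& operator*() const { return *get(); }

private:
    Impl* get() { return reinterpret_cast<Impl*>(mem_); }
    const Impl* get() const { return reinterpret_cast<const Impl*>(mem_); }

    alignas(Align) unsigned char mem_[Size];
};

// --- what would live in a header (hypothetical Widget class) ---
class Widget {
public:
    explicit Widget(int start);
    ~Widget();           // defined where Impl is complete
    int bump();          // increments and returns the hidden counter

private:
    struct Impl;                       // opaque to clients
    InplacePimpl<Impl, 16, 8> impl_;   // hand-picked size/alignment, verified below
};

// --- what would live in the matching .cpp ---
struct Widget::Impl {
    int count;
    double scale;
    explicit Impl(int c) : count(c), scale(2.0) {}
};

Widget::Widget(int start) : impl_(start) {}
Widget::~Widget() = default;
int Widget::bump() { return ++impl_->count; }
```

A client only ever sees the header part, so `Widget w(41); w.bump();` compiles without knowing what Impl contains. If Impl later outgrows 16 bytes, the static_assert in the constructor stops the build instead of silently overrunning the buffer.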
  3. First, let's boil it down a bit. Reference counting is out of scope here; the issue at hand is what happens when you do decide to delete something.

     Consider a simplistic rendering loop that renders, presents, and then waits for the device to be idle each frame. After the device is idle, but before anything is rendered, you can be sure that destroying a resource is safe, as long as you don't try to use it again. So you can either defer all logic that would ever delete a resource until that time, or allow a deletion to be requested at any time but put it in a buffer to be "played back" in the safe window.

     Fairly trivially, this can be extended to a pipeline that is multiple frames deep. Let's go with 3, for the sake of example. This means you might have rendering commands in flight up to 2 frames "behind", and you have to take that into account whenever you want to delete a resource. The easiest thing to do in this case, IMO, is to keep one growable array per frame in flight (3 here); each one records the deletion requests issued while the CPU was processing that frame. Each frame also has a fence associated with it. Whenever a new frame starts, it waits on its fence and then, before rendering anything, deletes the recorded resources and clears the array so new requests can be added. When submitting the commands for that frame, you ask for the fence to be signaled when the submission completes. The fence ensures that the CPU never gets too far ahead of the GPU and causes the renderer to trip over itself, so you never try to delete a resource that the GPU is still using.

     This concept of shifting by the number of frames in your pipeline applies to destructive updates, such as rendering to a framebuffer, as well. For N frames, you will need N copies of each attachment, and you cycle through them to determine which copy you are writing to.
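The per-frame deletion lists described in the post above could look roughly like this. It is an API-agnostic sketch: DeferredDeleter, request, beginFrame and the waitForFence callback are all hypothetical names, and the callback stands in for whatever fence wait the graphics API provides. It assumes fences start out signaled, so waiting on a slot that was never submitted returns immediately.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

template <std::size_t FramesInFlight>
class DeferredDeleter {
public:
    // Call at the start of every frame. Advances to the next slot, waits for
    // the GPU to finish the work last submitted from that slot, then plays
    // back the deletions that were requested while it was being recorded.
    void beginFrame(const std::function<void(std::size_t)>& waitForFence) {
        current_ = (current_ + 1) % FramesInFlight;
        waitForFence(current_);  // hypothetical hook: block on this slot's fence
        for (auto& destroy : pending_[current_]) {
            destroy();
        }
        pending_[current_].clear();
    }

    // Queue a deletion at any time; it runs once this slot comes around again.
    void request(std::function<void()> destroy) {
        pending_[current_].push_back(std::move(destroy));
    }

    std::size_t currentSlot() const { return current_; }

private:
    std::array<std::vector<std::function<void()>>, FramesInFlight> pending_{};
    // Start on the last slot so the first beginFrame() lands on slot 0.
    std::size_t current_ = FramesInFlight - 1;
};
```

With three frames in flight, a deletion requested during frame 0 runs at the start of frame 3, after the fence for slot 0 has been waited on. The same slot index can pick which of the N attachment copies to write to.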
  4. Klutzershy

    Vulkan render pass questions

    It's not meant for post-processing in general.  It's meant for optimizations that tiled GPUs in particular can make when they have guarantees, such as a fragment only reading an attachment at the exact position it is rendering.   I imagine merging everything into a single shader would have some downsides compared to doing it within the render pass framework, or we wouldn't have render passes in the first place.
  5. Klutzershy

    Vulkan is Next-Gen OpenGL

    Aha!  I finally managed to solve the problem!   I was reading this page more closely, and it turns out that the loader needs to be able to find the layers using registry keys that indicate the VkLayer_xxx.json manifests.  I added them and it all works perfectly now, even through MinGW instead of VS2015!
  6. Klutzershy

    Vulkan is Next-Gen OpenGL

    I did recompile all the loader and layer libraries from source using the VS 2015 compiler, actually, and am using the debug versions.  Though I guess since they're MSVC-generated a lot or all of the debugging information is only in the PDBs?  Maybe I should give up on trying to use MinGW for this...
  7. Klutzershy

    Vulkan is Next-Gen OpenGL

    Yes, the exit would always have happened after a pipeline flush.  I think the issue ended up being that the image was still owned by the presentation engine rather than the application when I tried to destroy the swapchain, but it was hard to tell for sure because even completely unrelated things seemed to fix it, like creating an already-signaled fence at initialization and then waiting on it before termination, or just stepping through a debugger and putting more time between the last vkQueuePresentKHR and vkDestroySwapchainKHR.   If I've learned anything from my efforts today, it's that Vulkan development, especially if you're learning mostly by trial and error, is next to impossible if you can't use the validation layers.  And with no existing knowledge base, it's trial and error to get the layers working in the first place... I have no idea if it's because I'm on Windows 7, or because I have AMD hardware, or because I use MinGW and not MSVC, or what.
  8. Klutzershy

    Vulkan is Next-Gen OpenGL

    I'm probably not doing myself any favours here by using MinGW-w64...   That being said, it's kind of irritating to require us to build your project using CMake (and download Python), but only support Visual Studio anyways.  Downloading and installing that beast is a huge investment just to compile a single library.   EDIT: Oh man, am I glad I found out you can get the VC++ build tools without downloading the whole kit and caboodle.   EDIT 2: This is ridiculously frustrating.  If I enable the debug layers, I get a segfault from vkResetFences, which I have not called explicitly.
  9. Klutzershy

    Vulkan is Next-Gen OpenGL

    So is compiling at least the validation layers unavoidable?  Although it's reported that all the layers are available, any time I try to load even one of them (not just the whole VK_LAYER_LUNARG_standard_validation), I get a segfault (from vkGetInstanceProcAddr with SDK, from somewhere in the Vulkan DLL with SDK).   I'm kind of at a standstill, because my application started crashing on exit ever since I started clearing the swapchain images.  But of course, I can't validate anything to find out what's going on.
  10. Klutzershy

    LOD on a (cubical) voxel engine

    One option, as an alternative to dealing with mesh LOD, is to hybridize: use polygonal rendering for near terrain and actual voxel rendering for distant terrain.   Since your terrain is already made of cubes, the difference at the boundary will be harder to notice.
  11. Klutzershy

    Brain Dead Simple Game States

    @cdoty: Initialization may be done in the constructor and clean up in the destructor.  It's not a finite state machine - the states are created ad hoc just before use instead of kept around and simply switched to.  Of course, in a managed language like the article assumes, you would likely want to use a clean up method, because you have no idea when a destructor will be called by the GC.
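In a language with deterministic destruction, the constructor/destructor approach from the reply above might be sketched like this. GameState and MenuState are hypothetical names, and the log vector only exists to make the order of events visible. A managed language would swap the destructor for an explicit clean-up method, as the reply notes.

```cpp
#include <string>
#include <vector>

// Minimal state interface: a state exists only while it is in use.
struct GameState {
    virtual ~GameState() = default;
    virtual void update() = 0;
};

class MenuState : public GameState {
public:
    // Initialization happens in the constructor...
    explicit MenuState(std::vector<std::string>& log) : log_(log) {
        log_.push_back("menu: init");
    }
    // ...and clean-up in the destructor, which runs at a known point.
    ~MenuState() override { log_.push_back("menu: cleanup"); }

    void update() override { log_.push_back("menu: update"); }

private:
    std::vector<std::string>& log_;
};
```

Creating a MenuState just before it is needed and letting it go out of scope afterwards gives the ad hoc lifetime described above, with no separate init or shutdown calls to forget.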
  12. Klutzershy

    Vulkan is Next-Gen OpenGL

    Not to mention that you could expose the "low-hanging fruit" of Vulkan optimizations (command buffers, threading) without having the API client delve into memory management or manual buffering.
  13. Klutzershy

    Vulkan Resources

    It's a hell of a lot easier than setting up the infrastructure for double (or more) buffering, where you need to ensure that you don't change/delete things that are being used for the previous frame.  You still need to issue a wait every 2 frames for double buffering (3 for triple buffering, and so on), but in practice, it's just there as a safeguard for pathological scenarios and should typically be a no-op.
  14. Klutzershy

    Vulkan is Next-Gen OpenGL

    Can confirm, I can get the samples running with the driver update on my R9 390.
  15. Klutzershy

    Vulkan is Next-Gen OpenGL

    Yep, that works fine.  Says my API version is 0.0.1, perhaps that has something to do with it?