About ZachBethel

  1. Quick question, Hodgman: to support the simple <8-bit handles, are you using some kind of aggregating resource database structure that sits alongside a set of draw items? Is that just a set of arrays (i.e., are the handles just indices into that local array)? I seem to remember looking at some code you'd published a while back, and it seemed like you were aggregating all the draw items and their resources into a single data stream.
  2. ZachBethel

    dx11 api dx12 implementation

    Is this what you were looking for? https://github.com/Microsoft/DirectX-Graphics-Samples/tree/master/Samples/Desktop/D3D1211On12
  3. I have a rather specific question. I'm trying to learn about linked multi-GPU in Vulkan 1.1; the only real source I can find (other than the spec itself) is the following video:

    Anyway, each node in the linked configuration gets its own internal heap pointer. You can swizzle the node mask to your liking to make one node pull from another's memory. However, the only way to perform the "swizzling" is to bind a new VkImage / VkBuffer instance to the same VkDeviceMemory handle (but with a different node configuration). This effectively aliases the memory between two instances with identical properties.

    I'm curious whether this configuration requires special barriers. How do image barriers work in this case? Does a layout transition on one alias automatically affect the other? I'm coming from DX12 land, where placed resources require custom aliasing barriers and each placed resource has its own independent state. It seems like Vulkan functions a bit differently. Thanks.
  4. Hey all, I'm looking into building a streaming system for mipped images. I'm referencing the DirectX sample for memory management here: https://github.com/Microsoft/DirectX-Graphics-Samples/tree/master/TechniqueDemos/D3D12MemoryManagement

    I have a couple of related questions. I'm leaning towards also using tiled resources for mips, mainly because that lets me avoid invalidating my pre-cooked descriptor tables any time an image updates; otherwise I would effectively have to create a new ID3D12Resource with more or fewer mip levels whenever a stream or trim event occurs, respectively. Has anyone had success using tiled resources, or noticed any performance impact from the page table indirection? Also, I noticed that tiled resource tier 1 doesn't support mip clamping. Are there workarounds (in the shader, for example) for limiting the mip level in cases where a mip isn't resident? Or am I required to create a new view mapped to the resident subset? That would also require rebaking my descriptor tables, which I would like to avoid.

    My second question is how to handle the actual updates. I would like to use a copy queue to stream contents up to the GPU. I see a couple of approaches here:

      1. Create a device-local staging image and run my async copy job to upload to it. This happens in parallel with the current frame, which uses the existing image. At the beginning of the next frame (on the graphics queue) I blit from the staging memory to the newly resident mip, and then use the full mip chain for rendering.

      2. Use sub-resource transitions to put part of the image into an SRV state and the other part into a Copy Destination state. The async copy queue uploads to the more-detailed mip levels while the current frame renders using the SRV subresources. This approach seems a bit more complicated because I have to manage sub-resource transitions, but it avoids a copy.

    My question here is whether I need to specify the D3D12_RESOURCE_FLAG_ALLOW_SIMULTANEOUS_ACCESS bit on my resource, even though the transitions and accesses occur on different sub-resources. If so, do you know what kind of performance repercussions I could expect from this? Would I still be able to store my images in BCn formats, for example? Thanks much, Zach.
  5. ZachBethel

    Implicit State Promotion

    Gotcha, that's what I was afraid of. Thanks for asking around!
  6. ZachBethel

    Implicit State Promotion

    I figured I would give an update in case somebody else has similar confusion. I was able to convert most of my resource transition logic to use state promotion and decay. There are some caveats that are mentioned in the docs, but the implications weren't quite clear to me.

    Buffers and read-only images always decay to the common state after an ExecuteCommandLists call. This means you effectively don't need to track them across queues, which is really nice.

    Likewise, the first access of the above resource types is implicit and does not require a transition. This means that if you are using a copy queue, you don't need to do any barriers at all. However, any subsequent change of state after that first usage within an ExecuteCommandLists scope requires an explicit transition from the implicitly used state to the new state. For example, if I first use my buffer as a copy destination, copy to it, and then attempt to use it as a vertex buffer within the same command list batch, I have to transition it first from CopyDest -> VertexAndConstantBuffer. That part wasn't clear to me initially.

    Any image that is written to by the GPU, whether through UAV, RTV, or DSV access, does not participate in state promotion / decay unless it has the D3D12_RESOURCE_FLAG_ALLOW_SIMULTANEOUS_ACCESS bit. This makes sense, since the image is likely compressed and must be meticulously tracked.

    Anyway, like I said, this is all spelled out in the docs, but some of the bits weren't clear to me at first.
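    To make the rules above concrete, here is a toy C++ model of promotion and decay. It is only a sketch of the behavior as described in this post for a direct queue, not the D3D12 API: `State`, `Resource`, `MakeResource`, `Use`, and `EndExecute` are hypothetical names invented for illustration, the state list is heavily trimmed, and copy-queue decay is not modeled.

```cpp
#include <cassert>

// Hypothetical, trimmed-down stand-in for D3D12_RESOURCE_STATES.
enum class State { Common, CopyDest, CopySource, ShaderResource,
                   VertexAndConstantBuffer, RenderTarget };

// Hypothetical bookkeeping record for one resource.
struct Resource {
    bool isBuffer;
    bool simultaneousAccess;  // models ALLOW_SIMULTANEOUS_ACCESS
    State state;
    bool promoted;            // true if the current state came from promotion
};

inline Resource MakeResource(bool isBuffer, bool simultaneousAccess) {
    return Resource{isBuffer, simultaneousAccess, State::Common, false};
}

inline bool IsReadOnly(State s) {
    return s == State::CopySource || s == State::ShaderResource ||
           s == State::VertexAndConstantBuffer;
}

// Promotion only happens out of Common. Buffers and simultaneous-access
// textures may promote to any state; other textures only to copy/SRV states.
inline bool CanPromote(const Resource& r, State target) {
    if (r.state != State::Common) return false;
    if (r.isBuffer || r.simultaneousAccess) return true;
    return target == State::ShaderResource || target == State::CopyDest ||
           target == State::CopySource;
}

// Records a use of the resource; returns true if an explicit barrier is needed.
inline bool Use(Resource& r, State needed) {
    if (r.state == needed) return false;
    if (CanPromote(r, needed)) {
        r.state = needed;
        r.promoted = true;
        return false;
    }
    r.state = needed;   // explicit transition
    r.promoted = false;
    return true;
}

// Models the end of an ExecuteCommandLists call on a direct queue.
inline void EndExecute(Resource& r) {
    const bool decays = r.isBuffer || r.simultaneousAccess ||
                        (r.promoted && IsReadOnly(r.state));
    if (decays) {
        r.state = State::Common;
        r.promoted = false;
    }
}
```

    In this model, a buffer's first use as CopyDest needs no barrier, the later switch to VertexAndConstantBuffer in the same batch does, and the buffer is back in Common once the batch ends. A plain render target never promotes and never decays.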
  7. ZachBethel

    Implicit State Promotion

    Would you guys happen to know whether something like this is available on Vulkan as well? I know the pipeline barrier model is a bit different. I'm trying to determine whether I can forgo some of the barrier hand-off logic for Graphics -> Copy -> Graphics scenarios. My concern would be whether I need to explicitly transition to a copy dest layout.
  8. Hey all, I'm trying to understand implicit state promotion in DirectX 12, as well as its intended use case. https://msdn.microsoft.com/en-us/library/windows/desktop/dn899226(v=vs.85).aspx#implicit_state_transitions I'm attempting to use copy queues and finding that there's a lot of bookkeeping I need to do: first "pre-transition" from my Graphics / Compute read-only state (P-SRV | NP-SRV) to Common, then Common to Copy Dest, perform the copy on the copy command list, transition back to Common, and then find another graphics command list to do the final Common -> (P-SRV | NP-SRV) again. With state promotion, it would seem that I can nix the Common -> Copy Dest and Copy Dest -> Common bits on the copy queue easily enough, but I'm curious whether I could just keep all of my "read-only" buffers and images in the common state and effectively not perform any barriers at all. This seems to be encouraged by the docs, but I'm not sure I fully understand the implications. Does this sound right? Thanks.
  9. ZachBethel


    That's what I thought. The solution I went with is to keep a map of image descriptor hash to resource allocation info. It cut down on the cost by 3x. Thanks!
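    For anyone curious, a minimal sketch of that kind of cache might look like the following. Everything here is a hypothetical stand-in: `ImageDesc` replaces D3D12_RESOURCE_DESC with a handful of fields, `AllocationInfo` mirrors the size/alignment pair, and the slow device query is injected as a callback so the sketch compiles without the D3D12 headers. Note that it keys the map on the full descriptor (with a hash functor) rather than on the raw hash, so hash collisions cannot return the wrong entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified stand-in for D3D12_RESOURCE_DESC.
struct ImageDesc {
    std::uint32_t width;
    std::uint32_t height;
    std::uint16_t mipLevels;
    std::uint32_t format;  // stand-in for DXGI_FORMAT
    std::uint32_t flags;   // stand-in for D3D12_RESOURCE_FLAGS

    bool operator==(const ImageDesc& o) const {
        return width == o.width && height == o.height &&
               mipLevels == o.mipLevels && format == o.format &&
               flags == o.flags;
    }
};

// Mirrors the size/alignment pair returned by the device query.
struct AllocationInfo {
    std::uint64_t sizeInBytes;
    std::uint64_t alignment;
};

struct ImageDescHash {
    std::size_t operator()(const ImageDesc& d) const {
        std::size_t h = 0;
        // Simple hash-combine over every field that participates in equality.
        auto mix = [&h](std::uint64_t v) {
            h ^= std::hash<std::uint64_t>{}(v) + 0x9e3779b9u + (h << 6) + (h >> 2);
        };
        mix(d.width);
        mix(d.height);
        mix(d.mipLevels);
        mix(d.format);
        mix(d.flags);
        return h;
    }
};

class AllocationInfoCache {
public:
    using QueryFn = std::function<AllocationInfo(const ImageDesc&)>;

    // `query` is the slow path, e.g. a wrapper around GetResourceAllocationInfo.
    explicit AllocationInfoCache(QueryFn query) : query_(std::move(query)) {
        assert(query_ && "query function must be callable");
    }

    // Returns the cached info, calling the slow query only on a miss.
    const AllocationInfo& Get(const ImageDesc& desc) {
        auto it = cache_.find(desc);
        if (it == cache_.end()) {
            it = cache_.emplace(desc, query_(desc)).first;
        }
        return it->second;
    }

    std::size_t Size() const { return cache_.size(); }

private:
    QueryFn query_;
    std::unordered_map<ImageDesc, AllocationInfo, ImageDescHash> cache_;
};
```

    Transient resources within a frame tend to reuse a small set of descriptors, so after the first frame nearly every lookup should hit the map.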
  10. Hey, I'm working on a placed resource system, and I need a way to determine the size and alignment of image resources before placing them on the heap. This is used for transient resources within a frame. The appropriate method on ID3D12Device is GetResourceAllocationInfo. Unfortunately, this method is quite slow and eats up a pretty significant chunk of time, way more than I would expect for just returning a size and alignment (I'm passing a single D3D12_RESOURCE_DESC each time). Is there a way I can conservatively estimate this value for certain texture resources (e.g. ones without mip chains)? Thanks.
  11. Yeah, I was mistaken. I believe I was confused by the fact that most hardware typically has a single hardware graphics queue. At any rate, the issue was that I was querying GetCompletedValue on my fences at a time when I thought all previous work would have completed. This was not the case.
  12. Hey all, I'm trying to debug some async compute synchronization issues. I've found that if I force all command lists to run through a single ID3D12CommandQueue instance, everything is fine. However, if I create two DIRECT queue instances, and feed my "compute" work into the second direct queue, I start seeing the issues again. I'm not fencing between the two queues at all because they are both direct. According to the docs, it seems as though command lists should serialize properly between the two instances of the direct queue because they are of the same queue class. Another note is that I am feeding command lists to the queues on an async thread, but it's the same thread for both queues, so the work should be serialized properly. Anything obvious I might be missing here? Thanks!
  13. I'm reading through the Microsoft docs trying to understand how to use aliasing barriers to alias resources properly.

    "Applications must activate a resource with an aliasing barrier on a command list, by passing the resource in D3D12_RESOURCE_ALIASING_BARRIER::pResourceAfter. pResourceBefore can be left NULL during an activation. All resources that share physical memory with the activated resource now become inactive or somewhat inactive, which includes overlapping placed and reserved resources."

    If I understand correctly, it's not necessary to actually provide the pResourceBefore for each overlapping resource, as the driver will iterate the pages and invalidate resources for you. This is the Simple Model. The Advanced Model is different:

    "Advanced Model: The active/inactive abstraction can be ignored and the following lower-level rules must be honored, instead: An aliasing barrier must be between two different GPU resource accesses of the same physical memory, as long as those accesses are within the same ExecuteCommandLists call. The first rendering operation to certain types of aliased resource must still be an initialization, just like the Simple Model."

    I'm confused because it looks like, in the Advanced Model, I'm expected to declare pResourceBefore for every resource that overlaps pResourceAfter (so I'd have to submit N aliasing barriers). Is the idea here that the driver can either do it for you (null pResourceBefore) or you can do it yourself (specify every overlapping resource instead)? That seems like the tradeoff here. It would be nice if I could just "activate" resources with AliasingBarrier(NULL, activatingResource) and not worry about tracking deactivations. Am I understanding the docs correctly? Thanks.
  14. ZachBethel

    A retrospective on the Infinity project

    Please tell me you're not going to be the only engineer on this. That just isn't working out for you, as brilliant as you are. ;)
  15. Is it valid to map a region of a readback resource while the GPU is simultaneously writing to a disjoint region of it? I've got a profiler subsystem with a single readback buffer that is N times the size of my query heap, for N frames. The debug SDK layer gives a warning that the subresource is mapped while the GPU is writing to it.
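    For reference, the layout in question can be sketched as below: each frame in flight owns one slice of the readback buffer, so the slice being read on the CPU and the slice being written by the GPU never overlap as long as they belong to different slots. This is only arithmetic and does not settle whether the mapping itself is valid. `ByteRange`, `ReadbackRangeForFrame`, and `Overlaps` are hypothetical helpers, and the default of 8 bytes per query assumes 64-bit timestamp results.

```cpp
#include <cassert>
#include <cstdint>

// Half-open byte range [begin, end) within the readback buffer.
struct ByteRange {
    std::uint64_t begin;
    std::uint64_t end;
};

// Returns the slice of the readback buffer owned by the given frame.
// The buffer is assumed to hold framesInFlight copies of the query heap.
inline ByteRange ReadbackRangeForFrame(std::uint64_t frameIndex,
                                       std::uint32_t framesInFlight,
                                       std::uint32_t queriesPerFrame,
                                       std::uint32_t bytesPerQuery = 8) {
    assert(framesInFlight > 0);
    const std::uint64_t sliceSize =
        std::uint64_t(queriesPerFrame) * bytesPerQuery;
    const std::uint64_t slot = frameIndex % framesInFlight;
    return ByteRange{slot * sliceSize, (slot + 1) * sliceSize};
}

// True if the two half-open ranges share at least one byte.
inline bool Overlaps(const ByteRange& a, const ByteRange& b) {
    return a.begin < b.end && b.begin < a.end;
}
```

    The range for the oldest completed frame is what would be passed as the read range when mapping, while the current frame's range is the destination offset for the query resolve.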