


pcmaster last won the day on October 9

pcmaster had the most liked content!

Community Reputation

1109 Excellent


About pcmaster


Recent Profile Visitors

9922 profile views
  1. Why not simply try it out? That i7 (Acer) will be immensely faster than the Core 2 Duo, which is good for compilation. It also has more RAM.
  2. Hi all, is it only me? I can't even copy the text of your sources correctly from Firefox x64 on Windows 7 into Notepad++. E.g. when I copy m_Effect.CurrentTechnique = m_Effect.Techniques["Textured"]; I get some weird characters: the "r" in "Textured" isn't ASCII 0x72 but the bytes 0xEF 0xBB 0xBF 0x72 instead, i.e. what looks like a UTF-8 BOM (EF BB BF) followed by the "r". The same happens with the "d" in the same word. Furthermore, when I hit Ctrl+F and search this page for "Textured", it finds neither the occurrence in the C++ code nor the one in the HLSL. Anyway, do the other calls work correctly if you comment out the query for TerrainTypeSampler? For example, does this one work fine? m_Effect.Parameters["GrassSampler"].SetValue(m_Grass); May I suggest you save all your sources as plain ASCII (not UTF-8), then recompile and re-run? EDIT: Copying stopped yielding weird characters after I refreshed the page, and my Firefox is fine again. WTF have I just experienced?
  3. Interestingly enough, I haven't found any proof that ID3D12CommandQueue is thread-safe, so I wouldn't count on it (links, anyone?). Put a mutex around it and you're safe. You won't submit so many command lists during a frame that this would block you (that goes for everything I wrote: if it's only a few dozen a frame, fǜck it and use a mutex; if it's thousands, go lock-less). Optimise only after measuring. Don't optimise where there's almost nothing to gain. Also, submitting from several threads only makes sense if you don't care about the order of the command lists. If they depend on one another and form a directed graph, you'll have to order them yourself, but you're surely aware of that. I didn't know you wouldn't call your Initialize every frame. If it's only at app start, cool.
  4. Maybe I'm combining too many things; if it's still unclear, we can break it down a bit.
  5. Think about it - nicely looping over a few hundred little pointers every frame on the CPU... that's NOTHING compared to synchronising with the GPU. Once a fence after a frame is passed, each and every resource the GPU touched can be recycled safely. Obviously. In the meantime, record commands into an alternate set of buffers and with different resources (ping-pong, double-buffer, round-robin, ...) as discussed, the pseudo-code you posted is fine. You could have a pool of upload buffers. The thread asks the pool and gets a fresh and safe resource to work with. It gets 'marked' with the current frame counter when handed to the caller. Once its frame's fence signal has been seen some time in the future, all resources with that frame's counter (or older) are good to be reused. You don't even have to "return" them back to the pool. The only time the pool would BLOCK the caller is if it ran out of resources - but that should never happen if you inflate it enough. You won't need to iterate anything if you think about it. The pool is more a ring-buffer, really. The same mechanism can be used for constant buffers if you assemble them somehow dynamically.
  6. Does the upload buffer need to be released ASAP? Short on memory? Why not release it in 1-2 frames, together with everything else from the past frame(s)? Less sync, better life, do yourself a favour :)
  7. Why would a SubSystem want to wait for a command list? I smell you're going to read something back immediately? :)
  8. Simply don't make yourself synchronise the CPU threads with the GPU at all! Only the one 'main' thread should do that. Instead, as mentioned by MJP, double- (or triple-) buffer your command lists (and allocators). It isn't such a terrible amount of memory that you'd have to "ration" it. Your subsystems can then assume that the resources they are handed are safe to CPU-write and that the GPU isn't touching them. Also, do NOT use the same command allocator (or any allocator, for that matter) on two different threads. Recording commands is a pretty 'intense' operation and if several threads do it at once, they compete for the mutex (or at least an atomic) that's making the allocator thread-safe. Just don't do it. Hand each thread its own allocator (from a pool, or double-buffered at least). [OT] If you really need one allocator to serve multiple threads, then for your own hand-written allocators, don't enter the critical section for every tiny allocation. Give threads bigger chunks which they'll be using for some time, so they only occasionally contend for the 'mutex' when their chunk is depleted. Heh, now that I read what I wrote, you still end up with a kind of per-thread allocator anyway. ID3D12CommandAllocator isn't your hand-written allocator though; it will have a mutex inside. I reckon it's thread-unfriendly. [/OT]
  9. Yours looks flipped on the Y axis (but not on X). Posting source might help :)
  10. Nice, I didn't really know the exact usage of D3D12_RESOURCE_FLAG_ALLOW_SIMULTANEOUS_ACCESS, so sorry for the confusion :)
  11. Then yes, you'll have to insert two resource barriers:
        ResourceBarrier(defaultHeapTexture, STATE_PIXEL_SHADER_RESOURCE -> STATE_COPY_DEST)
        CopyTextureRegion(uploadHeapTexture -> defaultHeapTexture)
        ResourceBarrier(defaultHeapTexture, STATE_COPY_DEST -> STATE_PIXEL_SHADER_RESOURCE)
        Draw(using defaultHeapTexture)
      I think that this doesn't interfere with the CPU memcpy you'll do to the 'staging' texture in the upload heap.
  12. Somebody more versed in DX12 could shed some light on uploading to the default heap. I'm getting a bit lost. What will the CopyTextureRegion copy from, and to where?
  13. No, you can't. Unordered access means UAV (unordered access view) which maps to HLSL RWTexture2D, for example. Just like SRV (shader resource view) maps to HLSL Texture2D, for example. Neither has anything to do with the CPU. These states are about GPU-to-GPU synchronisation. Not CPU-to-GPU. CopyTextureRegion is also a GPU-to-GPU operation - the GPU will at some later unspecified time (when it gets to processing the "CopyTextureRegion" command) copy from some GPU-visible resource to some other GPU-visible resource. There is no CPU involved whatsoever. Uploading from CPU is done differently, you have to use the upload heaps.
  14. I don't think you want to use D3D12_RESOURCE_FLAG_ALLOW_SIMULTANEOUS_ACCESS; that's intended for multi-device or at least multi-queue scenarios. I think you have only a single queue, right? All command lists executed on that queue get serialised. Transitioning a resource to D3D12_RESOURCE_STATE_UNORDERED_ACCESS means that you'll be able to use it as a RWTexture/RWBuffer/etc. and the correct caches will be flushed for you, based on what the state was previously. Similarly with all the other transitions.
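
The upload-buffer pool from post 5 could be sketched roughly as below. This is only an illustration under stated assumptions, not code from the thread: `UploadBuffer`, `UploadRing` and `acquire` are made-up names, the struct merely stands in for a real upload-heap resource, and `completedFrame` is assumed to be the last frame number the main thread has seen signalled on its fence (in D3D12 that would presumably be read with ID3D12Fence::GetCompletedValue). Frame numbers are assumed to start at 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an upload-heap resource.
struct UploadBuffer {
    int id = 0;
    std::uint64_t lastUsedFrame = 0;  // frame in which it was last handed out
    bool everUsed = false;
};

// Ring of buffers, each tagged with the frame that last used it.
// Nothing is ever "returned": a buffer becomes reusable on its own once
// the fence for its frame has been seen.
class UploadRing {
public:
    explicit UploadRing(std::size_t count) : buffers_(count) {
        assert(count > 0);
        for (std::size_t i = 0; i < count; ++i) {
            buffers_[i].id = static_cast<int>(i);
        }
    }

    // Hands out the next buffer in the ring and stamps it with currentFrame.
    // Returns nullptr if the GPU may still be using that buffer; this is the
    // one case where a real pool would have to block or grow.
    UploadBuffer* acquire(std::uint64_t currentFrame,
                          std::uint64_t completedFrame) {
        UploadBuffer& candidate = buffers_[next_];
        if (candidate.everUsed && candidate.lastUsedFrame > completedFrame) {
            return nullptr;  // ring too small for the frames in flight
        }
        candidate.everUsed = true;
        candidate.lastUsedFrame = currentFrame;
        next_ = (next_ + 1) % buffers_.size();
        return &candidate;
    }

private:
    std::vector<UploadBuffer> buffers_;
    std::size_t next_ = 0;  // oldest buffer, i.e. the next one to hand out
};
```

Because the ring always hands out the oldest buffer first, there is nothing to iterate: one comparison against the completed frame number is enough. The sketch is single-threaded; if several threads share one ring, `acquire` would need a mutex or an atomic index.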
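
The [OT] idea in post 8 (threads grab big chunks under a lock and then sub-allocate from them without locking) might look something like this for a hand-written allocator. Again just a sketch under assumptions: `SharedArena`, `ThreadLocalAllocator`, `grabChunk` and `lockCount` are invented for illustration, allocations are plain byte offsets with no alignment handling, and none of this says anything about how ID3D12CommandAllocator works internally.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Shared arena: the only place that takes the lock. It hands out whole chunks.
class SharedArena {
public:
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

    SharedArena(std::size_t capacity, std::size_t chunkSize)
        : capacity_(capacity), chunkSize_(chunkSize) {
        assert(chunkSize > 0);
    }

    // Returns the start offset of a fresh chunk, or npos when exhausted.
    std::size_t grabChunk() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (offset_ + chunkSize_ > capacity_) {
            return npos;
        }
        const std::size_t start = offset_;
        offset_ += chunkSize_;
        ++lockCount_;  // counts successful chunk grabs, for demonstration only
        return start;
    }

    std::size_t chunkSize() const { return chunkSize_; }
    std::size_t lockCount() const { return lockCount_; }  // read when idle

private:
    std::mutex mutex_;
    std::size_t capacity_;
    std::size_t chunkSize_;
    std::size_t offset_ = 0;
    std::size_t lockCount_ = 0;
};

// One instance per thread: bumps an offset inside its current chunk and only
// goes back to the shared arena when that chunk is depleted.
class ThreadLocalAllocator {
public:
    explicit ThreadLocalAllocator(SharedArena& arena) : arena_(arena) {}

    // Returns an offset into the arena, or SharedArena::npos on failure.
    std::size_t allocate(std::size_t size) {
        if (size > arena_.chunkSize()) {
            return SharedArena::npos;  // does not fit into any chunk
        }
        if (!hasChunk_ || used_ + size > arena_.chunkSize()) {
            const std::size_t start = arena_.grabChunk();  // the rare locked path
            if (start == SharedArena::npos) {
                return SharedArena::npos;
            }
            chunkStart_ = start;
            used_ = 0;
            hasChunk_ = true;
        }
        const std::size_t result = chunkStart_ + used_;
        used_ += size;
        return result;
    }

private:
    SharedArena& arena_;
    std::size_t chunkStart_ = 0;
    std::size_t used_ = 0;
    bool hasChunk_ = false;
};
```

With a 256-byte chunk, two 100-byte allocations cost a single lock acquisition and a third one triggers the second. The leftover tail of each chunk is simply wasted, which is the price of contending less often.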