Community Reputation

107 Neutral

About matmuze

  1. Hi folks, I am developing a 3D app with DX11 and I was wondering whether there is a nice little tool out there that would display a simple HUD listing all draw calls and their timings, kind of like the old PerfHUD or the live GPU profiler in Unreal: something very simple that could run continuously without hurting performance too much. Thanks in advance.
  2. OK, it seems that you guys did not get the general gist of my question, so I'm going to make it simpler. Take one million triangles, for instance; in both cases the scene is drawn in a single draw call. Case 1: MultiDrawInstanced with 100,000 draws of 10 triangles each. Case 2: DrawInstanced with 100,000 instances of 10 triangles each. Is case 1 as fast as case 2? My assumption is that because the draws are performed serially in case 1, there must be some synchronisation between consecutive draws, whereas in case 2 everything can be batched together, so there is no need to synchronise the batches.
  3. Hi folks, I am currently designing a new rendering pipeline and I have a serious question about the mechanism behind MultiDrawIndirect/ExecuteIndirect. Say that, for the purpose of efficient culling, I split all my geometry into batches and store these batches in a large buffer containing N batches. A batch simply contains a pointer to the triangle StartIndex and an integer defining the triangle count (there can be from 64 to 512 triangles per batch). Let's assume that I want to use MultiDrawIndirect to draw these N batches in a single draw call. The draw arguments buffer would then look something like this:

     ----- batch 0 -----
     vertex count: 89
     start index: 0
     ----- batch 1 -----
     vertex count: 394
     start index: 89
     ----- batch 2 -----
     vertex count: 145
     start index: 483
     ...

     Will the graphics hardware execute these draws serially, i.e. draw batch 0 first, then batch 1, and so on? If the batches are executed serially (batch 0 must be finished before batch 1 starts), then I reckon that, given the size of the batches, some of my SMs (streaming multiprocessors) will remain idle the whole time.

     That's my assumption at the moment, so an alternative I came up with is to use a single DrawIndirect (one vertex shader per batch) and use the tessellation shader to dynamically inject the geometry of the batches. That way all the batches would be executed simultaneously and my SM occupancy would stay at 100%. I know that the tessellation shader has a certain overhead, but maybe it would still be worth it if it meant full SM usage.

     Are there any GPU gurus out there who could tell me whether I am too far off from reality?
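The argument buffer described in the last post can be sketched on the CPU side as follows. This is only an illustration under assumptions: the draws are taken to be non-indexed, `DrawArgs` is a hypothetical stand-in that copies the four 32-bit fields of `D3D12_DRAW_ARGUMENTS` in the same order (it is not the real type), and `build_draw_args` is a made-up helper name. It shows how each batch's start location is the running sum of the preceding vertex counts, which is why the example values are 0, 89 and 483.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mirror of one non-indexed indirect draw record. The field
// order follows D3D12_DRAW_ARGUMENTS, but this is not the real API type.
struct DrawArgs {
    std::uint32_t vertex_count_per_instance;
    std::uint32_t instance_count;
    std::uint32_t start_vertex_location;
    std::uint32_t start_instance_location;
};

// Packs batches back to back: every batch starts where the previous one
// ended, so the start location is a prefix sum over the vertex counts.
std::vector<DrawArgs> build_draw_args(const std::vector<std::uint32_t>& vertex_counts) {
    std::vector<DrawArgs> args;
    args.reserve(vertex_counts.size());
    std::uint32_t next_start = 0;
    for (std::uint32_t count : vertex_counts) {
        // Guard against wrapping the 32-bit running offset.
        assert(count <= UINT32_MAX - next_start);
        // One instance per batch, no instance offset (assumption).
        args.push_back(DrawArgs{count, 1u, next_start, 0u});
        next_start += count;
    }
    return args;
}
```

Calling `build_draw_args({89u, 394u, 145u})` reproduces the three records listed in the post. The sketch only covers the buffer layout; it says nothing about whether the hardware overlaps the resulting draws.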