
DX12 Query GPU usage?


Hey Guys,


I have a very simple profiling system in my little DX12 engine which can visualize the time spent on the GPU for each task by using timestamp queries. This is a good way to tell whether we have GPU bubbles, or to identify suspiciously time-consuming passes. The problem is that timestamps can't tell us the GPU usage. For example, my 'fancy' postprocess pass may be bandwidth limited, or my compute shader may be register limited, and either could leave the GPU poorly utilized in a way that timestamps alone don't clearly show. So I'd really like to be able to visualize GPU usage, so that I can tell whether the GPU is fully saturated or not and optimize accordingly.
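For anyone following along, here is a rough sketch of the CPU-side bookkeeping behind that kind of timestamp profiler. It assumes the begin/end ticks for each pass have already been read back from a resolved timestamp query heap, and that the tick rate comes from `ID3D12CommandQueue::GetTimestampFrequency`. The `PassSample`, `PassTiming`, `TicksToMs` and `AnalyzeFrame` names are made up for illustration and are not part of D3D12.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One profiled pass: begin/end GPU timestamps in ticks, as read back from a
// resolved timestamp query buffer. (Hypothetical struct, not a D3D12 type.)
struct PassSample {
    std::string name;
    std::uint64_t beginTicks;
    std::uint64_t endTicks;
};

// Per-pass result. (Hypothetical struct.)
struct PassTiming {
    std::string name;
    double durationMs;   // time between the pass's two timestamps
    double gapBeforeMs;  // idle time since the previous pass ended (a "bubble")
};

// ticksPerSecond is assumed to be the value reported by
// ID3D12CommandQueue::GetTimestampFrequency for the queue being profiled.
inline double TicksToMs(std::uint64_t ticks, std::uint64_t ticksPerSecond) {
    return static_cast<double>(ticks) * 1000.0 /
           static_cast<double>(ticksPerSecond);
}

// Assumes samples are sorted by beginTicks and all come from a single queue.
std::vector<PassTiming> AnalyzeFrame(const std::vector<PassSample>& samples,
                                     std::uint64_t ticksPerSecond) {
    std::vector<PassTiming> out;
    out.reserve(samples.size());

    // Latest end timestamp seen so far; the first pass has no gap before it.
    std::uint64_t prevEnd = samples.empty() ? 0 : samples.front().beginTicks;

    for (const PassSample& s : samples) {
        // A gap exists only if this pass started after everything before it ended.
        const std::uint64_t gap =
            s.beginTicks > prevEnd ? s.beginTicks - prevEnd : 0;

        out.push_back({s.name,
                       TicksToMs(s.endTicks - s.beginTicks, ticksPerSecond),
                       TicksToMs(gap, ticksPerSecond)});

        if (s.endTicks > prevEnd) prevEnd = s.endTicks;
    }
    return out;
}
```

Note that this only reports how long each pass took and where the queue sat idle. It says nothing about whether the GPU was bandwidth bound or occupancy limited while a pass was running, which is exactly the gap described above.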


I also believe that being able to visualize GPU usage per task is a very important aid for placing async compute work wisely.


So it would be greatly appreciated if someone could enlighten me on this.






Thanks Hodgman,

Could you recommend some external profilers that are capable of showing all the metrics you mentioned? I've seen that on Xbox One there is the powerful PIX, which can do real-time monitoring and offline capture analysis, but that is only for Xbox One. For desktop I've heard about GPUView, RenderDoc and Nsight, but it seems they can't give all the info we discussed, so it would be greatly appreciated if you could name one or two profilers I can start with :-)



