Maddius

[DX10] Pix & Update Subresource


Hi all,

I have been trying to identify a bottleneck in my DirectX 10 application using PIX. The problem arises when the rendering resolution is set relatively high (greater than 1680x1050, for example). It shows up as a noticeable stutter that comes and goes while rendering the scene. The scene itself is not that complex: roughly 48 DIP calls, and all meshes are under 100 polys.

I have attached the PIX outputs in the hope that someone might have an idea. It seems to happen when UpdateSubresource is called, although which resource is being updated I am not sure. I have optimised my constant buffers on the theory that they were choking the shader, but that has not helped. I have also ensured that all textures are in .dds format with the relevant compression and mipmapping applied.

The issue is only noticeable when the rendering resolution is increased. I know my graphics card can handle those resolutions because I run games at them all the time, so maybe it is a fill rate / bandwidth issue? In terms of frame rate, it drops from 60 FPS to around 40-45 FPS.

Any help would be great,

Thanks in advance,

Maddius.


[Attached image: pixobjs.jpg]

It's not something I can show in an image; it's just a stuttering of frames, i.e. going from a solid 60 FPS down to 45 FPS and back again, when it should remain at a solid 60.

When the scene is rendering at 60 FPS, is the UpdateSubresource call you mentioned not being made?

Maybe your FPS drops because more pixels are being shaded (or shaded with more complex shaders)... (some images would help...)

I also advise you to install NVIDIA PerfHUD or ATI GPU PerfStudio, because they help you identify bottlenecks: you can see the vertex/pixel shader usage of each draw call, etc.

Edited by TiagoCosta

I have tried the ATI GPU PerfStudio, but whenever I connect to my application, all the performance monitoring tools are greyed out! Any ideas?

Thanks.

[quote]
Hi all,

I have been trying to identify a bottleneck in my DirectX 10 application using PIX. The problem arises when the rendering resolution is set relatively high (greater than 1680x1050, for example). It shows up as a noticeable stutter that comes and goes while rendering the scene. The scene itself is not that complex: roughly 48 DIP calls, and all meshes are under 100 polys.
[/quote]

Do you render with antialiasing? What GPU do you have? How much memory does it have?

[quote]
I have attached the PIX outputs in the hope that someone might have an idea. It seems to happen when UpdateSubresource is called, although which resource is being updated I am not sure.

The issue is only noticeable when the rendering resolution is increased. I know my graphics card can handle those resolutions because I run games at them all the time, so maybe it is a fill rate / bandwidth issue? In terms of frame rate, it drops from 60 FPS to around 40-45 FPS.
[/quote]

If UpdateSubresource is stalling, it means that the driver/Direct3D cannot buffer your update request and needs to update the resource in place. If the resource is still in use, it has to wait until the GPU is done processing it before overwriting its data.

The reason you don't see the issue at lower resolutions might be that the GPU finishes its work much faster. It does NOT mean your CPU or GPU can't handle the resolution; it just means that at first the GPU is idle waiting to receive commands, then the CPU is idle waiting for the GPU to 'release' resources, and then the GPU is idle again.





So a simple solution would be to use more resources (double- or quad-buffering them), calling your Map command explicitly and updating those. That should keep both CPU and GPU running with no stalls.
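
As a sketch of the explicit-Map part of that idea in D3D10 (PerObjectConstants, CreatePerObjectBuffer and UpdatePerObjectBuffer are illustrative names, and the struct layout has to mirror your cbuffer exactly):

#include <d3d10.h>
#include <d3dx10.h>  // D3DXMATRIX, D3DXVECTOR4
#include <cstring>   // memcpy

// CPU-side mirror of the cbuffer: 224 bytes, already a multiple of 16.
struct PerObjectConstants
{
    D3DXMATRIX  worldMatrix;
    D3DXMATRIX  worldViewProjMatrix;
    D3DXMATRIX  textureMatrix;
    D3DXVECTOR4 colourModifier;
    float       interpolation;
    float       alphaModulus;
    float       specularity;
    float       shininess;
};

// Create the buffer with dynamic usage so it can be mapped from the CPU.
ID3D10Buffer* CreatePerObjectBuffer(ID3D10Device* pDevice)
{
    D3D10_BUFFER_DESC desc = {};
    desc.ByteWidth      = sizeof(PerObjectConstants);
    desc.Usage          = D3D10_USAGE_DYNAMIC;
    desc.BindFlags      = D3D10_BIND_CONSTANT_BUFFER;
    desc.CPUAccessFlags = D3D10_CPU_ACCESS_WRITE;

    ID3D10Buffer* pBuffer = NULL;
    pDevice->CreateBuffer(&desc, NULL, &pBuffer);
    return pBuffer;
}

// WRITE_DISCARD lets the driver hand back fresh memory instead of
// waiting for the GPU to finish reading the old contents.
void UpdatePerObjectBuffer(ID3D10Buffer* pBuffer, const PerObjectConstants& data)
{
    void* pMapped = NULL;
    if (SUCCEEDED(pBuffer->Map(D3D10_MAP_WRITE_DISCARD, 0, &pMapped)))
    {
        memcpy(pMapped, &data, sizeof(data));
        pBuffer->Unmap();
    }
}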




Just my 2 cents ;)

I have attached information from DxDiag about my graphics hardware. I am also running two monitors, but that doesn't seem to make a difference. No antialiasing is used. I think my problem is understanding which resource is stalling; it seems to stall with some constant buffers. Is there a particular method for how they should be updated to improve performance, other than grouping them by frequency of use, putting them in order of use, and keeping their overall number to a minimum?

Seemingly, the stall occurs while setting this buffer for an object within the scene:


// GPU
cbuffer cbGlobalPerObject
{
    float4x4 gWorldMatrix;
    float4x4 gWorldViewProjMatrix;
    float4x4 gTextureMatrix;
    float4   gColourModifier;
    float    gInterpolation;
    float    gAlphaModulus;
    float    gSpecularity;
    float    gShininess;
};


In order to set these variables, I first get them by name when the shader is compiled, save them to variables, and then use those once per object to set the values within the constant buffer. Taking the world matrix as an example...


// CPU
// At Initialise

nShaderVars[pShaderDesc.mType].mWorldMatrixFX = nShaders[pShaderDesc.mType].AddMatrixVar("gWorldMatrix");

...

// When each object wants to set it.

nShaderVars[nCurrentPacket.mShaderType].mWorldMatrixFX->SetMatrix((float*)&pMatrix);


It's an odd stall because this constant buffer is set for every object that is a mesh and does not stall for them. How can I do the double or quad buffering? Could you provide an example of what you mean? I hope this information can push the issue forward; let me know if you need anything else.

Thanks.

=========

---------------
Display Devices
---------------
Card name: ATI Mobility Radeon HD 5600/5700 Series
Manufacturer: ATI Technologies Inc.
Chip type: ATI display adapter (0x68C1)
DAC type: Internal DAC(400MHz)
Device Key: Enum\PCI\VEN_1002&DEV_68C1&SUBSYS_1449103C&REV_00
Display Memory: 2773 MB
Dedicated Memory: 1014 MB
Shared Memory: 1758 MB
Current Mode: 1600 x 900 (32 bit) (60Hz)
Monitor Name: Generic PnP Monitor
Monitor Model: unknown
Monitor Id: LGD027A
Native Mode: 1600 x 900(p) (60.080Hz)
Output Type: Internal
Driver Name: aticfx64.dll,aticfx64.dll,aticfx64.dll,aticfx32,aticfx32,aticfx32,atiumd64.dll,atidxx64.dll,atidxx64.dll,atiumdag,atidxx32,atidxx32,atiumdva,atiumd6a.cap,atitmm64.dll
Driver File Version: 8.17.0010.1041 (English)
Driver Version: 8.762.0.0
DDI Version: 11
Driver Model: WDDM 1.1
Driver Attributes: Final Retail
Driver Date/Size: 8/4/2010 02:54:00, 598528 bytes
WHQL Logo'd: Yes
WHQL Date Stamp:


[quote]
I have tried the ATI GPU PerfStudio, but whenever I connect to my application, all the performance monitoring tools are greyed out! Any ideas?
[/quote]


You have to hit the pause button first, and then you can go into the frame debugger or frame analyzer.


[quote]
Card name: ATI Mobility Radeon HD 5600/5700 Series
Display Memory: 2773 MB
Dedicated Memory: 1014 MB
Shared Memory: 1758 MB
[...]
[/quote]

Looks like you have enough memory (I just wanted to be sure you weren't running some Intel onboard GPU with 128 MB or something).

[quote]
It's an odd stall because this constant buffer is set for every object that is a mesh and does not stall for them.
[/quote]

So you can imagine that D3D/the driver needs to buffer those updates and might run out of buffers if you reuse the same constant buffer all the time.

[quote]
How can I do the double or quad buffering? Could you provide an example of what you mean?
[/quote]

In the place where you create the constant buffers, you just create two (or four) of them, and every time you need to set a constant buffer, you take the next of those two or four.
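
A minimal sketch of that rotation, building on the dynamic-buffer helpers sketched earlier (kNumBuffers, g_buffers and AcquireNextBuffer are illustrative names):

// Rotate through a small pool of identical constant buffers so the CPU
// never rewrites a buffer the GPU may still be reading.
static const int kNumBuffers = 4;                 // double- or quad-buffer
ID3D10Buffer*    g_buffers[kNumBuffers] = { 0 };
int              g_nextBuffer = 0;

// At initialisation: create the whole pool up front.
void InitBufferPool(ID3D10Device* pDevice)
{
    for (int i = 0; i < kNumBuffers; ++i)
        g_buffers[i] = CreatePerObjectBuffer(pDevice); // from the earlier sketch
}

// Per object: grab the next buffer in the rotation to fill and bind.
ID3D10Buffer* AcquireNextBuffer()
{
    ID3D10Buffer* pBuffer = g_buffers[g_nextBuffer];
    g_nextBuffer = (g_nextBuffer + 1) % kNumBuffers;
    return pBuffer;
}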




You should also check whether all of the variables in the constant buffer really are per-object; some of them might be shared across several objects/draw calls, in which case you should split those into a separate constant buffer. This way you lower the pressure on the buffers that D3D/the driver keeps in the background.

It might also make sense to move some work to the shader. I see you have a world-view-projection matrix; you could split that into a per-frame view-projection matrix and a per-object world matrix. The world-matrix constant buffer would then only need updating when an object moves (it's a common case in games that most objects are static), and the view-projection matrix would only need one update per frame. Yes, you would spend some extra compute time on the GPU side, but 3 more dot products won't kill performance. It's also common that you need the world position as well as the final world-view-projection position, so you multiply by two matrices anyway; this would be a tiny change (instead of mul(float4(in.pos,1.f),worldviewproj) you'd use the previously calculated world position, mul(float4(worldpos,1.f),viewproj)) and it would have no additional cost, as sketched below.
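
A sketch of that split in HLSL, reusing the variable names from the cbuffer posted above (the cbPerFrame/cbPerObject names and the vertex-shader lines are illustrative):

// Shared, once-per-frame data moves out of the per-object buffer.
cbuffer cbPerFrame
{
    float4x4 gViewProjMatrix;   // updated once per frame
};

cbuffer cbPerObject
{
    float4x4 gWorldMatrix;      // updated only when the object moves
    float4x4 gTextureMatrix;
    float4   gColourModifier;
    float    gInterpolation;
    float    gAlphaModulus;
    float    gSpecularity;
    float    gShininess;
};

// In the vertex shader, the world position is usually needed anyway,
// so the extra multiply is essentially free:
float4 worldPos = mul(float4(input.pos, 1.0f), gWorldMatrix);
float4 clipPos  = mul(worldPos, gViewProjMatrix);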




cheers


[quote]
In the place where you create the constant buffers, you just create two (or four) of them, and every time you need to set a constant buffer, you take the next of those two or four.
[/quote]

I am following you in all ways apart from this sentence. I don't know what you mean by setting the constant buffers, because to set the variables within the constant buffers I just use the ID3D10EffectVariable* that I have for each variable. Is this not the way to do it? I have managed to get the profiling to work, so hopefully I will get to grips with that and provide some more insights.
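
For what it's worth, the effects framework does allow swapping the whole buffer behind a cbuffer instead of setting each variable: ID3D10Effect::GetConstantBufferByName returns an ID3D10EffectConstantBuffer, and its SetConstantBuffer method rebinds the underlying ID3D10Buffer. A hypothetical sketch combining that with the rotation idea above (the helper names come from the earlier sketches):

// Fill one of the rotated dynamic buffers directly, then attach it to
// the effect's cbuffer slot instead of calling Set* per variable.
void SetPerObjectConstants(ID3D10Effect* pEffect,
                           const PerObjectConstants& data)
{
    ID3D10Buffer* pBuffer = AcquireNextBuffer();  // next buffer in the pool
    UpdatePerObjectBuffer(pBuffer, data);         // Map/WRITE_DISCARD fill

    pEffect->GetConstantBufferByName("cbGlobalPerObject")
           ->SetConstantBuffer(pBuffer);
}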

Cheers,

Maddius.
