Effect vs Fixed Function Pipeline, Massive Performance Drop

SKGenius    122
I have a fairly simple scene and a fairly simple shader. Rendering without the shader I get around 1100 fps; with the shader it drops to around 350 fps.
I narrowed the problem down to the SetValue calls: for each different material that needs to be rendered, I set the diffuse and specular colours and so on, as well as the texture.

If I only set the texture for each object and leave the material properties alone then it goes at around 1200 fps. I would put it purely down to the SetValue calls but I'm sure I had a similar setup a while ago that ran at about the same speed as the FFP. I also only call SetValue if the new material property is different from the previous, but that doesn't actually make much difference compared to blindly setting the values every time.
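
For reference, the "only set it if it changed" filter can be sketched roughly like this. It is a minimal sketch, not my actual code: `CachedColorParam` and the `setter` callback are made-up names, and the callback stands in for whatever really pushes the value to the effect (a SetValue call, for instance).

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <utility>

using Color = std::array<float, 4>;  // r, g, b, a

// Hypothetical wrapper (not part of D3DX): forwards a colour to the real
// setter only when it differs from the last value that was forwarded.
class CachedColorParam {
public:
    explicit CachedColorParam(std::function<void(const Color&)> setter)
        : setter_(std::move(setter)) {
        assert(setter_);  // an empty std::function would throw when called
    }

    // Returns true if the value was forwarded, false if it was filtered out.
    // The comparison is an exact float compare, which is fine for material
    // constants that are copied around rather than recomputed.
    bool set(const Color& value) {
        if (hasLast_ && last_ == value) return false;
        setter_(value);
        last_ = value;
        hasLast_ = true;
        return true;
    }

private:
    std::function<void(const Color&)> setter_;
    Color last_{};
    bool hasLast_ = false;
};
```

A filter like this only removes back-to-back duplicates, so how much it saves depends entirely on the order the materials arrive in.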

Is there something I could have missed that would cause this huge drop when setting the material colours on the effect?

Washu    7829
... So you're complaining that your frame time went from about 0.9ms a frame to about 2.9ms a frame?

Hint: frame rate is the reciprocal of frame time, so it is not linear in the actual time taken.

Come back when you're hitting 60 frames a second after running at 100.
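
To put numbers on that, here is a small sketch of the conversion. The only fact it relies on is that one frame takes 1000 / fps milliseconds; the function names are made up for illustration.

```cpp
#include <cassert>

// Time one frame takes, in milliseconds, at the given frame rate.
double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Extra milliseconds per frame when the rate falls from fpsBefore to fpsAfter.
double addedCostMs(double fpsBefore, double fpsAfter) {
    return frameTimeMs(fpsAfter) - frameTimeMs(fpsBefore);
}
```

For example, `addedCostMs(1100, 350)` comes out at roughly 1.95 ms, while `addedCostMs(100, 60)` is roughly 6.67 ms, even though the second drop sounds far smaller when quoted in fps.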

SKGenius    122
3x slower for the same amount of work still isn't fantastic (1.9ms frame difference). If I render more stuff, then when the FFP gets 280 fps the shader gets 120 fps (4.75ms frame difference).
Working up from that, they seem to converge at around 150 fps, where the FFP and the shader perform the same.

So what's actually going on in there?

Also, at the worst case point (280fps through FFP) if I throw in a pile of arbitrary matrix multiplications the framerate drops to 98 fps, but the shader can only manage 62 fps. So there's your 100-60 fps drop.

Washu    7829
Well, without seeing what you're doing here are a few suggestions:
1. Batch
2. Make sure you're using the handle functions for your variables, and not the ones that take the name. Cache the handle as well.
3. BATCH
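
On point 2, the idea is to resolve each parameter name once at load time and reuse the handle every frame. The sketch below shows the pattern with a stand-in effect class rather than the real D3DX interface: `MockEffect`, `MaterialHandles` and the parameter names are all made up, and the mock only counts how often a name lookup happens.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

using Handle = std::size_t;

// Stand-in for an effect's parameter table. This is NOT the D3DX API.
struct MockEffect {
    std::unordered_map<std::string, Handle> handleByName;
    std::vector<float> values;
    int nameLookups = 0;

    Handle addParameter(const std::string& name) {
        values.push_back(0.0f);
        handleByName[name] = values.size() - 1;
        return values.size() - 1;
    }

    // Slow path: hashes and compares a string on every call.
    Handle getHandleByName(const std::string& name) {
        ++nameLookups;
        return handleByName.at(name);  // throws std::out_of_range if unknown
    }

    // Fast path: direct index, no string work.
    void setFloat(Handle h, float v) {
        assert(h < values.size());
        values[h] = v;
    }
};

// Handles are resolved once, when the material system is created.
struct MaterialHandles {
    Handle specularPower;
    Handle opacity;

    explicit MaterialHandles(MockEffect& fx)
        : specularPower(fx.getHandleByName("SpecularPower")),
          opacity(fx.getHandleByName("Opacity")) {}
};
```

With the handles cached, the per-frame code never touches a string, no matter how many times the values are set.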

SKGenius    122
These are just static vertex and index buffers; they can't be batched any better than they already are. And I'm already using the handles for the parameters.
In most cases I don't think I'll actually have any problems, but in the situation I highlighted above, where you're not doing vast amounts of rendering but still quite a bit, it can really eat into your frame time.

It seems bizarre that there should be such a difference, at any level. The only thing I can think of is that the D3DXEffect needs to do some extra synchronisation which becomes obvious at higher frame rates, but I'm hoping that somebody with some in-depth knowledge of Direct3D can shed some extra light on the problem.

Semei    123
[quote name='Washu' timestamp='1300225684' post='4786215']
Well, without seeing what you're doing here are a few suggestions:
1. Batch
2. Make sure you're using the handle functions for your variables, and not the ones that take the name. Cache the handle as well.
3. BATCH
[/quote]

Don't batch. Like, EVER. Unless you have something like 5000 draw calls or 20000 SetX calls, or the batching is trivial to do. Just write simple, clean code that isn't too fanatical about optimisation but doesn't take a "not giving a damn" attitude either, and strike a nice balance.

How many objects are you drawing? What's your GPU? Have you updated your drivers? Can I see your draw code? That may give you the correct answer. It may also be your shader: if you are moving from fixed function to shaders, I'd guess your HLSL and general shader experience isn't extensive yet, so you could have made a small mistake or skipped an obvious optimization here and there, and that could account for the performance difference versus fixed function.

SKGenius    122
Things are never quite as they appear it seems. Direct3D must do a lot more in the background than I would have previously thought.

I have 77 vertex+index buffers for 77 different models on the screen, and each buffer is subdivided into about 3-5 different groups which have different material properties. They were grouped together by buffer to reduce switching between them (which, from what I gathered on t'internet and other quick tests, should be best), but this meant that the material properties needed to be changed more frequently.

I grouped the 567 subdivisions together by material instead, and out of curiosity made sure that the same vertex buffer wasn't used twice in a row. So with 567 changes to the current vertex and index streams but only 9 or so changes to the material, the shader performed almost identically to the fixed function pipeline.

The difference between the two groupings was negligible through the FFP, going from the 1100 fps I mentioned above to 1180-1200 fps (0.07ms difference), but the difference when using shaders is like a smack in the face with a wet fish.
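
The regrouping itself is nothing clever. As a rough sketch (the `DrawItem` struct and function names here are invented for illustration, not lifted from my renderer), it amounts to sorting the subsets by material and counting how often each kind of state switches between consecutive draws:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct DrawItem {
    int bufferId;    // which vertex+index buffer pair the subset lives in
    int materialId;  // which set of material constants it needs
};

inline int byBuffer(const DrawItem& d) { return d.bufferId; }
inline int byMaterial(const DrawItem& d) { return d.materialId; }

// Counts how many times key(item) changes between consecutive items.
// The first item always counts as one switch.
template <typename Key>
std::size_t countSwitches(const std::vector<DrawItem>& items, Key key) {
    std::size_t switches = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || key(items[i]) != key(items[i - 1])) ++switches;
    }
    return switches;
}

// Groups subsets by material; stable so buffer order is kept within a group.
void sortByMaterial(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.materialId < b.materialId;
                     });
}
```

Sorting by material trades material switches for buffer switches, which is exactly the trade that turned out to matter for the effect path.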

EDIT: When changing effect parameters mid-pass you essentially have to stop the effect, make the changes and then restart the effect. Calling CommitChanges costs almost the same as calling EndPass and then BeginPass.
Whenever you change shaders they are validated on the device before they are actually set, so by changing the parameters you may be indirectly causing your shader to be validated again. Direct3D may be able to bypass that validation for its own fixed function pipeline, so it doesn't pay this additional cost when changing parameters.

This is my current working theory as to the difference between shaders and the FFP in this case.
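
In loop form, the pattern looks something like the sketch below. `MockEffect` borrows the BeginPass / CommitChanges / EndPass method names from ID3DXEffect but is only a counting stand-in, and `SetMaterial` is a hypothetical helper representing the per-material SetValue calls; none of this is the real interface.

```cpp
#include <cassert>
#include <vector>

// Mock that borrows method names from ID3DXEffect but only counts calls.
struct MockEffect {
    bool inPass = false;
    int commits = 0;
    int boundMaterial = -1;

    void BeginPass(unsigned /*pass*/) { assert(!inPass); inPass = true; }
    void EndPass() { assert(inPass); inPass = false; }
    // Hypothetical helper standing in for the per-material SetValue calls.
    void SetMaterial(int id) { boundMaterial = id; }
    void CommitChanges() { assert(inPass); ++commits; }
};

struct Subset {
    int materialId;  // assumed to be >= 0
    int bufferId;
};

// Draws all subsets inside a single pass and returns the number of draws.
// Works for any order, but commits are minimised when the subsets are
// already sorted by material.
int drawAll(MockEffect& fx, const std::vector<Subset>& subsets) {
    int draws = 0;
    fx.BeginPass(0);
    for (const Subset& s : subsets) {
        if (s.materialId != fx.boundMaterial) {
            fx.SetMaterial(s.materialId);
            fx.CommitChanges();  // only paid when the material changes
        }
        ++draws;  // real code would bind s.bufferId and issue the draw here
    }
    fx.EndPass();
    return draws;
}
```

If the theory holds, the number that matters is how many times that inner branch is taken per frame, not the number of draws.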

Washu    7829
[quote name='SKGenius' timestamp='1300306091' post='4786726']
Things are never quite as they appear it seems. Direct3D must do a lot more in the background than I would have previously thought.

I have 77 vertex+index buffers for 77 different models on the screen, and each buffer is subdivided into about 3-5 different groups which have different material properties. They were grouped together by buffer to reduce switching between them (which, from what I gathered on t'internet and other quick tests, should be best), but this meant that the material properties needed to be changed more frequently.

I grouped the 567 subdivisions together by material instead, and out of curiosity made sure that the same vertex buffer wasn't used twice in a row. So with 567 changes to the current vertex and index streams but only 9 or so changes to the material, the shader performed almost identically to the fixed function pipeline.

The difference between the two groupings was negligible through the FFP, going from the 1100 fps I mentioned above to 1180-1200 fps (0.07ms difference), but the difference when using shaders is like a smack in the face with a wet fish.

EDIT: When changing effect parameters mid-pass you essentially have to stop the effect, make the changes and then restart the effect. Calling CommitChanges costs almost the same as calling EndPass and then BeginPass.
Whenever you change shaders they are validated on the device before they are actually set, so by changing the parameters you may be indirectly causing your shader to be validated again. Direct3D may be able to bypass that validation for its own fixed function pipeline, so it doesn't pay this additional cost when changing parameters.

This is my current working theory as to the difference between shaders and the FFP in this case.
[/quote]

Changing shader parameters typically requires a pipeline flush on the GPU, and that's slow. So does changing the shader itself. With the fixed function pipeline there is rarely a need for a "shader constant" change or the like, because the FFP shader is one shader with most information supplied up front. Tom had a great [url="http://tomsdxfaq.blogspot.com/"]DX FAQ[/url]; it's a bit behind the times now, but his suggested batching order is still quite accurate:


- Pixel shader changes.
- Pixel shader constant changes.
- Vertex shader changes.
- Vertex shader constant changes.
- Render target changes.
- Vertex format changes (SetVertexDeclaration).
- Sampler state changes.
- Vertex and index buffer changes (without changing the format).
- Texture changes.
- Misc. render state changes (alpha-blend mode, etc).
- DrawPrim calls.
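
One common way to apply an ordering like this is to pack the state identifiers into a single integer sort key, with the state you least want to change in the highest bits. The sketch below assumes the list is ordered most expensive first and uses only four of its entries; `RenderItem` and its field names are hypothetical, and the 16-bit id width is an arbitrary choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-draw record; each id is an index into some state table.
struct RenderItem {
    std::uint16_t pixelShaderId;
    std::uint16_t pixelConstantsId;
    std::uint16_t vertexShaderId;
    std::uint16_t vertexBufferId;
};

// Packs the ids so that the state assumed to be most expensive to change
// occupies the most significant bits of the key.
std::uint64_t makeSortKey(const RenderItem& r) {
    return (static_cast<std::uint64_t>(r.pixelShaderId) << 48) |
           (static_cast<std::uint64_t>(r.pixelConstantsId) << 32) |
           (static_cast<std::uint64_t>(r.vertexShaderId) << 16) |
           static_cast<std::uint64_t>(r.vertexBufferId);
}

// Sorts draws so that expensive state changes happen as rarely as possible.
void sortByStateCost(std::vector<RenderItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const RenderItem& a, const RenderItem& b) {
                  return makeSortKey(a) < makeSortKey(b);
              });
}
```

After sorting, all draws sharing a pixel shader are contiguous, and within those, all draws sharing the same pixel shader constants, and so on down the key.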
