HLSL - can it match FFP speedwise?

Started by Mercenarey
6 comments, last by Sneftel 18 years, 4 months ago
I am working on my shaders atm, and after some tests I noticed a significant dip in performance with my shader compared to the Fixed-Function Pipeline (FFP). I lost around 15-20 FPS (from 192 to 175).

Even with the most basic shader possible, one that just passes the vertex through in the VS and outputs black in the PS, I couldn't reach the FFP.

I even tried running the FFP at the highest filtering (Anisotropic) and the shader at the lowest (Linear), and still the closest I can ever get, no matter what I try, is the shader running about 10% slower than the FFP.

Am I doing something wrong? Can I specify some flags to optimize the compilation of the shader? (I have Flags in the D3DXCreateEffectFromFile() call set to 0, since I don't see any optimization flags.)

Can you match or beat the FFP? And how do you do it?
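For reference, here is roughly what the compile call in question looks like with the D3DX9 flags spelled out - a minimal sketch only, assuming the D3DX9 effect framework; the file name and the commented-out flag choices are illustrative, not recommendations:

    ID3DXEffect* pEffect = NULL;
    ID3DXBuffer* pErrors = NULL;

    // Flags = 0 already leaves the D3DX shader optimizer enabled; later SDK
    // releases also expose D3DXSHADER_OPTIMIZATION_LEVEL0..3 for explicit control.
    DWORD flags = 0;
    // flags |= D3DXSHADER_PARTIALPRECISION;   // allow reduced precision in the PS
    // flags |= D3DXSHADER_SKIPVALIDATION;     // skip validation for known-good shaders
    // Avoid D3DXSHADER_DEBUG / D3DXSHADER_SKIPOPTIMIZATION in release builds.

    HRESULT hr = D3DXCreateEffectFromFile(
        pDevice,          // IDirect3DDevice9*
        "myEffect.fx",    // hypothetical file name
        NULL, NULL,       // no macros, no include handler
        flags,
        NULL,             // no effect pool
        &pEffect,
        &pErrors);

    if (FAILED(hr) && pErrors != NULL)
        OutputDebugStringA((const char*)pErrors->GetBufferPointer());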
Quote: Calvin: "I am only polite because I don't know enough foul language."
Quote: Original post by superpig: "I think the reason your rating has dropped so much, Mercenarey, is that you come across as an arrogant asshole."

If your shaders are trivial, then you're going to be bottlenecked by graphics driver overhead, not shader processing. What you're seeing indicates that the driver is simply better optimized in the fixed-function path than the shader path. If you were running more expensive shaders, you probably wouldn't notice any difference.

All of the modern GPUs now (e.g. ATI Radeon 9xxx and up) convert fixed-function commands into shaders in the driver anyway.

JB
Joshua Barczak | 3D Application Research Group | AMD
Remember that 192 to 175 FPS is a difference of about 500 microseconds per frame (~9%); it might not be that significant. That is easily within the realms of timer (in)accuracy.
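To make the arithmetic explicit (my own back-of-the-envelope, using the numbers from the post):

    // 192 FPS vs 175 FPS, expressed as time per frame:
    double ffpFrame    = 1.0 / 192.0;                       // ~5.21 ms
    double shaderFrame = 1.0 / 175.0;                       // ~5.71 ms
    double deltaUs     = (shaderFrame - ffpFrame) * 1.0e6;  // ~506 microseconds
    double relative    = (192.0 - 175.0) / 192.0;           // ~8.9%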

The second thing to appreciate is that the general style of programming differs with shaders; given that you have more control over what you can and can't do, there are many more possible combinations. You might want to look at how your application is using shaders to see if you can tune the app rather than looking at just the HLSL.

Also, it's generally agreed (although hard facts are difficult to come by) that the IHVs have been implementing the FFP via shaders "under the covers" for a while now.

hth
Jack

Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

jbarcz1:
Your explanation sounds very plausible. I tested some more and discovered that a big part of the difference lay in setting the effect.

And my shader IS very simple for now: it just moves the vertex to projection space and uses a single light (directional, diffuse only).

So I am content with your explanation for now - at least until I make more advanced shaders I can test with, hehe.

jollyjeffers:
I don't buy that timer inaccuracy is the explanation. I tested many times, switching back and forth between FFP and shader, and the result was more or less the same every time.
If it were inaccuracy, why wouldn't some of the tests fall out to the shader's advantage?
Furthermore, I'm using the performance timer, which is very accurate.
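Assuming "performance timer" means QueryPerformanceCounter, a minimal per-frame measurement looks like this (RenderFrame is a placeholder; the averaging advice is my own suggestion):

    #include <windows.h>

    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);

    QueryPerformanceCounter(&t0);
    RenderFrame();   // placeholder for whatever draws and presents one frame
    QueryPerformanceCounter(&t1);

    double frameMs = 1000.0 * (double)(t1.QuadPart - t0.QuadPart)
                            / (double)freq.QuadPart;
    // Average frameMs over a few hundred frames before comparing FFP vs shader;
    // a single reading still jitters even with a high-resolution counter.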
Quote: Calvin: "I am only polite because I don't know enough foul language."
Quote: Original post by superpig: "I think the reason your rating has dropped so much, Mercenarey, is that you come across as an arrogant asshole."
Older cards (the GeForce 3 and 4, the GeForce FX and the Radeon 8500) had dedicated fixed-function hardware alongside the programmable shader hardware, and were often faster when using the fixed-function pipe than when running a shader of similar complexity. Newer cards, like the Radeon 9000 series and up and the GeForce 6 series and up, have no fixed-function hardware and convert fixed-function calls to shaders internally, so shaders should generally be just as efficient, and often more efficient, since in many cases they don't need to handle all the different fixed-function options.

The problem here is more likely to be in the way you are measuring performance, though, as pointed out above. To be sure you're measuring mostly GPU performance and not API or driver overhead, you should be using a moderately complex shader with a large amount of geometry in a single draw call, as in the sketch below.
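In D3D9 terms that means something like one heavy indexed draw per frame rather than many small ones - a sketch only, with placeholder vertex and triangle counts:

    // One heavy draw call so GPU work dominates the measurement instead of
    // API/driver overhead. numVertices / numTriangles are placeholders.
    UINT passes = 0;
    pEffect->Begin(&passes, 0);
    pEffect->BeginPass(0);

    pDevice->DrawIndexedPrimitive(
        D3DPT_TRIANGLELIST,
        0,              // BaseVertexIndex
        0,              // MinVertexIndex
        numVertices,    // e.g. several hundred thousand vertices
        0,              // StartIndex
        numTriangles);  // primitive count

    pEffect->EndPass();
    pEffect->End();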

Game Programming Blog: www.mattnewport.com/blog

Quote:Original post by Mercenarey
I am working on my shaders atm, and after some tests I noticed a significant dip in performance with my shader compared to the Fixed-Function Pipeline (FFP). I lost around 15-20 FPS (from 192 to 175).

Even with the most basic shader possible, one that just passes the vertex through in the VS and outputs black in the PS, I couldn't reach the FFP.

I even tried running the FFP at the highest filtering (Anisotropic) and the shader at the lowest (Linear), and still the closest I can ever get, no matter what I try, is the shader running about 10% slower than the FFP.

Am I doing something wrong? Can I specify some flags to optimize the compilation of the shader? (I have Flags in the D3DXCreateEffectFromFile() call set to 0, since I don't see any optimization flags.)

Can you match or beat the FFP? And how do you do it?


This really depends on the part. Modern video cards don't actually have a fixed-function unit - they are really running shaders under the covers. Now, there are a few hardware-ish features that aren't exposed in shaders and that make the fixed-function path a bit faster; they tend to do things like turn down precision and other tricks. More or less, however, it's the same.

EvilDecl81
Just something nobody mentioned: when using shaders, you have a small overhead (sometimes a big one, if you change shaders too often) caused by setting all the constants and the shaders themselves, whereas with the FFP you don't have that.
And even with newer cards where the FFP is converted into shaders, it's done by the driver (I assume), so it's probably a lot more optimized than setting your own shaders.
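Roughly the per-object work being described, using the effect framework (the handles and constant names here are made up for illustration):

    // Per-draw overhead the FFP path doesn't pay in your own code: every call
    // below goes through D3DX and the driver before the draw is issued.
    pEffect->SetTechnique(hTechnique);                     // hypothetical handle
    pEffect->SetMatrix(hWorldViewProj, &worldViewProj);    // hypothetical handle
    pEffect->SetVector(hLightDirection, &lightDirection);  // hypothetical handle

    UINT passes = 0;
    pEffect->Begin(&passes, 0);
    pEffect->BeginPass(0);
    // ... DrawIndexedPrimitive / DrawPrimitive ...
    pEffect->EndPass();
    pEffect->End();

    // Caching handles once with GetParameterByName/GetTechniqueByName and
    // sorting objects by technique keeps this overhead down.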
"I lost around 15-20 FPS" is a meaningless statistic, because frames per second is not an important measurement for performance. What is an important measurement for performance is time per frame. And as JJ pointed out, 500 microseconds is an insignificant performance difference. For comparison, an extra 500 microseconds would change a framerate of 60 FPS to a framerate of 58 FPS.

This topic is closed to new replies.
