Hunting FPS

I'm doing some performance checks and have hit an irritating problem. I make one draw call with 30,000 vertices and about the same number of triangles (a landscape). I get a nice 100+ FPS if the object is off screen, but it drops to less than 40 FPS if I look right at it. This should indicate that the pixel shader is my application's bottleneck, but it can't be, because if I use a pixel shader with only one instruction that forwards a color I get almost the same FPS. What's left is the fixed-function rasterizer and z-testing, which I can't control, so what can I do? I get better, but still not good, FPS if I turn z-testing off. What could I be doing wrong? If I use a different object made of a lot of random triangles in a group that can overlap, the problem gets even worse because of the overdraw. I hope someone understands my problem and has some suggestions for something to try. I'm using a 6800 card, C++, DirectX 9. Thanks.

It does sound like you may be fill-rate limited, though with your card I would expect better. Can you post the respective vertex and pixel shaders?
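To put rough numbers on the fill-rate idea, here is a minimal sketch of the arithmetic. The helper names `pixelsPerSecond` and `frameTimeMs` are made up for illustration, and the overdraw factor is an assumption you would have to estimate for your own scene; nothing here measures the actual hardware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pixels shaded per second when the object covers
// the whole back buffer. 'overdraw' is the assumed average number of
// times each pixel is written per frame (1.0 means no overdraw).
std::uint64_t pixelsPerSecond(std::uint32_t width, std::uint32_t height,
                              double fps, double overdraw) {
    assert(fps > 0.0 && overdraw > 0.0);
    return static_cast<std::uint64_t>(
        width * static_cast<double>(height) * fps * overdraw);
}

// Hypothetical helper: frame time in milliseconds for a given frame rate.
double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}
```

For the numbers in this thread, `pixelsPerSecond(1024, 768, 40.0, 1.0)` comes to 31,457,280 pixels per second for one full-screen layer, and dropping from 100 FPS to 40 FPS means the frame time goes from 10 ms to 25 ms. Multiply by your estimated overdraw and compare the result with the fill rate your card is rated for.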

Another thing to consider is that, IIRC, DirectX will cull triangles that are outside the view frustum, which could account for the speedup with fewer triangles. This would imply a bottleneck in vertex processing. How do you create the device?

This is one of the shaders I use when the problem occurs:

vertexInOut VS1(vertexInOut IN)
{
    vertexInOut OUT;
    OUT.pos = mul( float4( , 1.0) , matWVP);
    OUT.color = IN.color;
    return OUT;
}

float4 PS1( vertexInOut IN): COLOR
{
    return IN.color;
}

As you can see, they don't do much at all.
And with only one draw call and little other overhead, I expect to get an FPS over 100.

What is IIRC?

And yes, the speed goes up if the triangles are outside the view frustum. I can't see how that could indicate a vertex shader bottleneck. The vertices still have to be transformed for DX to know whether they are in the view frustum or not, and to do that it must run the whole VS, am I right?
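To illustrate why the transform has to come first, here is a small CPU-side sketch of a per-vertex clip test. It assumes the row-vector convention of `mul(v, matWVP)` and the Direct3D clip volume (-w <= x <= w, -w <= y <= w, 0 <= z <= w). The types and function names are hypothetical, and this is not how the driver or hardware is actually implemented, just the same math written out.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
// 4x4 matrix indexed as m[row][column], used with row vectors.
using Mat4 = std::array<std::array<float, 4>, 4>;

// Row vector times matrix, the same product as mul(v, matWVP) in HLSL.
Vec4 mul(const Vec4& v, const Mat4& m) {
    Vec4 r{0.0f, 0.0f, 0.0f, 0.0f};
    for (int c = 0; c < 4; ++c)
        for (int k = 0; k < 4; ++k)
            r[c] += v[k] * m[k][c];
    return r;
}

// Direct3D clip volume test on a clip-space position.
bool insideClipVolume(const Vec4& p) {
    const float w = p[3];
    return -w <= p[0] && p[0] <= w &&
           -w <= p[1] && p[1] <= w &&
           0.0f <= p[2] && p[2] <= w;
}

// The test needs the clip-space position, so the object-space vertex
// must go through the world-view-projection transform before it can
// be classified as inside or outside.
bool vertexVisible(const Vec4& objectPos, const Mat4& wvp) {
    return insideClipVolume(mul(objectPos, wvp));
}
```

The point is that `insideClipVolume` only makes sense on the transformed position, so off-screen vertices cost the same vertex work as on-screen ones, unless the application rejects whole objects earlier with something coarser such as a bounding-volume test.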

The device is created with:
res 1024x768
16bit z-buffer
one backbuffer
no multisample
no v-sync
hardware vertex processing
