How can ONE triangle cause the fps to drop by half (150-75) ?

Started by
13 comments, last by VladR 20 years, 10 months ago
Hi,

While optimizing my game to run on different gfx cards, I noticed a really weird performance drop when I plugged an nVidia TNT1 8 MB card into my rig. After switching off almost everything, I was left with only the floor, which is rendered as a non-indexed triangle list from one vertex buffer. Without the floor, the fps is 150; with the floor (264 triangles), it falls to 75 fps.

After experimenting with different triangle counts I found the threshold: it is 36 triangles. With 36 or fewer triangles the fps is 150; with 37 or more it falls to 75 fps. What is also strange is that this is exactly half! It shouldn't be a fillrate problem, since whether I set the camera to show all 36 triangles or just half of them doesn't change the fps at all (not even by 0.1 fps). But when I add the 37th, it falls from 150 to 75 fps, i.e. exactly to half.

Here is the render code:
        
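	fm_pd3dDevice->SetStreamSource( 0, BigFloorVB_Handle, sizeof(FLOORCUSTOMVERTEX) );
	fm_pd3dDevice->SetVertexShader( D3DFVF_FLOORCUSTOMVERTEX );
	int rows = 4;
	int cols = 4;
	CGrid * G = &Grid[0];
	// Each grid cell is 2 triangles = 6 vertices in the non-indexed list.
	// Draw the first `cols` cells of each of the first `rows` rows: 4 * 4 * 2 = 32 triangles.
	for (int cRow = 0; cRow < rows; cRow++)
		fm_pd3dDevice->DrawPrimitive( D3DPT_TRIANGLELIST, cRow*G->Dimension_Columns*2*3, cols*2 );
	// Draw 5 more triangles from the next row, for 37 in total.
	fm_pd3dDevice->DrawPrimitive( D3DPT_TRIANGLELIST, rows*G->Dimension_Columns*2*3, 2*2+1 );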
        
As you can see, there are no state changes between the DrawPrimitive calls. The vertex buffer for the floor is created with D3DUSAGE_WRITEONLY, D3DFVF_FLOORCUSTOMVERTEX and D3DPOOL_DEFAULT, locked just once during startup, and never locked again during rendering.

Also please note that it simply doesn't make any difference whether I draw the triangles with one call to DrawPrimitive (as in the current version, since it is faster to send ~200 triangles to render than to play with their visibility) or with several calls (which is just for debugging, of course).

I also can't reproduce it in windowed mode (which would help in debugging, but unfortunately the DirectX Debug Window didn't show anything when rendering that last triangle); it shows only in fullscreen mode.

I will of course use indexed triangle lists (or strips, in the case of the floor) in the final build. But what on Earth causes that halving?

And I just noticed that this 150 fps is exactly my monitor refresh rate. Why does it ignore my DirectX VSync setting (off) in Display Properties and aTuner? I haven't noticed any variable related to VSync for window creation. (Though I'm gonna search these boards for refresh stuff; it gets asked a lot, I think.)

BTW, the rig is: AMD XP2100, 256 MB, gfx (TNT1-GF2MX, GF2GTS, GF3, GF4MX), Debug DX 8.1, Dets 30.00

EDIT: If I render 1600 additional triangles of my low-poly characters, the fps still stays at those magic 75 fps (the same as with only 37 floor triangles).

EDIT2: I plugged in the GF2MX this time to see what the situation is there. It can naturally render a much higher number of tris with no problems at all (compared to the TNT1). Now the GF2MX renders the whole floor at the full 150 fps (gotta search the docs for those presentation parameters - thanks Anon). What is strange, however, is that it stops at round fps values: either 150.0, 75.0 or 50.0 fps. There is obviously some correlation between fps and VSync that doesn't depend on the number of tris, because I get those 75.0 fps whether I render 10.000 tris or 4.000 tris per frame (it changes when walking from one floor quad onto another), still on the GF2MX. The drivers seem to play it their way here, not caring whether it's 10k or 5k polys...

Any idea is appreciated...

VladR
Avenger game

[edited by - VladR on May 26, 2003 4:59:44 PM]

VladR My 3rd person action RPG on GreenLight: http://steamcommunity.com/sharedfiles/filedetails/?id=92951596

The variable controlling whether you have vsync is in your D3D presentation parameters. It's called FullScreen_PresentationInterval. You want to use 'immediate' (D3DPRESENT_INTERVAL_IMMEDIATE) rather than 'one' (D3DPRESENT_INTERVAL_ONE).

Too lazy to look up an exact block of code for you. Hope it helps.
quote:Too lazy to look up an exact block of code for you. Hope it helps.
Sure it helped a lot. Spoonfeeding always helps.
Now I can watch my GF2MX range from 50 fps to 273 fps, which is sort of funny. I wonder what it will be like with the TNT1 (I'll try this evening, but now I gotta go to sleep since I'm getting up in a few hours for my day job).


VladR
Avenger game


It's not as big a performance hit as you would think; check out this article for the reason why: http://www.mvps.org/directx/articles/fps_versus_frame_time.htm

[edited by - jamessharpe on May 26, 2003 6:39:54 PM]
Now that was a link! I was reading it and slapping myself the whole time. How come I didn't realize this issue sooner?

Unfortunately this article can't explain why I get such a difference in FRAME TIME when rendering that one 37th triangle.

I still think it has to be something in the drivers, some batching stuff, i.e. no matter how many triangles within a given boundary I render, that particular frame will still take, for example, 1 ms (with 37 OR even 50 tris rendered).

EDIT: That frame time linearity could explain why I am getting exactly 75.0 or 50.0 or 25.0 fps, right?

Do the drivers work that way? Or am I asking the wrong question?

VladR
Avenger game

[edited by - VladR on May 26, 2003 7:12:12 PM]


FILLRATE. Drawing a huge tri on an old card is a big issue.

Brian J
When computing FRAME TIMES you must take the time BEFORE you call the flip, because once you include the time spent waiting for the flip, then of course both FPS and FRAME TIME can only move in steps tied to the refresh rate ... i.e. 150, 75, 50, 37.5, 30, 25 ...

If you take the time before the flip, then the difference between 36 and 37 triangles should be reasonable ... please check and post the numbers if it is not ... as then there is a real difference besides just "it crossed the threshold" ...
quote:FILLRATE. Drawing a huge tri on an old card is a big issue.
No, it's not fillrate. And those tris are not huge; all those triangles have exactly the same size, and from the gameplay camera position you usually see 130 triangles AND the fps stays the same (75 fps) whether half of the screen is black and the other half full of tris, or there are just 37 tris. So fillrate is not the reason.
quote:when computing FRAME TIMES you must compute the time BEFORE you call the flip, because once you include the time waiting to flip, then of course both FPS and FRAME TIME can only go in intervals of the refresh rate ... ie 150, 75, 50, 37.5, 30, 25 ...
Hm, could you please explain this a little bit more? What does it have to do with round fps numbers (with VSync finally OFF in DirectX), which are either 150.0 or 75.0 (nothing between them, i.e. no 123.45 fps or such)? And it probably has something to do with the HW, since when I plugged in the GF2MX (after last night's post) I got fps in the range 40.0 - 300.0, and it was anywhere in that range, not just a clean 150 or 75 or 50.
I still find it hard to believe that the TNT1 HW works that way. Did anybody notice the same issue when testing their game on different HW?




VladR
Avenger game


This may well be a driver limitation, but if you think about it, there will always be some minor addition to the loop that causes the frame time to exceed the retrace time, and suddenly the FPS is halved (assuming vsync is on).

This addition may not even be graphics-related code: the loop may already be so close to the limit that adding just one more math operation means the retrace is missed.

Okay, a slightly exaggerated example, but you get the idea!
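
The effect can be sketched in a few lines, assuming a simple model in which every frame waits for the next vertical retrace. The function names and the linear cost model (a fixed base cost plus a cost per triangle) are invented for illustration; they are not measurements from any real card.

```cpp
#include <cmath>

// Frame rate actually displayed when every frame waits for the next retrace.
// A frame that takes even slightly longer than one refresh period occupies
// two periods, so the rate falls from refresh_hz straight to refresh_hz / 2.
double vsynced_fps(double frame_ms, double refresh_hz) {
    const double period_ms = 1000.0 / refresh_hz;
    double periods = std::ceil(frame_ms / period_ms);
    if (periods < 1.0) periods = 1.0;  // a frame occupies at least one period
    return refresh_hz / periods;
}

// Smallest triangle count at which the displayed rate drops below refresh_hz,
// for a hypothetical cost of base_ms + per_triangle_ms * count per frame.
// Returns -1 if no count up to max_count causes a drop.
int first_count_below_refresh(double base_ms, double per_triangle_ms,
                              double refresh_hz, int max_count) {
    for (int count = 0; count <= max_count; ++count) {
        const double frame_ms = base_ms + per_triangle_ms * count;
        if (vsynced_fps(frame_ms, refresh_hz) < refresh_hz) return count;
    }
    return -1;
}
```

With made-up costs of 3.0 ms base and 0.1 ms per triangle at 150 Hz, 36 triangles take 6.6 ms and still fit in the 6.67 ms refresh period, while 37 take 6.7 ms and miss it, so first_count_below_refresh(3.0, 0.1, 150.0, 1000) returns 37 and the displayed rate goes from 150 to 75 fps.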

http://www.purplenose.com
quote:when computing FRAME TIMES you must compute the time BEFORE you call the flip, because once you include the time waiting to flip, then of course both FPS and FRAME TIME can only go in intervals of the refresh rate ... ie 150, 75, 50, 37.5, 30, 25 ...

If you compute the time before the flip, then the diff between 36 and 37 should be reasonable ... please check and post the numbers if they are not ... as then there is a real difference besides just "it crossed the threshold" ...


In most cases, profiling individual API calls is not useful. The CPU and GPU operate in parallel and DrawPrimitive will return well before the triangles are actually rendered by the hardware. Typically you want to profile the entire frame time, and over the course of many frames (because the driver is allowed to queue up as many as 3 frames of rendering commands).
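
One way to do that is to keep a rolling average of whole-frame times. The class below is a minimal sketch with a hypothetical name; it only does the averaging and assumes the caller measures each frame from the start of one frame to the start of the next.

```cpp
#include <cstddef>
#include <deque>

// Keeps the most recent `capacity` whole-frame times and reports their mean.
class FrameTimeAverage {
public:
    explicit FrameTimeAverage(std::size_t capacity) : capacity_(capacity) {}

    // Record one whole-frame time in milliseconds.
    void add(double frame_ms) {
        samples_.push_back(frame_ms);
        sum_ += frame_ms;
        if (samples_.size() > capacity_) {  // drop the oldest sample
            sum_ -= samples_.front();
            samples_.pop_front();
        }
    }

    // Mean frame time over the stored window; 0 when nothing was recorded.
    double mean_ms() const {
        return samples_.empty() ? 0.0
                                : sum_ / static_cast<double>(samples_.size());
    }

    std::size_t count() const { return samples_.size(); }

private:
    std::size_t capacity_;
    std::deque<double> samples_;
    double sum_ = 0.0;
};
```

A window of a hundred frames or so smooths out the effect of the driver queuing several frames ahead, which a single-frame or single-call measurement cannot do.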

As another poster said, use D3DPRESENT_INTERVAL_IMMEDIATE to take the refresh rate out of the equation.
Donavon Keithley
No, Inky Death Vole!

This topic is closed to new replies.
