Direct3D timing/speed problem

15 comments, last by MSeithe 18 years, 1 month ago
Hi,

I've got the problem that my program runs slower the more things I do to make it run faster. It sounds silly and is slowly driving me insane, as I have no idea what causes it.

When compiling/linking against the debug versions of DirectX and using the debug runtimes I get framerates of about 50 fps. As soon as I switch either one (or both) to retail, the framerate drops to 15 fps. It gets even worse if I reduce the number of triangles rendered, in which case it decreases to about 6-7 fps. Furthermore, rendering errors seem to occur, like textures being omitted or vertices being misplaced.

I guess it might be some sort of timing issue caused by the rendering code being executed too fast and causing the system to wait; however, I couldn't find anything relevant on the web.

Thank you very much in advance for your help!!

Init code:

#define LOG3D(a) {HRESULT hr=(a); if (hr!=D3D_OK) throw #a ;}

		g_D3D = Direct3DCreate9(D3D_SDK_VERSION);

		D3DDISPLAYMODE d3ddm;
		g_D3D->GetAdapterDisplayMode( D3DADAPTER_DEFAULT, &d3ddm );
		sx=d3ddm.Width;
		sy=d3ddm.Height;

		D3DPRESENT_PARAMETERS d3dpp; 
		ZeroMemory( &d3dpp, sizeof(d3dpp) );
		d3dpp.Windowed   = TRUE;
//		d3dpp.BackBufferCount=1;
		d3dpp.BackBufferFormat=d3ddm.Format;
//		d3dpp.BackBufferHeight=d3ddm.Height;
//		d3dpp.BackBufferWidth=d3ddm.Width;
		//d3dpp.FullScreen_RefreshRateInHz=d3ddm.RefreshRate;
		d3dpp.AutoDepthStencilFormat = D3DFMT_D16;
		d3dpp.PresentationInterval=D3DPRESENT_INTERVAL_IMMEDIATE;
		d3dpp.SwapEffect = D3DSWAPEFFECT_DISCARD;


		LOG3D(g_D3D->CreateDevice( D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hwnd,
			D3DCREATE_SOFTWARE_VERTEXPROCESSING,&d3dpp, &g_D3DDEVICE )); 


		// Set D3D properties

		// No culling
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_CULLMODE,D3DCULL_NONE));

		// Set fill mode
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_FILLMODE,D3DFILL_SOLID));

		// Texture filtering and antialiasing
		//LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_ANTIALIASEDLINEENABLE,TRUE));
		LOG3D(g_D3DDEVICE->SetSamplerState(0, D3DSAMP_MINFILTER, D3DTEXF_LINEAR));
		LOG3D(g_D3DDEVICE->SetSamplerState(0, D3DSAMP_MAGFILTER, D3DTEXF_LINEAR));
		LOG3D(g_D3DDEVICE->SetSamplerState(0, D3DSAMP_MIPFILTER, D3DTEXF_LINEAR));

		// Set vertex format
		LOG3D(g_D3DDEVICE->SetFVF(D3DFVF_XYZRHW | D3DFVF_DIFFUSE | D3DFVF_SPECULAR | D3DFVF_TEX1));

		// Enable alpha blending
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_ALPHABLENDENABLE,TRUE));
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_SRCBLEND,D3DBLEND_SRCALPHA));
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_DESTBLEND,D3DBLEND_INVSRCALPHA));


		LPDIRECT3DSURFACE9 pBackBuffer = NULL;
		D3DSURFACE_DESC d3dsd;
		g_D3DDEVICE->GetBackBuffer( 0, 0, D3DBACKBUFFER_TYPE_MONO, &pBackBuffer );
		pBackBuffer->GetDesc( &d3dsd );
		pBackBuffer->Release();
		sx  = d3dsd.Width;
		sy= d3dsd.Height;

Main loop:


        if( PeekMessage( &msg, NULL, 0, 0, PM_REMOVE ) )
        {
            // Check for a quit message
            if( msg.message == WM_QUIT ) break;
            TranslateMessage( &msg );
            DispatchMessage( &msg );
        }
        else 
        {
			switch (status)
			{
			case STATUS_REINIT:
				status=STATUS_ACTIVE;
				break;
			case STATUS_ACTIVE:
				{

				//	Sleep(0);
					c++;
					if (c%30==0) LOGINT(timer.getLastSecFrames());
					float dt=timer.update();
					LOGTRY(dikb.update();)
					d3df.setCallback((IDirect3DDrawCallback**)state);
					
					if (state!=NULL) 
					{
						LOGTRY((*state)->parsekeyboard(&dikb, dt);)
						LOGTRY((*state)->processTime(dt);)
					
					}	

					LOGTRY(d3df.repaint(););
						
				}
				break;


			case STATUS_INACTIVE:
				break;

			}
        }

painting:

	void CDirect3D::repaint()
	{
		LOG3D(g_D3DDEVICE->Clear(0, NULL, D3DCLEAR_TARGET,0x00000000, 1.0f, 0));

		LOG3D(g_D3DDEVICE->BeginScene());
		if (callback!=NULL) (*callback)->draw3D(g_Direct3DPaint);
		LOG3D(g_D3DDEVICE->EndScene());

		LOG3D(g_D3DDEVICE->Present(NULL,NULL,NULL,NULL));
	}

draw3D consists of about 300-500 calls of:

	void CDirect3D::drawTexturedVertices(const D3DTLVERTEX* vertices, IDirect3DTexture9* texture, const int count)
	{
		LOG3D(g_D3DDEVICE->SetTexture(0,texture));
		LOG3D(g_D3DDEVICE->DrawPrimitiveUP(D3DPT_TRIANGLELIST,count,vertices,sizeof(D3DTLVERTEX)));
	}	

What is the debug output of DX in debug mode?

That might give some clue.

Edit:

#define LOG3D(a) {HRESULT hr=(a); if (hr!=D3D_OK) throw #a ;}

This line has a problem:

The correct way to check the HRESULT return value is to use either the FAILED(x) or the
SUCCEEDED(x) macro. There are plenty of different failing and succeeding return values, not just D3D_OK.

[Edited by - Demus79 on March 17, 2006 4:40:01 AM]
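
The FAILED/SUCCEEDED suggestion above could be sketched roughly as follows. This is only an illustration: the macro name CHECK3D is hypothetical, it throws std::runtime_error instead of the raw string the original LOG3D throws, and the non-Windows branch contains stand-in definitions (an assumption mirroring winerror.h, where a negative HRESULT means failure) so the snippet compiles without the Windows SDK.

```cpp
#include <cstdint>
#include <stdexcept>

#ifdef _WIN32
#include <windows.h>   // provides HRESULT, FAILED, S_OK, S_FALSE, E_FAIL
#else
// Stand-ins for non-Windows builds (assumption: same semantics as winerror.h,
// where HRESULT is a 32-bit signed value and negative means failure).
typedef std::int32_t HRESULT;
#define FAILED(hr) (((HRESULT)(hr)) < 0)
#define S_OK    ((HRESULT)0)
#define S_FALSE ((HRESULT)1)
#define E_FAIL  ((HRESULT)0x80004005u)
#endif

// Hypothetical replacement for LOG3D: throws on any failure code and
// accepts any success code (S_OK, S_FALSE, ...), not just D3D_OK.
#define CHECK3D(a) \
    do { \
        HRESULT hr_ = (a); \
        if (FAILED(hr_)) throw std::runtime_error(#a); \
    } while (0)
```

Usage would stay the same as with LOG3D, for example CHECK3D(g_D3DDEVICE->BeginScene());. The do/while wrapper just makes the macro behave like a single statement inside if/else.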
The Debug Output says:

Direct3D9: :====> ENTER: DLLMAIN(0131d6e0): Process Attach: 00000658, tid=00000b54
Direct3D9: :====> EXIT: DLLMAIN(0131d6e0): Process Attach: 00000658
Direct3D9: (INFO) :Direct3D9 Debug Runtime selected.
Direct3D9: (INFO) :======================= Hal SWVP device selected
Direct3D9: (INFO) :HalDevice Driver Style 9
Direct3D9: :BackBufferCount not specified, considered default 1
Direct3D9: :DoneExclusiveMode
Direct3D9: (INFO) :Failed to create driver indexbuffer
'Direct3D HAL (SWVP) Device 1': Attached to debug monitor.
D3D9 Helper: Warning: Default value for D3DRS_POINTSIZE_MAX is 2.19902e+012f, not 1.44115e+017f.  This is ok.
Direct3D9: (WARN) :Ignoring redundant SetRenderState - 8
Der Thread 'Win32 Thread' (0x450) hat mit Code 0 (0x0) geendet.
Der Thread 'Direct3D HAL (SWVP) Device 1' (0x1) hat mit Code 0 (0x0) geendet.
Das Programm "[1624] BasicDD.exe: Direct3D HAL (SWVP) Device 1" wurde mit Code 0 (0x0) beendet.
Direct3D9: :====> ENTER: DLLMAIN(0131d6e0): Process Detach 00000658, tid=00000b54
Direct3D9: (INFO) :MemFini!
Direct3D9: :====> EXIT: DLLMAIN(0131d6e0): Process Detach 00000658
Das Programm "[1624] BasicDD.exe: Systemeigen" wurde mit Code 0 (0x0) beendet.
Quote:Original post by Demus79

#define LOG3D(a) {HRESULT hr=(a); if (hr!=D3D_OK) throw #a ;}

This line has a problem:

The correct way to check the HRESULT return value is to use either the FAILED(x) or the
SUCCEEDED(x) macro. There are plenty of different failing and succeeding return values, not just D3D_OK.


I'll try that, thanks for the note!!
However, since no error was thrown (which would have been logged), I guess it doesn't directly affect my timing problem.
I am seeing a DrawPrimitiveUP in your code. How many vertices do you draw per call?

Yes, your problem sounds quite strange to me too.

Have you tried disabling some lines of your program to see whether any parts are faulty?

What does the function (*state)->processTime(dt); do?

Probably not related to the problem, but the D3DFVF_XYZRHW vertex format causes some stalls on modern GPUs, although probably not with software vertex processing. DrawPrimitiveUP also isn't the fastest method for drawing polygons.
Quote:Original post by Demirug
I am seeing a DrawPrimitiveUP in your code. How many vertices do you draw per call?


I'm using DrawPrimitiveUP to render rectangular graphics to the screen. Since all rectangles have (or can have) different textures, I only draw two polygons at once. I use D3DPT_TRIANGLELIST, so I send 6 vertices per call.

The code I'm using is:

#define TRANST(x,y,c,u,v) {(tlx+(x)*fx),(tly-(y)*fy),1,1,(c),(c),(u),(v)}

	void CDirect3DPaint::texturedRect (const float x1, const float y1, const float sx, const float sy,
			const D3DCOLOR clo, const D3DCOLOR cro, const D3DCOLOR clu, const D3DCOLOR cru,
			IDirect3DTexture9* texture)
	{
		D3DTLVERTEX d3dtlv[]={TRANST(x1-sx/2,y1-sy/2,clu,0,1),TRANST(x1+sx/2,y1-sy/2,cru,1,1),TRANST(x1-sx/2,y1+sy/2,clo,0,0),
		                      TRANST(x1+sx/2,y1+sy/2,cro,1,0),TRANST(x1+sx/2,y1-sy/2,cru,1,1),TRANST(x1-sx/2,y1+sy/2,clo,0,0)};
		d3d->drawTexturedVertices(d3dtlv, texture, sizeof(d3dtlv)/sizeof(D3DTLVERTEX));
	}

(...)

	void CDirect3D::drawTexturedVertices(const D3DTLVERTEX* vertices, IDirect3DTexture9* texture, const int count)
	{
		LOG3D(g_D3DDEVICE->SetTexture(0,texture));
		LOG3D(g_D3DDEVICE->DrawPrimitiveUP(D3DPT_TRIANGLELIST,count,vertices,sizeof(D3DTLVERTEX)));
	}
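
One detail in the code above is worth checking. In the Direct3D 9 API, the second argument of DrawPrimitiveUP is a primitive count, not a vertex count. The code passes the vertex count (6), which asks for six triangles and therefore reads 18 vertices from a 6-element array. That could plausibly account for the misplaced vertices and for behaviour that differs between debug and release builds, although nothing in this thread confirms it. A minimal sketch of the conversion, using a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of triangles described by `vertexCount`
// vertices in a D3DPT_TRIANGLELIST (three vertices per triangle).
inline unsigned trianglesInList(std::size_t vertexCount)
{
    assert(vertexCount % 3 == 0 && "a triangle list needs a multiple of 3 vertices");
    return static_cast<unsigned>(vertexCount / 3);
}

// In drawTexturedVertices the call would then read (not compiled here):
//   g_D3DDEVICE->DrawPrimitiveUP(D3DPT_TRIANGLELIST, trianglesInList(count),
//                                vertices, sizeof(D3DTLVERTEX));
```

For the six vertices produced by texturedRect this yields a primitive count of 2.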
300-500 DrawPrimitiveUP() calls is A Bad Thing. You should use a dynamic vertex buffer for things like that. You'd put all the triangles into the VB and draw in one call. Draw[Indexed]Primitive[UP]() is a very expensive call to make (relatively).
I seem to recall that 500 draws per frame is an absolute maximum you should be drawing (Someone feel free to correct me here).

However, you should still get more than 6 FPS, and it doesn't explain why it's slower in Release mode...
Quote:Original post by Demus79
Have you tried disabling some lines of your program to see whether any parts are faulty?


Yes, that pretty much sums up what I did last night ;-)

I uploaded two actual screenshots:
http://totalsoft.de/temp/screen1.png (everything fine in debug-mode)
http://totalsoft.de/temp/screen2.png (erroneous textures when running in retail/release mode)


My graphics include map-graphics and player-graphics, so my first approach was to isolate them.
Drawing both: 11-13 fps
Drawing only map: 7 fps
Drawing only players: 81 fps
Empty Draw thread: ~900 fps

(So it gets faster when the players are drawn as well, compared to drawing only the map...)

The map consists of background and walls:
Drawing players and walls: 6-8 fps
Drawing players and bg: ~70 fps
Drawing players and walls and bg: 11-13 fps, however with faulty graphics! (bg texture not drawn, see screen2.png)

The walls consist of ~200 tiles, i.e. 400 polygons. I could understand if drawing them would slow me down - however it's all fine in debug-mode...



Quote:Original post by Demus79
What does the function (*state)->gt.processTime(dt); do?


I use this function to perform calculations for my game, like physics, processing of input and so on. "state" is defined as "SystemState** state;" so I can change the game's behaviour by just pointing state at another variable. In this case it is:

	void processTime(const float dt)
	{
		simu_soll_zeit+=dt;
		while (simu_ist_zeit<simu_soll_zeit)
		{
			simu_ist_zeit+=0.005f;
			for (int i=0; i<player_count; i++)
				player->simulate(0.005f);
			... (collision handling)
		}
	}


However, it doesn't do any graphic-related things and since I'm using only one thread I hope it shouldn't interfere.
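
One property of a fixed-step loop like the one above is that a slow frame produces a large dt, which in turn makes the next frame run more simulation steps. This is probably not the cause of the problem here, but a common precaution is to cap the time added per frame. The sketch below assumes a cap of 0.25 s and uses hypothetical names; only the 0.005 s step is taken from the post.

```cpp
#include <algorithm>

// Assumed constants: the step matches the post, the cap is an arbitrary choice.
constexpr float kStep = 0.005f;
constexpr float kMaxFrameTime = 0.25f;

// Hypothetical stand-in for the game state; only the timing logic matters here.
struct Simulation {
    float targetTime = 0.0f;   // plays the role of simu_soll_zeit
    float currentTime = 0.0f;  // plays the role of simu_ist_zeit
    int   stepsRun = 0;        // counts simulation steps, for illustration only

    void processTime(float dt)
    {
        // Never schedule more than kMaxFrameTime of simulation per frame.
        targetTime += std::min(dt, kMaxFrameTime);
        while (currentTime < targetTime) {
            currentTime += kStep;
            ++stepsRun;  // player simulation and collision handling would go here
        }
    }
};
```

With the cap in place, a single very long frame costs at most about 50 simulation steps instead of an unbounded number.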


Thanks very much for your help already Demus79! :)
Quote:Original post by Evil Steve
300-500 DrawPrimitiveUP() calls is A Bad Thing. You should use a dynamic vertex buffer for things like that. You'd put all the triangles into the VB and draw in one call. Draw[Indexed]Primitive[UP]() is a very expensive call to make (relatively).
I seem to recall that 500 draws per frame is an absolute maximum you should be drawing (Someone feel free to correct me here).

However, you should still get more than 6 FPS, and it doesn't explain why it's slower in Release mode...


That's definitely a good point! Since I draw two polygons with one call to DrawPrimitiveUP(), the number of calls is about 200 per frame. I would like to draw them all from one buffer; however, I'm not sure how I can use different textures for each polygon when I draw them all in one call, without being able to call SetTexture(...) in between.
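
One common way to approach the question above is to group the rectangles by texture first, so that each texture needs only one SetTexture and one draw call per frame. Another is to pack several images into a single larger texture. The sketch below shows only the grouping step in plain C++. Vertex, TextureId, Quad and batchByTexture are hypothetical stand-ins for illustration (the real program would use D3DTLVERTEX and IDirect3DTexture9*), and no Direct3D calls are made.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-ins for D3DTLVERTEX and IDirect3DTexture9*.
struct Vertex { float x, y, z, rhw; unsigned diffuse, specular; float u, v; };
using TextureId = int;

struct Quad {
    TextureId texture;
    Vertex    vertices[6];   // two triangles, as built by texturedRect
};

// Collect the vertices of all quads that share a texture into one array each.
std::map<TextureId, std::vector<Vertex>> batchByTexture(const std::vector<Quad>& quads)
{
    std::map<TextureId, std::vector<Vertex>> batches;
    for (const Quad& q : quads) {
        std::vector<Vertex>& batch = batches[q.texture];
        batch.insert(batch.end(), q.vertices, q.vertices + 6);
    }
    return batches;
}

// Per frame, each batch would then need one SetTexture and one draw call
// with a primitive count of batch.size() / 3 (not compiled here).
```

If the roughly 200 wall tiles share only a handful of textures, this would reduce the draw calls from about 200 to one per texture. Note that grouping changes the draw order, which matters when alpha blending is enabled and quads overlap.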
