MSeithe

Direct3D timing/speed problem


MSeithe    122
Hi,

my program runs slower the more I do to make it run faster. It sounds silly, and it is slowly driving me insane because I have no idea what causes it.

When I compile and link against the debug versions of DirectX and use the debug runtime, I get about 50 fps. As soon as I switch either one (or both) to retail, the frame rate drops to 15 fps. It gets even worse if I reduce the number of triangles rendered: it then falls to about 6-7 fps. On top of that, rendering errors appear, such as missing textures or misplaced vertices.

My guess is that it is some sort of timing issue, with the rendering code executing too fast and making the system wait, but I couldn't find anything relevant on the web.

Thank you very much in advance for your help!

Init code:
#define LOG3D(a) {HRESULT hr=(a); if (hr!=D3D_OK) throw #a ;}

		g_D3D = Direct3DCreate9(D3D_SDK_VERSION);

		D3DDISPLAYMODE d3ddm;
	    g_D3D->GetAdapterDisplayMode( D3DADAPTER_DEFAULT, &d3ddm );
		sx=d3ddm.Width;
		sy=d3ddm.Height;

		D3DPRESENT_PARAMETERS d3dpp; 
		ZeroMemory( &d3dpp, sizeof(d3dpp) );
		d3dpp.Windowed   = TRUE;
//		d3dpp.BackBufferCount=1;
		d3dpp.BackBufferFormat=d3ddm.Format;
//		d3dpp.BackBufferHeight=d3ddm.Height;
//		d3dpp.BackBufferWidth=d3ddm.Width;
		//d3dpp.FullScreen_RefreshRateInHz=d3ddm.RefreshRate;
		d3dpp.AutoDepthStencilFormat = D3DFMT_D16;
		d3dpp.PresentationInterval=D3DPRESENT_INTERVAL_IMMEDIATE;
		d3dpp.SwapEffect = D3DSWAPEFFECT_DISCARD;


		LOG3D(g_D3D->CreateDevice( D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hwnd,
			D3DCREATE_SOFTWARE_VERTEXPROCESSING,&d3dpp, &g_D3DDEVICE )); 


		// Set D3D properties

		// No culling
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_CULLMODE,D3DCULL_NONE));

		// Set fill mode
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_FILLMODE,D3DFILL_SOLID));

		// Texture filtering and antialiasing
		//LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_ANTIALIASEDLINEENABLE,TRUE));
		LOG3D(g_D3DDEVICE->SetSamplerState(0, D3DSAMP_MINFILTER, D3DTEXF_LINEAR));
		LOG3D(g_D3DDEVICE->SetSamplerState(0, D3DSAMP_MAGFILTER, D3DTEXF_LINEAR));
		LOG3D(g_D3DDEVICE->SetSamplerState(0, D3DSAMP_MIPFILTER, D3DTEXF_LINEAR));

		// Set vertex format
		LOG3D(g_D3DDEVICE->SetFVF(D3DFVF_XYZRHW | D3DFVF_DIFFUSE | D3DFVF_SPECULAR |                           D3DFVF_TEX1 ));

		// Enable alpha blending
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_ALPHABLENDENABLE,TRUE));
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_SRCBLEND,D3DBLEND_SRCALPHA));
		LOG3D(g_D3DDEVICE->SetRenderState(D3DRS_DESTBLEND,D3DBLEND_INVSRCALPHA));


		LPDIRECT3DSURFACE9 pBackBuffer = NULL;
		D3DSURFACE_DESC d3dsd;
		g_D3DDEVICE->GetBackBuffer( 0, 0, D3DBACKBUFFER_TYPE_MONO, &pBackBuffer );
		pBackBuffer->GetDesc( &d3dsd );
		pBackBuffer->Release();
		sx  = d3dsd.Width;
		sy= d3dsd.Height;

Main loop:

        if( PeekMessage( &msg, NULL, 0, 0, PM_REMOVE ) )
        {
            // Check for a quit message
            if( msg.message == WM_QUIT ) break;
            TranslateMessage( &msg );
            DispatchMessage( &msg );
        }
        else 
        {
			switch (status)
			{
			case STATUS_REINIT:
				status=STATUS_ACTIVE;
				break;
			case STATUS_ACTIVE:
				{

				//	Sleep(0);
					c++;
					if (c%30==0) LOGINT(timer.getLastSecFrames());
					float dt=timer.update();
					LOGTRY(dikb.update();)
					d3df.setCallback((IDirect3DDrawCallback**)state);
					
					if (state!=NULL) 
					{
						LOGTRY((*state)->parsekeyboard(&dikb, dt);)
						LOGTRY((*state)->processTime(dt);)
					
					}	

					LOGTRY(d3df.repaint(););
						
				}
				break;


			case STATUS_INACTIVE:
				break;

			}
        }

Painting:
	void CDirect3D::repaint()
	{
		LOG3D(g_D3DDEVICE->Clear(0, NULL, D3DCLEAR_TARGET,0x00000000, 1.0f, 0));

		LOG3D(g_D3DDEVICE->BeginScene());
		if (callback!=NULL) (*callback)->draw3D(g_Direct3DPaint);
		LOG3D(g_D3DDEVICE->EndScene());

		LOG3D(g_D3DDEVICE->Present(NULL,NULL,NULL,NULL));
	}

draw3D consists of about 300-500 calls of:
	void CDirect3D::drawTexturedVertices(const D3DTLVERTEX* vertices, IDirect3DTexture9* texture, const int count)
	{
		LOG3D(g_D3DDEVICE->SetTexture(0,texture));
		LOG3D(g_D3DDEVICE->DrawPrimitiveUP(D3DPT_TRIANGLELIST,count,vertices,sizeof(D3DTLVERTEX)));
	}	

Demus79    362
What does the DirectX debug output say in debug mode?

That might give some clue.

Edit:

#define LOG3D(a) {HRESULT hr=(a); if (hr!=D3D_OK) throw #a ;}

This line has a problem:

The correct way to check an HRESULT return value is with the FAILED(x) or SUCCEEDED(x) macros. There are plenty of different failure and success codes, not just D3D_OK.
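
A minimal sketch of what the macro could look like with FAILED(), in case it helps. The HRESULT typedef and the FAILED definition below are stand-ins so the snippet compiles without the Windows headers (in the real program they come from windows.h and d3d9.h), and CHECK3D and throwsOn are made-up names, not anything from the original code.

```cpp
#include <cstdint>
#include <stdexcept>

// Stand-ins so the sketch compiles without the Windows SDK (assumption:
// in the real program these come from <windows.h> and <d3d9.h>).
#ifndef FAILED
typedef std::int32_t HRESULT;
#define FAILED(hr) (((HRESULT)(hr)) < 0)
#endif

// Hypothetical replacement for LOG3D: throws only on failure codes, so
// non-zero success codes such as S_FALSE (1) no longer count as errors.
#define CHECK3D(expr)                                         \
    do {                                                      \
        HRESULT hr_ = (expr);                                 \
        if (FAILED(hr_)) throw std::runtime_error(#expr);     \
    } while (0)

// Helper for trying the macro out: returns true if CHECK3D threw.
inline bool throwsOn(HRESULT code) {
    try {
        CHECK3D(code);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

With this version a success code other than D3D_OK passes through silently, while any negative HRESULT (for example E_FAIL, which is -2147467259 as a 32-bit value) still throws.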

[Edited by - Demus79 on March 17, 2006 4:40:01 AM]

MSeithe    122
The Debug Output says:


Direct3D9: :====> ENTER: DLLMAIN(0131d6e0): Process Attach: 00000658, tid=00000b54
Direct3D9: :====> EXIT: DLLMAIN(0131d6e0): Process Attach: 00000658
Direct3D9: (INFO) :Direct3D9 Debug Runtime selected.
Direct3D9: (INFO) :======================= Hal SWVP device selected

Direct3D9: (INFO) :HalDevice Driver Style 9

Direct3D9: :BackBufferCount not specified, considered default 1
Direct3D9: :DoneExclusiveMode
Direct3D9: (INFO) :Failed to create driver indexbuffer
'Direct3D HAL (SWVP) Device 1': Attached to debug monitor.
D3D9 Helper: Warning: Default value for D3DRS_POINTSIZE_MAX is 2.19902e+012f, not 1.44115e+017f. This is ok.
Direct3D9: (WARN) :Ignoring redundant SetRenderState - 8

The thread 'Win32 Thread' (0x450) has exited with code 0 (0x0).
The thread 'Direct3D HAL (SWVP) Device 1' (0x1) has exited with code 0 (0x0).
The program "[1624] BasicDD.exe: Direct3D HAL (SWVP) Device 1" has exited with code 0 (0x0).
Direct3D9: :====> ENTER: DLLMAIN(0131d6e0): Process Detach 00000658, tid=00000b54
Direct3D9: (INFO) :MemFini!
Direct3D9: :====> EXIT: DLLMAIN(0131d6e0): Process Detach 00000658
The program "[1624] BasicDD.exe: Native" has exited with code 0 (0x0).

MSeithe    122
Quote:
Original post by Demus79

#define LOG3D(a) {HRESULT hr=(a); if (hr!=D3D_OK) throw #a ;}

This line has a problem:

The correct way to check the HRESULT return value is using either FAILED(x) or
SUCCEEDED(x) macros. There are plenty of different failing and succeeding return values, not just D3D_OK;


I'll try that, thanks for the note!
However, since no error was thrown (it would have been logged), I guess it doesn't directly affect my timing problem.

Demus79    362

Yes, your problem sounds quite strange to me too.

Have you tried disabling parts of your program to see whether one of them is at fault?

What does the function (*state)->processTime(dt); do?

Probably not related to the problem, but the D3DFVF_XYZRHW vertex format causes some stalls on modern GPUs, although probably not with software vertex processing. DrawPrimitiveUP also isn't the fastest method for drawing polygons.

MSeithe    122
Quote:
Original post by Demirug
I am seeing a DrawPrimitiveUP in your code. How many vertices do you draw per call?


I'm using DrawPrimitiveUP to render rectangular graphics to the screen. Since all rectangles have (or can have) different textures, I only draw two polygons at once. I use D3DPT_TRIANGLELIST, so I send 6 vertices per call.

The code I'm using is:


#define TRANST(x,y,c,u,v) {(tlx+(x)*fx),(tly-(y)*fy),1,1,(c),(c),(u),(v)}

void CDirect3DPaint::texturedRect(const float x1, const float y1, const float sx, const float sy,
	const D3DCOLOR clo, const D3DCOLOR cro, const D3DCOLOR clu, const D3DCOLOR cru,
	IDirect3DTexture9* texture)
{
	D3DTLVERTEX d3dtlv[]={
		TRANST(x1-sx/2,y1-sy/2,clu,0,1), TRANST(x1+sx/2,y1-sy/2,cru,1,1), TRANST(x1-sx/2,y1+sy/2,clo,0,0),
		TRANST(x1+sx/2,y1+sy/2,cro,1,0), TRANST(x1+sx/2,y1-sy/2,cru,1,1), TRANST(x1-sx/2,y1+sy/2,clo,0,0)};

	d3d->drawTexturedVertices(d3dtlv, texture, sizeof(d3dtlv)/sizeof(D3DTLVERTEX));
}

(...)

void CDirect3D::drawTexturedVertices(const D3DTLVERTEX* vertices, IDirect3DTexture9* texture, const int count)
{
	LOG3D(g_D3DDEVICE->SetTexture(0,texture));
	LOG3D(g_D3DDEVICE->DrawPrimitiveUP(D3DPT_TRIANGLELIST,count,vertices,sizeof(D3DTLVERTEX)));
}


Evil Steve    2017
300-500 DrawPrimitiveUP() calls is A Bad Thing. You should use a dynamic vertex buffer for things like that. You'd put all the triangles into the VB and draw in one call. Draw[Indexed]Primitive[UP]() is a very expensive call to make (relatively).
I seem to recall that 500 draws per frame is an absolute maximum you should be drawing (Someone feel free to correct me here).

However, you should still get more than 6 FPS, and it doesn't explain why it's slower in Release mode...

MSeithe    122
Quote:
Original post by Demus79
Have you tried disabling some lines from your program to see if there is some faulty parts ?


Yes, that pretty much sums up what I did last night ;-)

I uploaded two actual screenshots:
http://totalsoft.de/temp/screen1.png (everything fine in debug-mode)
http://totalsoft.de/temp/screen2.png (erroneous textures when running in retail/release mode)


My graphics include map-graphics and player-graphics, so my first approach was to isolate them.
Drawing both: 11-13 fps
Drawing only map: 7 fps
Drawing only players: 81 fps
Empty Draw thread: ~900 fps

(So it gets faster when the players are drawn as well, compared to drawing only the map...)

The map consists of background and walls:
Drawing players and walls: 6-8 fps
Drawing players and bg: ~70 fps
Drawing players and walls and bg: 11-13 fps, but with faulty graphics! (bg texture not drawn, see screen2.png)

The walls consist of ~200 tiles, i.e. 400 polygons. I could understand if drawing them would slow me down - however it's all fine in debug-mode...



Quote:
Original post by Demus79
What does the function (*state)->gt.processTime(dt); do ?


I use this function to perform calculations for my game, like physics, processing of input and so on. "state" is defined as "SystemState** state;" so I can change the game's behaviour by just pointing state at another variable. In this case it is:


void processTime(const float dt)
{
	simu_soll_zeit+=dt;

	while (simu_ist_zeit<simu_soll_zeit)
	{
		simu_ist_zeit+=0.005f;

		for (int i=0; i<player_count; i++)
			player[i]->simulate(0.005f);

		... (collision handling)

	}
}





However, it doesn't do any graphic-related things and since I'm using only one thread I hope it shouldn't interfere.


Thanks very much for your help already Demus79! :)

MSeithe    122
Quote:
Original post by Evil Steve
300-500 DrawPrimitiveUP() calls is A Bad Thing. You should use a dynamic vertex buffer for things like that. You'd put all the triangles into the VB and draw in one call. Draw[Indexed]Primitive[UP]() is a very expensive call to make (relatively).
I seem to recall that 500 draws per frame is an absolute maximum you should be drawing (Someone feel free to correct me here).

However, you should still get more than 6 FPS, and it doesn't explain why it's slower in Release mode...


That's definitely a good point! Since I draw two polygons with one call to DrawPrimitiveUP(), the number of calls is about 200 per frame. I would like to draw them all from one buffer, but I'm not sure how to use a different texture for each polygon when I draw them all in one call without being able to call SetTexture(...) in between.

Demirug    884
Quote:
Original post by Evil Steve
300-500 DrawPrimitiveUP() calls is A Bad Thing. You should use a dynamic vertex buffer for things like that. You'd put all the triangles into the VB and draw in one call. Draw[Indexed]Primitive[UP]() is a very expensive call to make (relatively).


That could be a problem, as every call uses a different texture. It would require texture atlases, too.

Quote:
I seem to recall that 500 draws per frame is an absolute maximum you should be drawing (Someone feel free to correct me here).


General Rule: 100000 calls per second on a 3GHz P4.

Quote:
However, you should still get more than 6 FPS, and it doesn't explain why it's slower in Release mode...


Maybe some of the additional work that the debug runtime does causes this.


Demirug    884
Quote:
Original post by MSeithe
Quote:
Original post by Demirug
I am seeing a DrawPrimitiveUP in your code. How many vertices do you draw per call?


I'm using DrawPrimitiveUP to render rectangular graphics to the screen. Since all rectangles have (or can have) different textures, I only draw two polygons at once. I use D3DPT_TRIANGLELIST, so I send 6 vertices per call.

The code I'm using is:

*** Source Snippet Removed ***


You send 6 vertices but only 2 primitives. DrawPrimitiveUP wants the number of primitives, not the number of vertices.

Your code tells D3D to render 4 additional primitives from whatever data it finds behind your valid data. This is bound to give you strange results.

Try This:

d3d->drawTexturedVertices(d3dtlv, texture, sizeof(d3dtlv)/sizeof(D3DTLVERTEX)/3);
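
To make the vertex/primitive relationship explicit, here is a small sketch of the conversion. The second argument of DrawPrimitiveUP is the primitive count; the two helper names below are made up for illustration and are not part of D3D.

```cpp
#include <cstddef>

// Number of triangles described by `vertexCount` vertices of a triangle
// list: every 3 consecutive vertices form one independent triangle.
constexpr std::size_t trianglesInList(std::size_t vertexCount) {
    return vertexCount / 3;
}

// For comparison, a triangle strip: the first 3 vertices form one
// triangle and every further vertex adds another one.
constexpr std::size_t trianglesInStrip(std::size_t vertexCount) {
    return vertexCount < 3 ? 0 : vertexCount - 2;
}
```

For the rectangle code in this thread, the 6 vertices of the D3DPT_TRIANGLELIST give trianglesInList(6) == 2, which is the value the draw call expects. The same rectangle sent as a strip would need only 4 vertices.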

Evil Steve    2017
Quote:
Original post by MSeithe
Quote:
Original post by Evil Steve
300-500 DrawPrimitiveUP() calls is A Bad Thing. You should use a dynamic vertex buffer for things like that. You'd put all the triangles into the VB and draw in one call. Draw[Indexed]Primitive[UP]() is a very expensive call to make (relatively).
I seem to recall that 500 draws per frame is an absolute maximum you should be drawing (Someone feel free to correct me here).

However, you should still get more than 6 FPS, and it doesn't explain why it's slower in Release mode...


That's definitely a good point! Since I draw two polygons with one call to DrawPrimitiveUP(), the number of calls is about 200 per frame. I would like to draw them all from one buffer, but I'm not sure how to use a different texture for each polygon when I draw them all in one call without being able to call SetTexture(...) in between.

Yeah, you'd need to batch your draw calls so you set a texture, draw, set next texture, draw.
However, if your textures are small (They look like they're 16x16 or something), you can compile them all onto one larger texture (256x256 or 512x512 for instance), and change the texture coordinates when you fill the vertex buffer. That way you can draw most, if not all of your triangles in one draw call.
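
A rough sketch of the texture-coordinate part, assuming a square atlas that is filled row by row with equally sized square tiles and no padding. UvRect and atlasTileUv are hypothetical names, and the layout is only an assumption; a real atlas may need padding between tiles to avoid bleeding with linear filtering.

```cpp
// UV rectangle of one tile inside a texture atlas, in 0..1 texture space.
struct UvRect {
    float u0, v0, u1, v1;
};

// Assumes (hypothetically) a square atlas of `atlasSize` texels, filled
// row by row with square tiles of `tileSize` texels and no padding.
inline UvRect atlasTileUv(int tileIndex, int tileSize, int atlasSize) {
    const int tilesPerRow = atlasSize / tileSize;
    const int col = tileIndex % tilesPerRow;
    const int row = tileIndex / tilesPerRow;
    const float step = static_cast<float>(tileSize) / static_cast<float>(atlasSize);
    return UvRect{ col * step, row * step, (col + 1) * step, (row + 1) * step };
}
```

Instead of the fixed 0 and 1 in the vertex data, each rectangle would then use u0/u1 and v0/v1 of its tile, so all tiles can share one SetTexture call.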

Do you have v-sync on? It looks like drawing your map is doing something that D3D isn't at all happy about.

Demirug: Found what I was looking for: 25k batches/s @ 100% 1GHz CPU
So 25k batches/s = ~417 batches/frame at 60FPS (Which is where I got my 500 from).
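
The arithmetic behind those figures, as a tiny sketch. batchesPerFrame is a made-up helper, and the throughput numbers are just the rules of thumb quoted in this thread, not measurements.

```cpp
// Draw calls ("batches") that fit into one frame, given a throughput in
// batches per second and a target frame rate. Integer division, so the
// result is rounded down.
constexpr int batchesPerFrame(int batchesPerSecond, int targetFps) {
    return batchesPerSecond / targetFps;
}
```

batchesPerFrame(25000, 60) gives 416, which is the ~417 above rounded down, and the 100000 calls per second figure gives 1666 per frame at 60 FPS. Both assume the CPU does nothing else.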

Demus79    362

Drawing only map: 7 fps

That sounds bad :) Perhaps you are drawing too much? Why not batch up several draw calls (i.e. group the draw calls that use the same texture, for example)?

Get rid of DrawPrimitiveUP. For this batching, you can use a dynamic vertex and index buffer.
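
The grouping step could look roughly like this. It is only a sketch of the bookkeeping, with no D3D calls in it; Sprite, Batch and buildBatches are hypothetical names, and the integer textureId stands in for the texture pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sprite record: which texture it uses and where it goes.
struct Sprite {
    int textureId;
    float x, y;
};

// One batch = a run of consecutive sprites sharing a texture, which could
// be drawn with a single SetTexture plus one draw call.
struct Batch {
    int textureId;
    std::size_t first;
    std::size_t count;
};

// Sorts the sprites by texture (keeping the relative order of sprites
// that share a texture) and returns the resulting batches.
inline std::vector<Batch> buildBatches(std::vector<Sprite>& sprites) {
    std::stable_sort(sprites.begin(), sprites.end(),
        [](const Sprite& a, const Sprite& b) { return a.textureId < b.textureId; });
    std::vector<Batch> batches;
    for (std::size_t i = 0; i < sprites.size(); ++i) {
        if (batches.empty() || batches.back().textureId != sprites[i].textureId)
            batches.push_back(Batch{ sprites[i].textureId, i, 1 });
        else
            ++batches.back().count;
    }
    return batches;
}
```

Note that sorting changes the draw order, which matters when alpha-blended sprites overlap, so this fits best for things like the map tiles that don't cover each other.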


simu_ist_zeit+=0.005f;

Does that mean that you run the game updates 200 times a second? 20-30 should be enough, i.e. a step of about 0.05 to 0.0333 seconds.

Demirug    884
Quote:
Original post by Evil Steve
Demirug: Found what I was looking for: 25k batches/s @ 100% 1GHz CPU
So 25k batches/s = ~417 batches/frame at 60FPS (Which is where I got my 500 from).


Looks like we have the same source :D

My numbers are from the ChinaJoy 2004 nVidia papers.

I am looking forward to seeing how this will improve on Vista.

MSeithe    122
Quote:
Original post by Demirug
You send 6 vertices but only 2 primitives. DrawPrimitiveUP wants the number of primitives, not the number of vertices.

Your code tells D3D to render 4 additional primitives from whatever data it finds behind your valid data. This is bound to give you strange results.

Try This:

d3d->drawTexturedVertices(d3dtlv, texture, sizeof(d3dtlv)/sizeof(D3DTLVERTEX)/3);




Yes yes yes!! That did it!! Problem solved, now it runs perfectly at ~330 fps :)
I always thought that with an error like this my program would simply crash. It seems D3D could somehow cope with it at the cost of speed, or something...

Anyway, thank you so much for your help, everyone! :)

MSeithe    122
Quote:
Original post by Demus79

Drawing only map: 7 fps

That sounds bad :), perhaps drawing too much ? Why not batching up several draw calls (ie. grouping drawing calls using the same texture for example).

Get rid of the DrawPrimitiveUP. For this batching, you can use a dynamic vertex and index buffer.

Good idea, I will try that!


Quote:

simu_ist_zeit+=0.005f;

Does that mean that you run the game updates 200 times a second ? 20-30 should be enough. Like 0.05 - 0.0333.


This variable only refers to physics simulation, which runs at fixed precision (which is indeed 200 calculations/sec). If I don't do this it could happen that fast objects don't collide with the walls but simply go through them if it was a "long frame".
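
Reduced to its core, the idea is a fixed-timestep accumulator. The sketch below mirrors the processTime() loop posted earlier; FixedStepClock and its member names are invented for the example, and it uses double where the game code uses float.

```cpp
// Fixed-timestep accumulator: the frame time is added to a target time,
// and the simulation catches up in steps of constant size.
struct FixedStepClock {
    double targetTime = 0.0;     // corresponds to simu_soll_zeit
    double simulatedTime = 0.0;  // corresponds to simu_ist_zeit
    double step = 0.005;         // 200 simulation steps per second

    // Adds the frame time `dt` and returns how many fixed steps have to
    // be simulated now to catch up with the target time.
    int advance(double dt) {
        targetTime += dt;
        int steps = 0;
        while (simulatedTime < targetTime) {
            simulatedTime += step;
            ++steps;
        }
        return steps;
    }
};
```

A "long frame" simply returns more steps, so each individual step still moves objects by the same small amount and the collision check sees every intermediate position.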

Thanks for your help! :)
