Performance? CPU chugs but not app

Greetings again, DX gurus! I have come across a performance problem that has me totally stumped. Whenever I have 4-5 draws that each cover a lot of screen area (800x600 pixels, 4-5 times per frame), the computer chugs REALLY badly. So badly that Winamp stutters and launching IE takes 3-4 times longer than usual. However, the game still manages to output 100+ frames per second and doesn't show any stutters. My sprite's render call is pretty basic (so I doubt it is the culprit):
//**********************************************************************************************
// Renders a graphic straight to the screen
//**********************************************************************************************
void Sprite::renderOntoScreen( LPDIRECT3DDEVICE8* m_pd3dDevice, float dwXOrigin, float dwYOrigin, float dwWidth, float dwHeight )
{
    //*******************************************************
    // Define each vertex color
    //*******************************************************
    DWORD dwColor[NUM_VERTICES];
    for ( int i = 0; i < NUM_VERTICES; i++ )
    {
        dwColor[i] = D3DCOLOR_ARGB( (int)m_cColor[i].m_nAlpha, (int)m_cColor[i].m_nRed, (int)m_cColor[i].m_nGreen, (int)m_cColor[i].m_nBlue );
    }

    //*******************************************************
    // Get the level desc so we can get info on the texture
    //*******************************************************
    D3DSURFACE_DESC desc;
    m_tManager->getTexture( m_strTextures[m_nCurTexture] )->GetLevelDesc( 0, &desc );

    //*******************************************************
    // UV Coordinates for the top left point
    //*******************************************************
    m_Vertices[UPPER_LEFT].vPosition = D3DXVECTOR4( dwXOrigin - 0.5f, dwYOrigin - 0.5f, 0, 1 );
    m_Vertices[UPPER_LEFT].dwDiffuse = dwColor[UPPER_LEFT];
    m_Vertices[UPPER_LEFT].tu = 0.0;
    m_Vertices[UPPER_LEFT].tv = 0.0;

    //*******************************************************
    // UV Coordinates for the bottom left
    //*******************************************************
    m_Vertices[BOTTOM_LEFT].vPosition = D3DXVECTOR4( dwXOrigin - 0.5f, dwHeight - 0.5f, 0, 1 );
    m_Vertices[BOTTOM_LEFT].dwDiffuse = dwColor[BOTTOM_LEFT];
    m_Vertices[BOTTOM_LEFT].tu = 0.0;
    m_Vertices[BOTTOM_LEFT].tv = (float)m_vOrigSize.y/(float)desc.Height;

    //*******************************************************
    // UV Coordinates for the top right
    //*******************************************************
    m_Vertices[UPPER_RIGHT].vPosition = D3DXVECTOR4( dwWidth - 0.5f, dwYOrigin - 0.5f, 0, 1 );
    m_Vertices[UPPER_RIGHT].dwDiffuse = dwColor[UPPER_RIGHT];
    m_Vertices[UPPER_RIGHT].tu = (float)m_vOrigSize.x/(float)desc.Width;
    m_Vertices[UPPER_RIGHT].tv = 0.0;

    //*******************************************************
    // UV Coordinates for the bottom right
    //*******************************************************
    m_Vertices[BOTTOM_RIGHT].vPosition = D3DXVECTOR4( dwWidth - 0.5f, dwHeight - 0.5f, 0, 1 );
    m_Vertices[BOTTOM_RIGHT].dwDiffuse = dwColor[BOTTOM_RIGHT];
    m_Vertices[BOTTOM_RIGHT].tu = (float)m_vOrigSize.x/(float)desc.Width;
    m_Vertices[BOTTOM_RIGHT].tv = (float)m_vOrigSize.y/(float)desc.Height;

    //*******************************************************
    // Check to see if a clipper has been set
    //*******************************************************
    if( m_rClip.top != -1 && m_rClip.bottom != -1 )
    {
        m_Vertices[UPPER_LEFT].tu = m_rClip.left/(float)desc.Width;
        m_Vertices[UPPER_LEFT].tv = m_rClip.top/(float)desc.Height;
        m_Vertices[BOTTOM_LEFT].tu = m_rClip.left/(float)desc.Width;
        m_Vertices[BOTTOM_LEFT].tv = m_rClip.bottom/(float)desc.Height;
        m_Vertices[UPPER_RIGHT].tu = m_rClip.right/(float)desc.Width;
        m_Vertices[UPPER_RIGHT].tv = m_rClip.top/(float)desc.Height;
        m_Vertices[BOTTOM_RIGHT].tu = m_rClip.right/(float)desc.Width;
        m_Vertices[BOTTOM_RIGHT].tv = m_rClip.bottom/(float)desc.Height;
    }

    //*******************************************************
    // Set render states, make sure Zbuffer is turned off,
    // this paste to the screen will create a 0,0 which will
    // fail lots of graphics if it is turned on
    //*******************************************************
    (*m_pd3dDevice)->SetRenderState( D3DRS_ZENABLE,          m_bUseZ );
    (*m_pd3dDevice)->SetRenderState( D3DRS_ZWRITEENABLE,     FALSE );
    (*m_pd3dDevice)->SetRenderState( D3DRS_FOGENABLE,        FALSE );
    (*m_pd3dDevice)->SetRenderState( D3DRS_LIGHTING,         FALSE );
    (*m_pd3dDevice)->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
    (*m_pd3dDevice)->SetRenderState( D3DRS_CULLMODE,         D3DCULL_NONE );

    //*******************************************************
    // Determine which blending technique to use...
    //*******************************************************
    (*m_pd3dDevice)->SetRenderState( D3DRS_SRCBLEND,  m_nSrcBlend );
    (*m_pd3dDevice)->SetRenderState( D3DRS_DESTBLEND, m_nDestBlend );

    //*******************************************************
    // Mip Mapping Settings
    //*******************************************************
    (*m_pd3dDevice)->SetTextureStageState( 0, D3DTSS_MIPFILTER, D3DTEXF_NONE );
    (*m_pd3dDevice)->SetTextureStageState( 0, D3DTSS_MINFILTER, D3DTEXF_POINT );
    (*m_pd3dDevice)->SetTextureStageState( 0, D3DTSS_MAGFILTER, D3DTEXF_POINT );

    //*******************************************************
    // Choose which color operations to use
    //*******************************************************
    (*m_pd3dDevice)->SetTextureStageState( 0, D3DTSS_COLOROP,   m_nColorOp );
    (*m_pd3dDevice)->SetTextureStageState( 0, D3DTSS_COLORARG1, D3DTA_TEXTURE );
    (*m_pd3dDevice)->SetTextureStageState( 0, D3DTSS_COLORARG2, D3DTA_DIFFUSE );

    //*******************************************************
    // Choose which alpha operations to use
    //*******************************************************
    (*m_pd3dDevice)->SetTextureStageState( 0, D3DTSS_ALPHAOP,   m_nAlphaOp );
    (*m_pd3dDevice)->SetTextureStageState( 0, D3DTSS_ALPHAARG1, D3DTA_TEXTURE );
    (*m_pd3dDevice)->SetTextureStageState( 0, D3DTSS_ALPHAARG2, D3DTA_DIFFUSE );

    //*******************************************************
    // Draw the thing
    //*******************************************************
    (*m_pd3dDevice)->SetVertexShader( D3DFVF_TLVERTEX );
    (*m_pd3dDevice)->SetTexture( 0, m_tManager->getTexture( m_strTextures[m_nCurTexture] ) );
    (*m_pd3dDevice)->DrawPrimitiveUP( D3DPT_TRIANGLESTRIP, 2, m_Vertices, sizeof(TLVertex2) );
}

I'm not doing anything else, really. Loading uses several threads, but past that the app runs on a single thread. No multisampling or anti-aliasing is enabled. The device type is D3DDEVTYPE_HAL, and the device was created with D3DCREATE_HARDWARE_VERTEXPROCESSING and checks out as a pure device. Resources are created with D3DPOOL_DEFAULT, and all the textures loaded come to about 2.6 MB. Is there a way to find out what priority the app is running at, and maybe let DirectX know the app doesn't need 100% of the CPU?

Are you sure you are fill-rate bound? When you decrease the resolution, does the performance increase or not?
And why are you using DrawPrimitiveUP() and not DrawPrimitive() with a vertex buffer?
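
For a sense of scale on the fill-rate question, here is a rough back-of-the-envelope sketch. It assumes every draw covers the full 800x600 area, as described in the original post; the function and parameter names are hypothetical and only serve the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Pixels written per second when every layer covers the whole render area.
// All inputs are illustrative; substitute your own measurements.
std::uint64_t pixelsPerSecond(std::uint32_t width,
                              std::uint32_t height,
                              std::uint32_t layersPerFrame,
                              std::uint32_t framesPerSecond)
{
    assert(width > 0 && height > 0);
    // Widen before multiplying so large products cannot overflow 32 bits.
    return static_cast<std::uint64_t>(width) * height
           * layersPerFrame * framesPerSecond;
}
```

With the numbers from the first post (800x600, 5 layers, 100 fps) this comes to 240 million pixels per second, and to 1.44 billion at the 600 fps reported later in the thread. Because the render call enables alpha blending, each of those pixels typically costs a framebuffer read as well as a write.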

quote:
Original post by mohamed adel
Are you sure you are fill-rate bound? When you decrease the resolution, does the performance increase or not?
And why are you using DrawPrimitiveUP() and not DrawPrimitive() with a vertex buffer?


I am not sure how that relates to other programs being bogged down. My app runs as smoothly as could be, getting anywhere from 100 fps on my GeForce 2 to 600 fps on a GeForce 4. As you point out, though, I could get MORE performance out of the renderer, but I just don't need it...

Winamp, Internet Explorer, etc. chug majorly as it is...


I'm pretty sure it's because of the crappy way Windows shares the CPU among tasks. Because your app needs a fair bit of CPU power, and maybe bandwidth, it still runs fast but doesn't leave the other apps enough CPU time to operate correctly.

A few guesses (the code above doesn't seem to have any obvious issues - although there isn't enough detail to do any more than guess):


1) Your message pump/loop is of a "greedy" design actually _intended_ to take as much available CPU time as possible. An example would be polling with the non-blocking PeekMessage(..) (with PM_REMOVE) even when you want to play nicely with other processes. A call to the blocking GetMessage(..), for example, wouldn't be as greedy but would mean less CPU time dedicated to your process.


2) A driver or setting of your system is at fault. I've seen similar things with a certain combination of sound card, graphics card and motherboard driver - each was trying to claw back as many resources from the system as it could to try and improve perceived performance and do better on benchmark scores.


3) Related to the above, if you're sending massive textures across the AGP bus every frame (for example), then that could be a big drain on bus clock cycles (particularly on boards sharing PCI).


4) Have you been playing with any process or thread priorities? Or have you allocated tons of system memory elsewhere in your app?


Simon O'Connor
Game Programmer &
Microsoft DirectX MVP

I was trying to find out whether you are fill-rate bound (which is the ideal case in a small demo with no sound, AI, or similar) or bound by something else (processor, bus, etc.). It seems that you are fill-rate bound.
You can decrease the load on the CPU by using DrawPrimitive() instead of DrawPrimitiveUP(), but I think something other than the rendering loop is taking all this CPU time. I can play Winamp while running some heavy games (of course the performance decreases slightly, but nowhere near as much as in your case).
What about trying another driver, as Simon suggested?

1) Your message pump/loop is of a "greedy" design actually _intended_ to take as much available CPU time as possible

I am not sure about this. I wouldn't have suspected it, but you may be right. I use PeekMessage, but only when the app is active. It has a pretty typical message pump (see below).

Ideally it would be nice to let Windows know that I only need X% of the available resources. A greedy pump would explain overall chugging, but in general the computer only chugs when I render large portions of the screen per frame. For example, I call my sprite's render function with an 800x600 texture 6-7 times per frame.

Why would Windows chug if the graphics card is put to work?

I am pushing 4 vertices per sprite per render. The chugging occurs with as few as 8 sprites... Every frame, this comes out to 32 vertices and 8 DrawPrimitiveUP calls...



while( WM_QUIT != msg.message )
{
    if( m_gActive )
        bGotMessage = PeekMessage( &msg, NULL, 0U, 0U, PM_REMOVE );
    else
        bGotMessage = GetMessage( &msg, NULL, 0U, 0U );

    if( bGotMessage )
    {
        TranslateMessage( &msg );
        DispatchMessage( &msg );
    }
    else
    {
        if( m_gActive )
        {
            m_pEngine->heartBeat();
            m_pEngine->render();
        }
    }
}


2) A driver or setting of your system is at fault.

I tested on a few machines; the ironic part is that the faster computers chugged as well. The app just hogged that many more resources.

3) Related to the above, if you're sending massive textures across the AGP bus every frame (for example), then that could be a big drain on bus clock cycles (particularly on boards sharing PCI).

I only have about 2 MB worth of textures. I could get the bug to occur with a single 20 KB JPEG at 800x600 pixels by rendering it 10 times every frame, though.



Hmmmmm, I am sure this matters, but my app is running in windowed mode. I just tested the dolphin tweening example, resized its window to about 800 by 600 (my screen size), and sure enough their example makes Winamp chug as well...

Anyway, I appreciate your prompt responses!! Very helpful in learning how to deal with these things.

Much to learn I still have!
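
On the 20 KB JPEG: file size on disk says little about how much memory the decoded texture occupies. The following is a rough sketch only. It assumes the loader pads the image up to power-of-two dimensions (the render call scales its UVs by m_vOrigSize / desc.Width, which hints at this, but the thread does not confirm it) and stores 4 bytes per pixel (also an assumption). Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= n. Valid for n in [1, 2^31].
std::uint32_t nextPowerOfTwo(std::uint32_t n)
{
    assert(n >= 1 && n <= 0x80000000u);
    std::uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

// Estimated bytes for the top mip level of an uncompressed texture whose
// dimensions are rounded up to powers of two (an assumption, see above).
std::uint64_t paddedTextureBytes(std::uint32_t imageWidth,
                                 std::uint32_t imageHeight,
                                 std::uint32_t bytesPerPixel)
{
    return static_cast<std::uint64_t>(nextPowerOfTwo(imageWidth))
           * nextPowerOfTwo(imageHeight) * bytesPerPixel;
}
```

Under those assumptions an 800x600 image lands on a 1024x1024 surface of about 4 MiB, however small the JPEG was on disk.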





It seems like this is a pretty common issue.

Windowed-mode games behave far differently with the message pump than I expected. I got my app down to a respectable level, but if a frame ever takes longer than 20 ms, it will just run at full speed...


This seems like a hack but it works...


I would still appreciate any further advice anyone has to offer...



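// Ask for 1 ms timer resolution so Sleep() wakes up close to on time.
UINT uiResult = timeBeginPeriod( 1 );
while( 1 )
{
    DWORD tStart = timeGetTime();

    // Handles at most one pending message per iteration.
    if( PeekMessage( &msg, NULL, 0, 0, PM_REMOVE ) )
    {
        if( msg.message == WM_QUIT )
        {
            break;
        }

        TranslateMessage( &msg );
        DispatchMessage( &msg );
    }

    if( m_bActive && m_bReady )
    {
        if( FAILED( m_pEngine->heartBeat() ) )
            SendMessage( m_hWnd, WM_CLOSE, 0, 0 );
    }

    // Sleep away whatever remains of the 20 ms frame budget.
    DWORD tElapsed = timeGetTime() - tStart;
    if( tElapsed < 20 )
    {
        Sleep( 20 - tElapsed );
    }
}

// Must match the value passed to timeBeginPeriod().
uiResult = timeEndPeriod( 1 );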




[edited by - Rhapsodus on June 2, 2004 3:24:52 AM]
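
The pacing logic in the loop above boils down to "sleep for whatever is left of a 20 ms frame". Here is a minimal sketch of that calculation as a standalone helper. It makes one change that is an assumption and not from the post: it yields for a small minimum time even when the frame overran, so the loop never falls back to running flat out. sleepBudgetMs and its parameters are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Milliseconds to sleep after a frame that took elapsedMs, aiming for
// targetFrameMs per frame but never yielding for less than minYieldMs.
std::uint32_t sleepBudgetMs(std::uint32_t elapsedMs,
                            std::uint32_t targetFrameMs,
                            std::uint32_t minYieldMs)
{
    assert(minYieldMs <= targetFrameMs);
    if (elapsedMs >= targetFrameMs)
        return minYieldMs;  // frame overran: still give other processes a slice
    // Unsigned subtraction is safe here because elapsedMs < targetFrameMs.
    return std::max(targetFrameMs - elapsedMs, minYieldMs);
}
```

In the loop above, the result would be passed straight to Sleep() in place of the elapsed-time check, for example with a 20 ms target and a 1 ms minimum yield.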
