Performance issues with static buffers

11 comments, last by Thomas Mathers 18 years, 9 months ago
I seem to be having trouble rendering my static buffers with optimal performance. I create my static buffers with the write-only flag and lock them only once, to store the geometry passed in during program initialization. Here is the code that I use to transfer geometry to a static buffer.

HRESULT CFNGED3D::CreateBuffer(DWORD dwFormat, int nStride, int nVertices, void* pVertices, int nIndices, void* pIndices, int nMaterialID, int* pBufferID)
{
	// First check to see if the passed in material id is valid.
	if (nMaterialID < 0 || nMaterialID >= (int)m_materials.size())
	{
		return E_INVALIDARG;
	}

	FNGE_D3D_BUFFER tempBuffer;

	tempBuffer.m_dwFormat    = dwFormat;
	tempBuffer.m_nStride     = nStride;
	tempBuffer.m_nVertices   = nVertices;
	tempBuffer.m_nIndices    = nIndices;
	tempBuffer.m_nMaterialID = nMaterialID;

	void*			pData;

	// Now create the vertex buffer based on the stride and number of vertices specified.
	if (FAILED(m_pD3DDevice->CreateVertexBuffer(tempBuffer.m_nVertices * tempBuffer.m_nStride,
                                                      D3DUSAGE_WRITEONLY,
                                                      tempBuffer.m_dwFormat,
                                                      D3DPOOL_DEFAULT,
                                                      &tempBuffer.m_pVB,
                                                      NULL)))
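	{
		// Bail out instead of locking a buffer that was never created.
		return E_FAIL;
	}

	if (FAILED(tempBuffer.m_pVB->Lock(0, 0, (void**)&pData, 0)))
	{
		// Release the buffer so a failed lock does not leak it.
		tempBuffer.m_pVB->Release();
		return E_FAIL;
	}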

	memcpy(pData, pVertices, tempBuffer.m_nVertices * tempBuffer.m_nStride);

	tempBuffer.m_pVB->Unlock();

	if (tempBuffer.m_nIndices > 0)
	{
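		if (FAILED(m_pD3DDevice->CreateIndexBuffer(tempBuffer.m_nIndices * sizeof(WORD),
                                                          D3DUSAGE_WRITEONLY,
                                                          D3DFMT_INDEX16,
                                                          D3DPOOL_DEFAULT,
                                                          &tempBuffer.m_pIB,
                                                          NULL)))
		{
			// Release the vertex buffer so a failed call does not leak it.
			tempBuffer.m_pVB->Release();
			return E_FAIL;
		}

		if (FAILED(tempBuffer.m_pIB->Lock(0, 0, (void**)&pData, 0)))
		{
			tempBuffer.m_pIB->Release();
			tempBuffer.m_pVB->Release();
			return E_FAIL;
		}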
		memcpy(pData, pIndices, tempBuffer.m_nIndices * sizeof(WORD));
		
		tempBuffer.m_pIB->Unlock();
	}

	(*pBufferID) = m_staticBuffers.size();

	m_staticBuffers.push_back(tempBuffer);

	m_Log << "A new static buffer has been created and added to the engine.";
	return S_OK;
}
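
A side note on the range check at the top of CreateBuffer: it compares the ID against m_materials.size(), and for any container the valid indices stop at size() - 1, so an ID equal to the size has to be rejected as well. Below is a minimal sketch of such a check. IsValidIndex and MaterialAt are hypothetical helpers, not part of the original engine, and the materials are stood in for by plain ints.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the original engine): returns true only
// when id can safely be used as an index into a container of `count` items.
inline bool IsValidIndex(int id, std::size_t count)
{
    // Valid indices run from 0 to count - 1, so id == count is rejected.
    return id >= 0 && static_cast<std::size_t>(id) < count;
}

// Hypothetical accessor: checks the index in debug builds before reading.
inline int MaterialAt(const std::vector<int>& materials, int id)
{
    assert(IsValidIndex(id, materials.size()));
    return materials[static_cast<std::size_t>(id)];
}
```

The same check would apply to the buffer ID test at the top of Render.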


Here is the code I use to render them:

void CFNGED3D::Render(int nBufferID)
{
	if (nBufferID < 0 || nBufferID >= (int)m_staticBuffers.size())
		return;

	// Determine whether or not this computer supports vertex shaders.
	if (m_deviceCaps.m_isShaderSupport)
	{
		m_pD3DDevice->SetFVF(NULL);
	}
	else
	{
		m_pD3DDevice->SetFVF(m_staticBuffers[nBufferID].m_dwFormat);
	}

	//D3DMATERIAL9 mat;

	//mat.Diffuse.a  = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Diffuse.A;
	//mat.Diffuse.r  = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Diffuse.R;
	//mat.Diffuse.g  = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Diffuse.G;
	//mat.Diffuse.b  = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Diffuse.B;
	//mat.Ambient.a  = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Ambient.A;
	//mat.Ambient.r  = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Ambient.R;
	//mat.Ambient.g  = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Ambient.G;
	//mat.Ambient.b  = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Ambient.B;
	//mat.Specular.a = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Specular.A;
	//mat.Specular.r = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Specular.R;
	//mat.Specular.g = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Specular.G;
	//mat.Specular.b = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Specular.B;
	//mat.Emissive.a = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Emissive.A;
	//mat.Emissive.r = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Emissive.R;
	//mat.Emissive.g = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Emissive.G;
	//mat.Emissive.b = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Emissive.B;
	//mat.Power      = m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_Power;

	//m_pD3DDevice->SetMaterial(&mat);	
	//
	//// Assign each texture assigned to this material to each respective stage.
	//for (int i = 0; i < 5; i++)
	//{
	//	if (m_materials[nBufferID].m_textureID > -1)
	//	{
	//		LPDIRECT3DTEXTURE9 pTexture = (LPDIRECT3DTEXTURE9)m_textures[m_materials[m_staticBuffers[nBufferID].m_nMaterialID].m_textureID].m_pData;

	//		m_pD3DDevice->SetTexture(i, pTexture);
	//	}
	//}
	m_pD3DDevice->SetStreamSource(0, m_staticBuffers[nBufferID].m_pVB, 0, m_staticBuffers[nBufferID].m_nStride);

	if (m_staticBuffers[nBufferID].m_nIndices > 0)
	{
		m_pD3DDevice->SetIndices(m_staticBuffers[nBufferID].m_pIB);
		// For an indexed triangle list the primitive count is the index count / 3.
		m_pD3DDevice->DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0, m_staticBuffers[nBufferID].m_nVertices, 0, m_staticBuffers[nBufferID].m_nIndices / 3);
	}
	else 
	{
		m_pD3DDevice->DrawPrimitive(D3DPT_TRIANGLELIST, 0, m_staticBuffers[nBufferID].m_nVertices / 3);
	} 
}
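
For a triangle list, the primitive count passed to a draw call is a number of triangles: the number of indices divided by 3 for DrawIndexedPrimitive, and the number of vertices divided by 3 for the non-indexed DrawPrimitive path. Here is a small sketch of that arithmetic. MeshCounts and TriangleListPrimitiveCount are hypothetical names introduced only for illustration and are not part of the original code.

```cpp
#include <cassert>

// Hypothetical stand-in for the counts stored per static buffer.
struct MeshCounts
{
    unsigned vertexCount;
    unsigned indexCount; // 0 means the mesh is drawn without an index buffer
};

// Number of triangles to pass as the primitive count for a triangle list.
inline unsigned TriangleListPrimitiveCount(const MeshCounts& mesh)
{
    if (mesh.indexCount > 0)
    {
        // Indexed draw: every three indices form one triangle.
        assert(mesh.indexCount % 3 == 0);
        return mesh.indexCount / 3;
    }
    // Non-indexed draw: every three vertices form one triangle.
    return mesh.vertexCount / 3;
}
```

For a quad stored as 4 vertices and 6 indices this gives 2 triangles, whereas dividing the vertex count by 3 would give only 1.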


With the code I am currently using to manage static buffers, I can only render 12k vertices at a frame rate of 20 fps. I know that frames per second is a poor measure of program performance because it scales non-linearly. However, I have seen many programs that render 100x the number of vertices I am rendering and still get a better frame rate. Does anybody know why I might be seeing this massive performance drop?
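
The point about frames per second being non-linear is easier to see when the numbers are converted to time per frame. A minimal sketch of that conversion follows; FrameTimeMs is a hypothetical helper name, not something from the engine above.

```cpp
#include <cassert>

// Converts a frame rate in frames per second to milliseconds per frame.
inline double FrameTimeMs(double fps)
{
    assert(fps > 0.0); // a frame rate of zero has no finite frame time
    return 1000.0 / fps;
}
```

At 20 fps each frame takes 50 ms. Going from 20 to 40 fps saves 25 ms per frame, while going from 40 to 60 fps saves only about 8.3 ms, which is why comparing frame times is usually more informative than comparing fps figures.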
If you change D3DPOOL_DEFAULT to D3DPOOL_MANAGED and remove the D3DUSAGE_WRITEONLY flag, do you get an increase in performance?

Also, are you sure you're not calling CreateBuffer every frame?

Also, 12k triangles isn't a trivial amount. Unless you have a high-end card, you may need other techniques to increase performance. Which card is this on?
Sirob Yes.» - status: Work-O-Rama.
No, I do not see a performance increase when I change the pool to managed and remove the write-only flag. As a matter of fact, I see a performance decrease when I do this.

Also, I am running this on a fairly modern video card. It's an ATI Radeon 9600XT with 128 MB DDR.
How many times do you draw it per second?
(And how much space does it take on screen?)


Also, what happens if you only draw one polygon? (just the first one from the buffer)
Sirob Yes.» - status: Work-O-Rama.
I am drawing it only once per frame, so if I am getting ~20 frames per second, then I am drawing it 20 times per second. When I draw only 1 triangle I get roughly 3000-5000 fps.
Back to my previous question: how large is the object that you draw, and what resolution are you running at? It is possible that you are limited by your fill rate.
Screenshot perhaps?


The object I am drawing is not that big. It fits exactly within the confines of the screen. Also, the resolution I am running at is 640x480x16.
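
To put the fill-rate question in numbers, the pixels written per second can be estimated from the resolution, the frame rate, and an overdraw factor. This is only a rough sketch: PixelsPerSecond is a hypothetical helper, and the overdraw factor (how many times each screen pixel is written per frame, on average) is an assumption that has to be guessed or measured for the actual scene.

```cpp
#include <cassert>

// Rough estimate of pixels written per second:
// width * height * average overdraw * frames per second.
inline double PixelsPerSecond(int width, int height, double overdraw, double fps)
{
    assert(width > 0 && height > 0 && overdraw >= 0.0 && fps >= 0.0);
    return static_cast<double>(width) * height * overdraw * fps;
}
```

For an object that fills a 640x480 screen once (overdraw of 1.0) at 20 fps, this comes to 6,144,000 pixels per second, a figure that can be compared against the fill rate quoted for the card.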
That sounds kinda strange, I must say. I really don't know about this, but could it be that the card dislikes drawing 12k tris in one batch? I've heard some talk about that... could be bullshit too.

Otherwise it sounds really strange, unless you have some costly overhead such as lighting.


Yes, are you using an unusual number of texture stages, lights, or any costly effect that might be slowing you down?

If you make the model smaller (scale it down), do you gain much fps?
Sirob Yes.» - status: Work-O-Rama.
