Optimising my renderer

Started by
41 comments, last by Hodgman 9 years, 12 months ago
Hi Guys,

For a while now I have been working on my own framework to replace my (unnecessary) reliance on Game Maker: Studio.

Today I decided to see how my engine benchmarks against GM:S. My code is reasonably tight (or so I thought), but GM's renderer still runs rings around mine. GM:S uses DirectX 9c too.

These are the results (in frames per second):


# Sprites (256x256)      Mine      GM:S 1.3

1                        1570      ~1350
10                       706       ~1350
100                      106       ~1100
500                      25        ~620
1000                     13        ~450
My renderer is faster when displaying one sprite, but then drops off quite rapidly.

I am using the ID3DXSprite interface to create and render the sprites. This is my entire render code. The sprites are all stored in a vector and use the same image (for testing purposes).


void Renderer::renderSpriteQueue()
{
	SpriteSortByDepth();

	pSprite->Begin(D3DXSPRITE_ALPHABLEND);

	for(std::vector<Sprite>::iterator it=vSprite.begin();it!=vSprite.end();++it)
	{
		// source rectangle covering the whole sprite image
		RECT rectSpriteTextureArea;
		rectSpriteTextureArea.top=0;
		rectSpriteTextureArea.bottom=it->nSizeY;
		rectSpriteTextureArea.left=0;
		rectSpriteTextureArea.right=it->nSizeX;

		// origin at the top-left corner, drawn at the sprite's position
		D3DXVECTOR3 v3Center(0,0,0);
		D3DXVECTOR3 v3Position(it->fPosX,it->fPosY,0);

		if(FAILED(pSprite->Draw(pTexture,&rectSpriteTextureArea,&v3Center,&v3Position,0xFFFFFFFF)))
			MessageBox(NULL,"Error","Error",MB_OK);
	}
	pSprite->Flush();
	pSprite->End();
}
Am I doing something inefficiently here? Would it be faster to just use a textured quad instead?

Any advice would be awesome :)
What's the speed without the sort?
Identical. I tried that out earlier. :)
Am I doing something inefficiently here? Would it be faster to just use a textured quad instead?

Probably, but at only 1000 sprites it's quite surprising to see such a huge drop in performance. Do the sprites cover the same amount of screen space in both tests?

Your test seems to scale pretty linearly over the number of sprites, which indicates that the problem is either in setup per sprite, or in fillrate.

If the sprites completely cover each other, perhaps GM optimizes away the ones behind. Try something like 2x2 sprites instead of 256x256 to check whether fillrate is the bottleneck.
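
As a rough check on the "scales pretty linearly" observation, the FPS readings can be converted to frame times and divided by the sprite count. This is only a sketch: the helper names are made up for illustration, the figures are the "Mine" column from the table above, and fixed per-frame overhead is ignored.

```cpp
#include <cstdio>
#include <vector>

// Frame time in milliseconds for a given frames-per-second reading.
double frameTimeMs(double fps) { return 1000.0 / fps; }

// Average cost per sprite in milliseconds (fixed per-frame overhead ignored).
double msPerSprite(int sprites, double fps) { return frameTimeMs(fps) / sprites; }

struct Sample { int sprites; double fps; };

// The "Mine" column of the benchmark table, for the larger sprite counts.
std::vector<Sample> benchmarkSamples()
{
    return { {100, 106.0}, {500, 25.0}, {1000, 13.0} };
}

// Prints frame time and per-sprite cost for each sample.
void printPerSpriteCost()
{
    for (const Sample& s : benchmarkSamples())
        std::printf("%4d sprites: %.1f ms/frame, %.3f ms/sprite\n",
                    s.sprites, frameTimeMs(s.fps), msPerSprite(s.sprites, s.fps));
}
```

If the per-sprite figure stays roughly constant as the count grows, the cost is dominated by something paid once per sprite (setup or pixels filled) rather than by a fixed per-frame cost.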

Yeah, the scenes are set up identically, so screen coverage is the same.

I am also in the process of trying with textured quads, but I am having trouble applying a texture to a single triangle (I haven't used textured triangles before, and I have another topic in this forum for that issue). I can render the triangle OK, but can't apply a texture (or don't properly know how to :) ).

A triangle is half of a quad; texturing it is just a matter of computing/assigning the correct texture coordinates. If you visualize a quad as being composed of 2 triangles, that should go a long way towards figuring out how to assign the correct texture coordinates.
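
To make that concrete, here is a minimal sketch of a quad laid out as a four-vertex triangle strip with 0/1 texture coordinates. The `QuadVertex` struct and `makeQuad` helper are hypothetical names for illustration; the field layout simply mirrors a pre-transformed position, a diffuse colour and one set of UVs.

```cpp
#include <array>
#include <cstdint>

// Hypothetical vertex: screen-space position, ARGB colour, one UV pair.
struct QuadVertex {
    float x, y, z, rhw;
    std::uint32_t color;
    float u, v;
};

// Returns four vertices in strip order: top-left, top-right,
// bottom-left, bottom-right. The two triangles share the
// top-right/bottom-left diagonal.
std::array<QuadVertex, 4> makeQuad(float x, float y, float w, float h,
                                   std::uint32_t color = 0xFFFFFFFFu)
{
    return {{
        { x,     y,     0.0f, 1.0f, color, 0.0f, 0.0f },  // top-left
        { x + w, y,     0.0f, 1.0f, color, 1.0f, 0.0f },  // top-right
        { x,     y + h, 0.0f, 1.0f, color, 0.0f, 1.0f },  // bottom-left
        { x + w, y + h, 0.0f, 1.0f, color, 1.0f, 1.0f },  // bottom-right
    }};
}
```

Each corner's (u, v) is just its position within the quad: 0 on the left/top edge and 1 on the right/bottom edge.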

Yeah, I can visualise how the UVs should be, as it would be a simple 0 & 1 thing.

This is what I have, but I am just getting a white triangle (instead of a triangle with a PNG on it):


LPDIRECT3DVERTEXBUFFER9 pVertexObject = NULL;
void *pVertexBuffer = NULL; 

struct D3DVERTEX{
	float x,y,z,rhw;
	DWORD color;
	float u;
	float v;
} vertices[3];

vertices[0].x = 50; 
vertices[0].y = 50; 
vertices[0].z = 0; 
vertices[0].rhw = 1.0f; 
vertices[0].color = 0xffffff;
vertices[0].u=0.0;
vertices[0].v=0.0;

vertices[1].x = 250; 
vertices[1].y = 50; 
vertices[1].z = 0; 
vertices[1].rhw = 1.0f; 
vertices[1].color = 0xffffff; 
vertices[1].u=1.0;
vertices[1].v=0.0;

vertices[2].x = 50; 
vertices[2].y = 250; 
vertices[2].z = 0; 
vertices[2].rhw = 1.0f;
vertices[2].color = 0xffffff;
vertices[2].u=0.0;
vertices[2].v=1.0;

if(FAILED(mRenderer->getDevice()->CreateVertexBuffer(3*sizeof(D3DVERTEX),0,D3DFVF_XYZRHW|D3DFVF_DIFFUSE|D3DFVF_TEX1,D3DPOOL_DEFAULT,&pVertexObject,NULL)))
	return(0);
 
if(FAILED(pVertexObject->Lock(0,3*sizeof(D3DVERTEX),&pVertexBuffer,0)))
	return(0);
memcpy(pVertexBuffer, vertices, 3*sizeof(D3DVERTEX));
pVertexObject->Unlock();

// do the actual render
mRenderer->getDevice()->SetStreamSource(0,pVertexObject,0,sizeof(D3DVERTEX));
mRenderer->getDevice()->SetFVF(D3DFVF_XYZRHW|D3DFVF_DIFFUSE);
mRenderer->getDevice()->DrawPrimitive(D3DPT_TRIANGLELIST,0,1);

mRenderer->getDevice()->SetTexture(0,mRenderer->pTexture);
Yes, doing this all in the draw call is nasty. I will clean this up once I get it texturing properly.

[edit]
Found out what was happening there: the FVF passed to SetFVF was missing the texture-coordinate flag.

The third-last line should be mRenderer->getDevice()->SetFVF(D3DFVF_XYZRHW|D3DFVF_DIFFUSE|D3DFVF_TEX1);
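
One way to sanity-check this kind of mismatch is to compare the struct against what the flags describe: XYZRHW is four floats, DIFFUSE is one 32-bit colour, and TEX1 declares one set of 2D texture coordinates (TEX0 declares none). The sketch below uses a hypothetical `TexturedVertex` struct with the same fields as the one above and assumes 4-byte floats with no padding.

```cpp
#include <cstddef>
#include <cstdint>

// Same fields as the D3DVERTEX struct in the post, without the D3D headers.
struct TexturedVertex {
    float x, y, z, rhw;   // position (XYZRHW): four floats
    std::uint32_t color;  // diffuse colour: one 32-bit value
    float u, v;           // one set of 2D texture coordinates (TEX1)
};

// Size implied by XYZRHW | DIFFUSE | TEX1.
constexpr std::size_t expectedStride()
{
    return 4 * sizeof(float) + sizeof(std::uint32_t) + 2 * sizeof(float);
}

// Size implied by XYZRHW | DIFFUSE alone (no texture coordinates).
constexpr std::size_t strideWithoutTexCoords()
{
    return 4 * sizeof(float) + sizeof(std::uint32_t);
}

// Fails to compile if the struct and the described layout ever drift apart.
static_assert(sizeof(TexturedVertex) == expectedStride(),
              "vertex struct does not match XYZRHW|DIFFUSE|TEX1");
static_assert(offsetof(TexturedVertex, u) == strideWithoutTexCoords(),
              "texture coordinates should follow position and colour");
```

With the original flags the described vertex is 8 bytes shorter than the struct, which is a quick hint that the u/v fields are not being declared to the device.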
OK, more results :)

I have now tested with a textured quad and here are the results:

Rendered Sprites (256x256)      ID3DXSprite      Quad      GM:S 1.3

0                               1740             1740      ~1400
1                               1570             1740      ~1350
10                              706              1209      ~1350
100                             106              297       ~1100
500                             25               68        ~620
1000                            13               35        ~450
So, the results are much better (~double) when using a 'Quad', but they are still far below GM:S.

Under heavy load GM:S is still ~10x quicker. How can that be?

Am I missing something here?
I have stripped my render phase right down, so this is all that is happening:

// render 1000 objects
for(int i=0;i<1000;i++)
{
	mRenderer->getDevice()->DrawPrimitive(D3DPT_TRIANGLESTRIP,0,2);
}

// display framerate data
lps=mRenderer->framerateGetReal();
lps=lps-1;
if(lps<0)
	lps=0;

itoa(lps,szBuffer,10);
strcpy(szBuffer2,"Frame Rate: ");
strcat(szBuffer2,szBuffer);
strcat(szBuffer2," FPS");
mRenderer->renderDebugText(600,14,szBuffer2);
I guess it is possible that the way I am rendering the frame counter might be a bottleneck (it uses ID3DXFont). I might strip that out and see what I can gain.

Interesting stuff :)
Hmmm, still only 35 FPS if my entire render cycle is just this:


for(int i=0;i<1000;i++)
        mRenderer->getDevice()->DrawPrimitive(D3DPT_TRIANGLESTRIP,0,2);

So, I must be missing some magic somewhere?

How can GM:S be faster than two lines of render code?
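
One common explanation for a gap like this is per-draw-call overhead: 1000 separate DrawPrimitive calls cost far more CPU time than one call that draws 2000 triangles. Whether GM:S actually batches this way is an assumption, not something established above. As a rough sketch, sprites sharing a texture could be packed into one triangle-list array; `BatchVertex`, `SpriteRect`, `buildBatch` and `primitiveCount` are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex matching an XYZRHW | DIFFUSE | TEX1 layout.
struct BatchVertex {
    float x, y, z, rhw;
    std::uint32_t color;
    float u, v;
};

// Hypothetical sprite description: top-left corner plus size in pixels.
struct SpriteRect { float x, y, w, h; };

// Appends two triangles (six vertices) per sprite, in triangle-list order.
std::vector<BatchVertex> buildBatch(const std::vector<SpriteRect>& sprites,
                                    std::uint32_t color = 0xFFFFFFFFu)
{
    std::vector<BatchVertex> out;
    out.reserve(sprites.size() * 6);
    for (const SpriteRect& s : sprites) {
        const BatchVertex tl{ s.x,       s.y,       0.0f, 1.0f, color, 0.0f, 0.0f };
        const BatchVertex tr{ s.x + s.w, s.y,       0.0f, 1.0f, color, 1.0f, 0.0f };
        const BatchVertex bl{ s.x,       s.y + s.h, 0.0f, 1.0f, color, 0.0f, 1.0f };
        const BatchVertex br{ s.x + s.w, s.y + s.h, 0.0f, 1.0f, color, 1.0f, 1.0f };
        out.insert(out.end(), { tl, tr, bl, tr, br, bl });
    }
    return out;
}

// Number of triangles to pass as the primitive count for the whole batch.
std::size_t primitiveCount(std::size_t spriteCount) { return spriteCount * 2; }
```

The resulting array would be copied into a single vertex buffer once per frame and drawn with one DrawPrimitive(D3DPT_TRIANGLELIST, 0, primitiveCount) call instead of one call per sprite.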

This topic is closed to new replies.