Help with improving drawing in DX9

Started by
5 comments, last by kauna 12 years, 4 months ago
We're working on a 2D game and we use DirectX 9.0c for graphics. We've discovered that our drawing functions take far too many resources and slow the game down a lot. With just 15 tiles on the map, the FPS drops to around 30. We aren't sure how we are supposed to do the drawing in a good way. We have a solution that works, but it feels like it's not the ideal one.

We have a global graphics class that each object can reach and use in its draw function. It looks like this:


class Graphics
{
public:
// Draws a texture
void drawTexture(IDirect3DTexture9 *texture, float x, float y, int width, int height, float rotation = 0);
private:

// Buffer for vertices of the type TextureVertex
IDirect3DVertexBuffer9* mVB_texture;
};



Graphics::drawTexture(...) looks like this:

void Graphics::drawTexture(IDirect3DTexture9 *texture, float x, float y, int width, int height, float rotation)
{
// Texture coordinates range from 0 to 1; in the X direction, for example, 0 is the left edge and 1 is the right edge
Rect drawRect;
drawRect.left = x-(width/2);
drawRect.right = x+(width/2);
drawRect.top = y-(height/2);
drawRect.bottom = y+(height/2);

gd3dDevice->SetVertexDeclaration(TextureVertex::Decl);
gd3dDevice->SetStreamSource(0, mVB_texture, 0, sizeof(TextureVertex));

TextureVertex *vertices = 0;
mVB_texture->Lock(0, 0, (void**)&vertices, 0);

// Setup vertices
vertices[0].pos.x = (float) drawRect.left;
vertices[0].pos.y = (float) drawRect.top;
vertices[0].pos.z = 0;
vertices[0].tex0.x = 0.0f;
vertices[0].tex0.y = 0.0f;

vertices[1].pos.x = (float) drawRect.right;
vertices[1].pos.y = (float) drawRect.top;
vertices[1].pos.z = 0;
vertices[1].tex0.x = 1.0f;
vertices[1].tex0.y = 0.0f;

vertices[2].pos.x = (float) drawRect.right;
vertices[2].pos.y = (float) drawRect.bottom;
vertices[2].pos.z = 0;
vertices[2].tex0.x = 1.0f;
vertices[2].tex0.y = 1.0f;

vertices[3].pos.x = (float) drawRect.left;
vertices[3].pos.y = (float) drawRect.bottom;
vertices[3].pos.z = 0;
vertices[3].tex0.x = 0.0f;
vertices[3].tex0.y = 1.0f;

// Unlock the vertex buffer
mVB_texture->Unlock();

// Set texture
gd3dDevice->SetTexture (0, texture);

// Draw content in buffer
gd3dDevice->DrawPrimitive(D3DPT_TRIANGLEFAN, 0, 2);
gd3dDevice->SetTexture (0, NULL);
}



An example of using drawTexture(...):


void Player::draw()
{
gGraphics->drawTexture(getTexture(), getX(), getY(), getWidth(), getHeight());
}



So we basically use the same vertex buffer for every object in the game and just change its vertex attributes. It seems like all the calls to Lock(...) and Unlock() are one of the reasons for the slowdown.

We've thought about giving each object its own vertex buffer and only changing it when the object moves, scales or rotates. Is that a viable way to do it?

What's your opinion about our solution? What can we change and how would you write an efficient drawing algorithm?

Cheers :)
You don't need to change the vertices when moving, rotating or scaling the sprite, just alter the sprite's world matrix for that!

Hope this helps, ask if something remains unclear!

EDIT: Oh and are you creating the vertex buffer with D3DUSAGE_DYNAMIC? You should if you end up accessing the vertex buffer a lot during runtime!
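
To illustrate the world-matrix idea, here is a minimal sketch of the math only. It assumes a unit quad centered at the origin that is filled in once and never touched again; Transform2D, makeSpriteTransform and apply are hypothetical helpers, not part of D3D. In a real D3D9 program the equivalent matrix would be handed to the device (for example with SetTransform(D3DTS_WORLD, ...) or as a shader constant) instead of being applied on the CPU.

```cpp
#include <cmath>

struct Vec2 { float x, y; };

// Hypothetical 2D affine transform: p' = (a*x + c*y + tx, b*x + d*y + ty).
struct Transform2D { float a, b, c, d, tx, ty; };

// Builds scale -> rotate -> translate for a unit quad centered at the origin.
// 'rotation' is in radians; (x, y) is the sprite's center.
Transform2D makeSpriteTransform(float x, float y, float width, float height, float rotation)
{
    const float cs = std::cos(rotation);
    const float sn = std::sin(rotation);
    return Transform2D{ cs * width, sn * width, -sn * height, cs * height, x, y };
}

// Applies the transform to a point (what the vertex pipeline would do per vertex).
Vec2 apply(const Transform2D& m, Vec2 p)
{
    return Vec2{ m.a * p.x + m.c * p.y + m.tx, m.b * p.x + m.d * p.y + m.ty };
}

// The quad's corners never change; only the per-sprite transform does.
const Vec2 kUnitQuad[4] = { { -0.5f, -0.5f }, { 0.5f, -0.5f }, { 0.5f, 0.5f }, { -0.5f, 0.5f } };
```

With this split, moving, scaling or rotating a sprite only changes six floats per sprite, and the vertex buffer itself needs no Lock()/Unlock() at all.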
Calling
gd3dDevice->SetTexture (0, texture);

every time you draw something is a huge bottleneck.
If you can draw your scene while binding each texture only a few times, you will get a huge FPS increase.
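
One simple way to cut down on texture binds is to remember what is currently bound and skip the call when nothing changes. The sketch below is only an illustration: TextureBinder and TextureHandle are hypothetical names, and the place where the real gd3dDevice->SetTexture(0, texture) call would go is marked in a comment.

```cpp
// Hypothetical stand-in for IDirect3DTexture9*; any pointer-like handle works.
using TextureHandle = const void*;

// Remembers the last texture that was bound and skips redundant binds.
class TextureBinder
{
public:
    // Returns true when a real bind is needed; that is where the actual
    // device SetTexture call would be issued.
    bool bind(TextureHandle texture)
    {
        if (texture == mCurrent)
            return false;          // already bound, no API call needed
        mCurrent = texture;
        ++mBindCount;
        return true;
    }

    // Number of binds that were actually issued.
    int bindCount() const { return mBindCount; }

private:
    TextureHandle mCurrent = nullptr;
    int mBindCount = 0;
};
```

A cache like this only pays off when consecutive draws share a texture, so it works best when objects are drawn sorted by texture. Note that resetting the texture to NULL after every draw, as drawTexture() does, would defeat it.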


You don't need to change the vertices when moving, rotating or scaling the sprite, just alter the sprite's world matrix for that!


Oh right. But will that work when the drawing calls are between LPD3DXSPRITE::Begin() and End()?
The reason for all the draw calls being within Begin() and End() is to get 2D screen coordinates, without mixing with the projection matrix.

Should each object have their own vertex buffer?

I have almost no experience with the D3DXSPRITE interface unfortunately but I'm pretty sure you can't use your 'average' world matrix transformations with it.

I personally prefer to work in 3D using an orthographic projection. You can use z to control the order in which elements are drawn. Your screen space will of course run from (-1, -1) to (1, 1), so if you want to keep using your 2D coordinate system, so to speak, you need to convert coordinates before drawing:


// convert pixel coordinates from range [0,screen_size] to [-1,1]
pixel.x = ( pixel.x / screen_width ) * 2.0f - 1.0f;
// y is flipped: pixel y grows downward, while y in the [-1,1] space grows upward
pixel.y = 1.0f - ( pixel.y / screen_height ) * 2.0f;


With that method, of course, you can transform your sprites as you would in a standard 3D game using the world * view * projection matrix pipeline.
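
As an alternative to converting every coordinate by hand, the projection matrix itself can map pixel coordinates to the [-1,1] range. The sketch below builds such a matrix without any D3D headers; Mat4, orthoOffCenterLH, pixelProjection and transformPoint are hypothetical names, and the element layout is assumed to match the one documented for D3DXMatrixOrthoOffCenterLH (row vectors, v * M). D3D9's half-pixel offset between texels and pixels is ignored here.

```cpp
// Row-major 4x4 matrix used with row vectors (v * M).
struct Mat4 { float m[4][4]; };

// Left-handed off-center orthographic projection.
Mat4 orthoOffCenterLH(float l, float r, float b, float t, float zn, float zf)
{
    Mat4 o = {};
    o.m[0][0] = 2.0f / (r - l);
    o.m[1][1] = 2.0f / (t - b);
    o.m[2][2] = 1.0f / (zf - zn);
    o.m[3][0] = (l + r) / (l - r);
    o.m[3][1] = (t + b) / (b - t);
    o.m[3][2] = zn / (zn - zf);
    o.m[3][3] = 1.0f;
    return o;
}

// Pixel-space projection: (0,0) is the top-left corner and y grows downward,
// which is why 'bottom' is the screen height and 'top' is 0.
Mat4 pixelProjection(float screenWidth, float screenHeight)
{
    return orthoOffCenterLH(0.0f, screenWidth, screenHeight, 0.0f, 0.0f, 1.0f);
}

// Transforms the point (x, y, z, 1) by M and writes the resulting x, y, z.
void transformPoint(const Mat4& M, float x, float y, float z, float out[3])
{
    for (int col = 0; col < 3; ++col)
        out[col] = x * M.m[0][col] + y * M.m[1][col] + z * M.m[2][col] + M.m[3][col];
}
```

With a projection like this, sprites can keep their positions in pixels, and the world matrix only has to place each sprite on screen.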

I am sure, however, that you should not actually suffer performance problems from rebuilding a couple of dynamic (!) vertex buffers every frame the way you describe, so I'm not sure what to suggest, assuming you want to stay within the D3DXSPRITE interface.

The other problem is the lock on the VB. Ideally you would construct your VBs at the end of your loading process and never touch them again. Sending data from main memory (CPU) to the GPU is a slow process and a major bottleneck.

In addition, you might want to create the VB with D3DUSAGE_DYNAMIC, as that at least tells the DX runtime that you're going to change it often.

The other thing is that GPUs like big vertex buffers so they can draw a lot in one go. On a home project I had an HD4850 fall over when I was drawing 500-1000 planes (about 0.5 million polys), each in its own VB. When I switched to a batched version, all my performance issues went away. You can bunch up all the tile data for one texture in one big vertex buffer and draw all of it in one go; this will be your biggest performance gain.
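
A rough sketch of what building such a batch could look like on the CPU side. The TextureVertex layout is an assumption based on the pos/tex0 fields used in drawTexture(), and Tile, appendQuad, buildBatch and primitiveCount are hypothetical names. Each quad is written as two triangles so that many quads can share one D3DPT_TRIANGLELIST draw, which a triangle fan cannot do.

```cpp
#include <cstddef>
#include <vector>

// Assumed layout: position (x, y, z) followed by one texture coordinate (u, v).
struct TextureVertex { float x, y, z, u, v; };

// Hypothetical tile description: center position and size in pixels.
struct Tile { float x, y, width, height; };

// Appends two triangles (six vertices, triangle-list order) for one tile.
void appendQuad(std::vector<TextureVertex>& out, const Tile& t)
{
    const float l = t.x - t.width * 0.5f;
    const float r = t.x + t.width * 0.5f;
    const float top = t.y - t.height * 0.5f;
    const float bot = t.y + t.height * 0.5f;

    const TextureVertex tl = { l, top, 0.0f, 0.0f, 0.0f };
    const TextureVertex tr = { r, top, 0.0f, 1.0f, 0.0f };
    const TextureVertex br = { r, bot, 0.0f, 1.0f, 1.0f };
    const TextureVertex bl = { l, bot, 0.0f, 0.0f, 1.0f };

    const TextureVertex quad[6] = { tl, tr, br, tl, br, bl };
    out.insert(out.end(), quad, quad + 6);
}

// Builds one vertex array for all tiles that share a texture.
std::vector<TextureVertex> buildBatch(const std::vector<Tile>& tiles)
{
    std::vector<TextureVertex> vertices;
    vertices.reserve(tiles.size() * 6);
    for (const Tile& t : tiles)
        appendQuad(vertices, t);
    return vertices;
}

// Primitive count for a single triangle-list draw of the whole batch.
std::size_t primitiveCount(const std::vector<TextureVertex>& vertices)
{
    return vertices.size() / 3;
}
```

For static tiles the array would be copied into the vertex buffer once after loading. For moving sprites it could be copied into a D3DUSAGE_DYNAMIC buffer once per frame (one Lock with D3DLOCK_DISCARD instead of one per sprite) and drawn with a single DrawPrimitive(D3DPT_TRIANGLELIST, 0, primitiveCount).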

Worked on titles: CMR:DiRT2, DiRT 3, DiRT: Showdown, GRID 2, theHunter, theHunter: Primal, Mad Max, Watch Dogs: Legion


The other problem is the lock on the VB, ideally you would like to construct your VB's at the end of your loading process and never ever touch them again. Sending over data from Main memory/CPU to GPU is a slow process and a major bottleneck.


"Slow" is a rather relative concept here. A few years ago I was able to push several megabytes of data per frame (under D3D9) while maintaining an interactive >60 fps. The data transfer rate was easily 100 megabytes per second, peaking at 250 megabytes per second.
Of course, it is always better to have static buffers.

When drawing a few hundred 2D tiles, there is no way you can saturate the CPU-to-GPU transfer, unless you are doing something really wrong.
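
A back-of-the-envelope check of that claim, as a small sketch. It assumes the vertex layout implied by drawTexture() (three position floats plus two texture-coordinate floats, 4 bytes each) and four vertices per tile; uploadBytesPerSecond is a hypothetical helper.

```cpp
#include <cstddef>

// Assumed vertex layout: position (3 floats) + one texture coordinate (2 floats).
constexpr std::size_t kBytesPerVertex = 5 * sizeof(float);

// One quad per tile, as in drawTexture().
constexpr std::size_t kVerticesPerTile = 4;

// Bytes uploaded per second if every tile's vertices are rewritten every frame.
constexpr std::size_t uploadBytesPerSecond(std::size_t tiles, std::size_t fps)
{
    return tiles * kVerticesPerTile * kBytesPerVertex * fps;
}
```

For 300 tiles at 60 fps this gives 300 * 4 * 20 * 60 = 1,440,000 bytes, roughly 1.4 MB per second, which is far below the 100 MB per second mentioned above. The raw amount of data is therefore unlikely to be the problem; the number of separate Lock() and draw calls is the more likely suspect.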

Consider the following:

- collect all tiles / objects using same texture(s)
- set vertex buffer once

For each different tile/object group:
- set texture once
- set shader once
- draw the tiles (which use the same texture) with one draw command. Either rebuild the vertex buffer for every draw call or use instancing

The important part here is to reduce D3D API calls and improve batching. Even the simplest form of drawing many objects at once should bring you a lot of performance.
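
The grouping steps above could be sketched like this. It is only an outline of the bookkeeping, with no D3D calls: Sprite, DrawRange and buildDrawRanges are hypothetical names, and TextureId stands in for a texture pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using TextureId = int;   // hypothetical stand-in for a texture pointer

struct Sprite { TextureId texture; float x, y; };

// One draw call: 'count' consecutive sprites, starting at 'first', that share 'texture'.
struct DrawRange { TextureId texture; std::size_t first; std::size_t count; };

// Sorts the sprites by texture and returns one range per distinct texture.
std::vector<DrawRange> buildDrawRanges(std::vector<Sprite>& sprites)
{
    // stable_sort keeps the original order among sprites that share a texture
    std::stable_sort(sprites.begin(), sprites.end(),
        [](const Sprite& a, const Sprite& b) { return a.texture < b.texture; });

    std::vector<DrawRange> ranges;
    for (std::size_t i = 0; i < sprites.size(); ++i)
    {
        if (ranges.empty() || ranges.back().texture != sprites[i].texture)
            ranges.push_back(DrawRange{ sprites[i].texture, i, 0 });
        ++ranges.back().count;
    }
    return ranges;
}
```

Each range then maps to one texture bind and one draw call covering 'count' sprites in the shared vertex buffer. Sorting changes the draw order, so sprites that overlap with alpha blending may need a layer or depth value as an additional sort key.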

Cheers!
