Mohanddo

DX11 [DX11] Tile map performance


Hi

I have been working on 2D tile map loading and rendering, and I'm noticing very bad performance with my simple approach.

Basically, I have a 2D array which specifies the tiles in the map, using a number to represent each type of tile (water, rock, etc.). Each tile type is a sprite of its own, with a vertex buffer and a texture. All tiles are 32x32.

This is the render function of the map:


XMMATRIX VPMatrix = g_Game->GetViewProjectionMatrix();
ID3D11Buffer* MVPBufferHandle = g_Game->GetMVPBuffer();
ID3D11DeviceContext* D3DContext = g_Game->GetContext();
XMMATRIX world = GetWorldMatrix();

// m_TileList is the 2D array: m_TileList[i][j] is the tile type at row i, column j.
for (size_t i = 0; i < m_TileList.size(); i++)
{
    for (size_t j = 0; j < m_TileList[i].size(); j++)
    {
        // Offset the map's world matrix to this tile's position.
        XMMATRIX offset = XMMatrixMultiply(world, XMMatrixTranslation(
            static_cast<float>(j * m_TileSize), static_cast<float>(i * m_TileSize), 0.0f));
        XMMATRIX mvp = XMMatrixMultiply(offset, VPMatrix);
        mvp = XMMatrixTranspose(mvp);

        // One constant buffer update and one draw call per tile.
        D3DContext->UpdateSubresource(MVPBufferHandle, 0, 0, &mvp, 0, 0);
        D3DContext->VSSetConstantBuffers(0, 1, &MVPBufferHandle);
        m_SpriteList[m_TileList[i][j]].Render();
    }
}



Simply put, I get the pointer to my constant buffer so I can send the final matrix to the shader. For each tile, I combine the tile's offset with the map's world matrix and the view-projection matrix, then send the result off for rendering. The render function of a sprite just binds the vertex buffer and the texture resource and renders a quad (using a triangle list).

As an example, I tried to render 100 tiles with only 2 types of sprites:
without map = 500 fps
with map = ~250 fps

This looks to me like very bad performance, probably due to my approach, but I have no idea how I can couple things together to save draw calls or texture binds.

Can anyone guide me on how I can increase the performance?

Thanks

Can anyone guide me on how I can increase the performance?
Yes. Add more work. Surprised?
Modern drivers are not just "hardware translators" of API commands. They go through extensive buffering and mangling, based on statistical analysis of real-world workloads. If you don't give them a "real world" workload, or give them some "really odd" pattern of commands, they won't gear up, much less drive the hardware properly.

Therefore, don't even start talking about performance unless you're below 100 fps (and even that is a real stretch). Frame rates such as 500 fps, or 200 fps for that matter, are just ridiculous and hardly indicative of a "real world" performance problem.
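To put those numbers in perspective, it can help to convert frame rates to frame times, since frame rate does not scale linearly with cost. A minimal sketch in plain C++ (the helper names `frameTimeMs` and `addedCostMs` are made up for illustration):

```cpp
#include <cassert>

// Converts a frame rate (frames per second) to the time spent on one frame, in milliseconds.
double frameTimeMs(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Extra milliseconds per frame when the frame rate drops from fpsBefore to fpsAfter.
double addedCostMs(double fpsBefore, double fpsAfter)
{
    return frameTimeMs(fpsAfter) - frameTimeMs(fpsBefore);
}
```

With the figures from the original post, `addedCostMs(500.0, 250.0)` comes to 2 ms for 100 tiles, or roughly 0.02 ms per tile. The same 2 ms added to a 60 fps frame (about 16.7 ms) would only bring it down to roughly 53.6 fps.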

But if I were you, I'd just pre-transform all tiles into one big batch, as I hardly believe they move at all with regard to each other. No tile is an island.
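One way to read the batching suggestion is sketched below: bake each tile's position into its vertices once, and collect the vertices into one triangle list per tile type, so each texture needs a single draw call. This is only a CPU-side sketch in standard C++; the `Vertex` layout and `buildBatches` are hypothetical names, and uploading each batch to a D3D11 vertex buffer is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout: position plus texture coordinates.
struct Vertex { float x, y, z, u, v; };

// Builds one triangle-list vertex array per tile type, with tile positions
// already baked into the vertices. tileMap[row][col] holds the tile type.
std::vector<std::vector<Vertex>> buildBatches(
    const std::vector<std::vector<int>>& tileMap, int typeCount, float tileSize)
{
    assert(typeCount > 0);
    std::vector<std::vector<Vertex>> batches(static_cast<std::size_t>(typeCount));
    for (std::size_t row = 0; row < tileMap.size(); ++row)
    {
        for (std::size_t col = 0; col < tileMap[row].size(); ++col)
        {
            const int type = tileMap[row][col];
            assert(type >= 0 && type < typeCount);

            const float x0 = col * tileSize, y0 = row * tileSize;
            const float x1 = x0 + tileSize,  y1 = y0 + tileSize;

            // Two triangles covering the tile; the winding order that is
            // visible depends on your rasterizer state.
            std::vector<Vertex>& out = batches[static_cast<std::size_t>(type)];
            out.push_back({x0, y0, 0.0f, 0.0f, 0.0f});
            out.push_back({x1, y0, 0.0f, 1.0f, 0.0f});
            out.push_back({x0, y1, 0.0f, 0.0f, 1.0f});
            out.push_back({x1, y0, 0.0f, 1.0f, 0.0f});
            out.push_back({x1, y1, 0.0f, 1.0f, 1.0f});
            out.push_back({x0, y1, 0.0f, 0.0f, 1.0f});
        }
    }
    return batches;
}
```

For the 100-tile, 2-sprite example this would mean 2 draw calls and a single WVP update per frame instead of 100 of each, assuming the batches are only rebuilt when the map itself changes.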

For 2D tile layers, use a tile-index texture read with NN sampling to decide which part of a texture atlas to render at each pixel. Then you can have thousands of tiles by rendering just 2 triangles with an HLSL shader.
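The lookup such a shader would perform can be sketched on the CPU. The code below is a plain C++ reference of the idea, not HLSL and not the poster's actual shader: `TileIndexMap`, `sampleNearest` and `atlasUV` are hypothetical names, and the atlas is assumed to be a regular grid of equally sized tiles.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Float2 { float x, y; };

// Hypothetical CPU stand-in for the index texture: one tile id per map cell.
struct TileIndexMap
{
    int width;             // map size in tiles
    int height;
    std::vector<int> ids;  // row-major, width * height entries

    // Nearest-neighbour fetch: return the single texel whose cell contains uv.
    int sampleNearest(Float2 uv) const
    {
        const int tx = std::min(std::max(static_cast<int>(uv.x * width), 0), width - 1);
        const int ty = std::min(std::max(static_cast<int>(uv.y * height), 0), height - 1);
        return ids[static_cast<std::size_t>(ty * width + tx)];
    }
};

// For a point on the map quad (uv in [0, 1] on both axes), returns the atlas
// UV to sample. The atlas is assumed to hold atlasCols x atlasRows tiles.
Float2 atlasUV(const TileIndexMap& map, Float2 uv, int atlasCols, int atlasRows)
{
    assert(atlasCols > 0 && atlasRows > 0);
    const int id = map.sampleNearest(uv);

    // Position inside the current tile, in [0, 1).
    const float fx = uv.x * map.width  - std::floor(uv.x * map.width);
    const float fy = uv.y * map.height - std::floor(uv.y * map.height);

    // Which atlas cell the tile id refers to.
    const int cellX = id % atlasCols;
    const int cellY = id / atlasCols;
    return { (cellX + fx) / atlasCols, (cellY + fy) / atlasRows };
}
```

In a real pixel shader the same arithmetic would run once per pixel of a single map-sized quad, which is why the tile count no longer affects the number of triangles or draw calls.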

When using a single texture atlas, you can also draw everything with a single draw call if you use hardware instancing.
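As a rough illustration of what the per-instance data could look like, the sketch below fills one record per tile with its position and its UV rectangle in the atlas. It is plain C++ with hypothetical names (`TileInstance`, `buildInstances`) and assumes a grid-shaped atlas; in D3D11 such an array would typically be copied into a second vertex buffer whose input elements are marked `D3D11_INPUT_PER_INSTANCE_DATA` and drawn with `DrawInstanced`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-instance record: where the tile goes and which part of the atlas it shows.
struct TileInstance
{
    float offsetX, offsetY;  // position of the tile's corner in map space
    float u0, v0, u1, v1;    // UV rectangle of the tile inside the atlas
};

// tileMap[row][col] holds the tile id; ids index the atlas cells row by row.
std::vector<TileInstance> buildInstances(const std::vector<std::vector<int>>& tileMap,
                                         float tileSize, int atlasCols, int atlasRows)
{
    assert(atlasCols > 0 && atlasRows > 0);
    std::vector<TileInstance> instances;
    for (std::size_t row = 0; row < tileMap.size(); ++row)
    {
        for (std::size_t col = 0; col < tileMap[row].size(); ++col)
        {
            const int id = tileMap[row][col];
            assert(id >= 0 && id < atlasCols * atlasRows);
            const int cellX = id % atlasCols;
            const int cellY = id / atlasCols;

            TileInstance inst;
            inst.offsetX = col * tileSize;
            inst.offsetY = row * tileSize;
            inst.u0 = static_cast<float>(cellX) / atlasCols;
            inst.v0 = static_cast<float>(cellY) / atlasRows;
            inst.u1 = static_cast<float>(cellX + 1) / atlasCols;
            inst.v1 = static_cast<float>(cellY + 1) / atlasRows;
            instances.push_back(inst);
        }
    }
    return instances;
}
```

A 2D offset is used here instead of a full matrix per tile, which is a simplification; the vertex shader would add the offset to the shared quad and interpolate between the UV corners.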

Thanks for the suggestions. I now pre-transform my sprites on load and any time the map is moved, which gave me a decent fps boost (~30 fps).
Unfortunately, I'm not using sprite sheets; I have a separate image for each entity to keep it simple. Later, when I create a map editor, I can build a system that compiles my individual sprites into one large image.

Also, what is NN sampling?

There are a lot of things you can improve, though as Krohm already mentioned, I'd only worry about it if performance actually becomes an issue.
Here's a more detailed list of things you can do to (greatly) improve your performance:

  • Use a single quad (2 triangles) to render all your tiles. That's really all you need.
    Your quad should have the size of a single tile and be created at the origin of your world space.
    Whenever you draw a tile, use a vertex shader to move the quad to the appropriate location by providing a WVP matrix.

    This will reduce your total vertex count and eliminate the need to update your vertex buffer every frame.
    You can now also create the vertex buffer with default or immutable usage.
  • Use frustum culling.
    You only need to draw the tiles that are actually visible on screen.
  • Put all your tile images into one big texture (texture atlas).
    I understand that you might not want to manually do that yet, but you can easily have your program do it for you at startup.
    Just calculate the texture size needed to hold all of your tiles and create your texture atlas with it.
    Then render every tile to your new texture atlas, and keep track of the UV location for every single tile.
    Now you can render all of your tiles using the very same texture.
    Instead of switching textures you switch UV coordinates.

    This greatly cuts down your state changes.
  • If you want even more performance, do everything in a single draw call using hardware instancing.
    You'll need to create a second (dynamic) vertex buffer that holds the WVP matrix and UV data for every tile to be rendered.

    This will reduce your draw calls down to one.
    At this point you can easily render over 10k tiles without performance issues.
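For a 2D tile map, the culling step in the list above usually reduces to computing which rows and columns overlap the camera rectangle. A minimal sketch in plain C++, assuming an axis-aligned, unrotated view and a map whose top-left corner sits at the origin (`TileRange` and `visibleTiles` are hypothetical names):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Inclusive range of visible tiles. The range is empty if first > last on either axis.
struct TileRange { int firstCol, lastCol, firstRow, lastRow; };

// viewLeft/viewTop/viewWidth/viewHeight describe the camera rectangle in the
// same units as tileSize; mapCols/mapRows give the map size in tiles.
TileRange visibleTiles(float viewLeft, float viewTop, float viewWidth, float viewHeight,
                       float tileSize, int mapCols, int mapRows)
{
    assert(tileSize > 0.0f);
    TileRange r;
    // First tile whose cell contains the left/top edge, clamped to the map.
    r.firstCol = std::max(static_cast<int>(std::floor(viewLeft / tileSize)), 0);
    r.firstRow = std::max(static_cast<int>(std::floor(viewTop / tileSize)), 0);
    // Last tile touched by the right/bottom edge, clamped to the map.
    r.lastCol = std::min(static_cast<int>(std::ceil((viewLeft + viewWidth) / tileSize)) - 1,
                         mapCols - 1);
    r.lastRow = std::min(static_cast<int>(std::ceil((viewTop + viewHeight) / tileSize)) - 1,
                         mapRows - 1);
    return r;
}
```

Only the tiles inside the returned range would then be drawn, or written into the instance buffer, each frame.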



Thanks, I'm going to implement everything you have suggested soon, but first I have to decide what is a sprite and what is an entity, so I can conveniently change texture and buffer states. I guess that's a design problem.

Anyway, many thanks to everyone who helped.

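NN sampling is when you sample a texture with "Nearest Neighbour" interpolation, so that the textures look blocky, as in Wolfenstein 3D. It is faster than using bilinear interpolation between the 4 closest points. For a tile-index texture it also matters that NN returns the stored value exactly instead of blending neighbouring indices.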
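The difference can be sketched on the CPU with a one-dimensional row of texels. This is only an illustration in plain C++ with made-up function names; a real GPU does this in hardware, selected in D3D11 through the sampler's filter mode (for example `D3D11_FILTER_MIN_MAG_MIP_POINT` for nearest neighbour).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Clamps a texel index to the valid range [0, n - 1].
static std::size_t clampIndex(int i, int n)
{
    return static_cast<std::size_t>(std::min(std::max(i, 0), n - 1));
}

// Nearest neighbour: return the one texel whose cell contains u (u in [0, 1]).
float sampleNearest(const std::vector<float>& texels, float u)
{
    assert(!texels.empty());
    const int n = static_cast<int>(texels.size());
    return texels[clampIndex(static_cast<int>(std::floor(u * n)), n)];
}

// Linear: blend the two texels whose centres surround u. This is the 1D
// analogue of bilinear filtering, which blends 4 texels in 2D.
float sampleLinear(const std::vector<float>& texels, float u)
{
    assert(!texels.empty());
    const int n = static_cast<int>(texels.size());
    const float x = u * n - 0.5f;     // texel centres sit at i + 0.5
    const float lower = std::floor(x);
    const float t = x - lower;        // blend weight in [0, 1)
    const int i0 = static_cast<int>(lower);
    return texels[clampIndex(i0, n)] * (1.0f - t) + texels[clampIndex(i0 + 1, n)] * t;
}
```

With texels `{0, 10}` and `u = 0.5`, `sampleNearest` returns 10 while `sampleLinear` returns 5, a value that was never stored in the texture. For colours that is a smooth blend; for tile indices it would be a wrong tile.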
