Dexario

OpenGL Rendering 2D tiles - performance issues


Hi,
 
I'm currently working on a 2D tile-based game. I have to display ~3000 32x32 tiles every frame. However, I'm running into performance issues: the framerate drops to 36 FPS.
My code works the following way: I have one Sprite instance per tile type, and I store all the tiles for the map in a two-dimensional array of Sprite pointers, each pointing to the tile it represents: std::vector<std::vector<Sprite*>>. To display the map, I iterate through all the pointers, set the right position and draw. Here's a minimal piece of code that shows roughly what is going on:

/* Creating the map */
Sprite tile;
tile.load("res/sprites/tiles/tile_test32.png");

std::vector<std::vector<Sprite*>> map;

for (int x = 0; x < 64; x++)
{
    map.push_back(std::vector<Sprite*>());
    for (int y = 0; y < 48; y++)
        map[x].push_back(&tile);
}

....

/* Rendering */
for (int x = 0; x < map.size(); x++)
{
    for (int y = 0; y < map[x].size(); y++)
    {
        map[x][y]->setPosition(x * 32, y * 32);
        map[x][y]->draw();
    }
}

I tested the same code with SFML and its sf::Sprite class, and it ran at about 240 FPS, instead of a miserable 36 FPS with my own sprite class.
 
Here is how my sprites are rendered:

void Sprite::draw()
{
    /* If the sprite is out of the screen, we don't render it */
    if (m_position.x + m_size.x * m_scale.x < 0 || m_position.y + m_size.y * m_scale.y < 0 || m_position.x > SCREEN_WIDTH || m_position.y > SCREEN_HEIGHT)
        return;

    if (m_texture == nullptr) return;

    m_texture->getShader()->use();

    /* Creating and sending the transformation matrices */
    glm::mat4 mvp = Transformable::getTransformationMatrix(SCREEN_WIDTH, SCREEN_HEIGHT);
    glUniformMatrix4fv(glGetUniformLocation(m_texture->getShader()->getProgram(), "mvp"), 1, GL_FALSE, glm::value_ptr(mvp));

    /* Sending the general color */
    glUniform4f(glGetUniformLocation(m_texture->getShader()->getProgram(), "generalColor"), m_color.r, m_color.g, m_color.b, m_color.a);

    m_texture->draw();
}

void Texture::draw()
{
    m_shader->use();

    /* Setting the texture */
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, m_texture);
    glUniform1i(glGetUniformLocation(m_shader->getProgram(), "tex"), 0);

    glEnable(GL_BLEND); /* For RGBA -> transparency */
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

    /* Binding the right vao and drawing */
    glBindVertexArray(m_vao);

    glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, 0);

    /* Unbind texture & VAO */
    glBindVertexArray(0);
    glBindTexture(GL_TEXTURE_2D, 0);
}

The shaders (vertex & fragment) are very minimalist. The vertex data carries a vec2 for position, a vec2 for texCoords and a vec4 for color, per vertex.
The Texture class contains the VAO, the VBO (=vertices), the EBO and the OpenGL texture. The Sprite class is basically just there for the transformations, all the rendering is done in the Texture class.

 

I read online that binding textures is very costly, so I tried reducing the Texture::draw() function to just glBindVertexArray(), glDrawElements() and glBindVertexArray(0). It made no difference to performance, even though no texture was applied.

 

Thanks.

Edited by Dexario

The first thing I notice is that it looks like you're issuing a draw call for every single tile, not to mention (potentially) changing texture and shader state unnecessarily. I'm assuming that, except for position, all these tiles are being drawn in essentially the same way. Instead of drawing once for every tile, just bind your shader and texture, submit _all_ the vertices and indices at once and issue a single draw call.
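
To make the batching idea concrete, here is a minimal CPU-side sketch of building one vertex/index buffer for the whole grid. The names Vertex, TileBatch and buildTileBatch are hypothetical, the vertex layout just follows the description in the first post (vec2 position, vec2 texCoords, vec4 color), and every tile is assumed to use the full texture, so a map with several tile types would need a texture atlas and per-tile texture coordinates instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout: vec2 position, vec2 texCoords, vec4 color.
struct Vertex {
    float x, y;
    float u, v;
    float r, g, b, a;
};

struct TileBatch {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Builds one quad (4 vertices, 6 indices) per tile for a cols x rows grid.
TileBatch buildTileBatch(int cols, int rows, float tileSize) {
    TileBatch batch;
    batch.vertices.reserve(static_cast<std::size_t>(cols) * rows * 4);
    batch.indices.reserve(static_cast<std::size_t>(cols) * rows * 6);
    for (int x = 0; x < cols; ++x) {
        for (int y = 0; y < rows; ++y) {
            const float x0 = x * tileSize, y0 = y * tileSize;
            const float x1 = x0 + tileSize, y1 = y0 + tileSize;
            const std::uint32_t base =
                static_cast<std::uint32_t>(batch.vertices.size());
            // Corners of the quad, white color, full texture.
            batch.vertices.push_back({x0, y0, 0.f, 0.f, 1.f, 1.f, 1.f, 1.f});
            batch.vertices.push_back({x1, y0, 1.f, 0.f, 1.f, 1.f, 1.f, 1.f});
            batch.vertices.push_back({x1, y1, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f});
            batch.vertices.push_back({x0, y1, 0.f, 1.f, 1.f, 1.f, 1.f, 1.f});
            // Two triangles per quad: (0, 1, 2) and (2, 3, 0).
            for (std::uint32_t i : {0u, 1u, 2u, 2u, 3u, 0u})
                batch.indices.push_back(base + i);
        }
    }
    return batch;
}
```

With something like this, the vertices and indices would be uploaded once (for example with glBufferData) and the whole map drawn with a single glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, 0) call, where indexCount is batch.indices.size().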


Try to avoid repeating the boilerplate calls (glEnable, glActiveTexture, etc.) for each tile: they don't change between draws, so set them once before the two for loops.

 

I would also keep m_texture->getShader()->getProgram() in a variable so the getters aren't called every time. The cost is probably minimal since it is just a getter, but reading a variable is still cheaper than calling a method and, most importantly, it cleans up the code.
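
The same idea extends to the glGetUniformLocation calls, which are name lookups repeated for every tile in the original draw(). Below is a small sketch of a location cache. UniformLocationCache is a hypothetical class, and the Lookup callback is an assumption standing in for a real call such as glGetUniformLocation(program, name.c_str()), so that the example does not need a GL context.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Caches name -> location so the lookup runs only once per uniform name.
// `Lookup` is a stand-in for e.g. glGetUniformLocation(program, name).
class UniformLocationCache {
public:
    using Lookup = std::function<int(const std::string&)>;

    explicit UniformLocationCache(Lookup lookup)
        : m_lookup(std::move(lookup)) {}

    int get(const std::string& name) {
        auto it = m_locations.find(name);
        if (it != m_locations.end())
            return it->second;               // already known, no lookup
        const int location = m_lookup(name); // first request: ask once
        m_locations.emplace(name, location);
        return location;
    }

private:
    Lookup m_lookup;
    std::unordered_map<std::string, int> m_locations;
};
```

A shader class could own one such cache per program and fill it lazily, or simply query the locations of "mvp", "generalColor" and "tex" once after linking and store them in member variables.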


You can simply compare your code with the SFML sources:

 

https://github.com/SFML/SFML/blob/master/src/SFML/Graphics/Sprite.cpp

https://github.com/SFML/SFML/blob/master/src/SFML/Graphics/RenderTarget.cpp

 

As you can see there is almost no difference except:

1. SFML caches states to prevent redundant state switching.

2. It uses the deprecated gl*Pointer functions and it looks like it doesn't use a VBO. If you have a lot of different textures and each of them uses its own VBO, that may be the reason. At least I vaguely recall facing a similar issue where using glVertexPointer was cheaper than using a lot of small VBOs (but that was on an old-generation iPhone). Sharing a single VAO between all the sprites might help in your case.
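
As a rough illustration of the state caching mentioned in point 1, here is a minimal sketch. It is not SFML's actual implementation: BindCache is a hypothetical class, and the Bind callback is an assumption standing in for a real call such as glBindTexture(GL_TEXTURE_2D, id) or glUseProgram(id).

```cpp
#include <functional>
#include <utility>

// Remembers the last bound id and only invokes the real bind on a change.
// `Bind` is a stand-in for e.g. glBindTexture(GL_TEXTURE_2D, id).
class BindCache {
public:
    using Bind = std::function<void(unsigned)>;

    explicit BindCache(Bind bind) : m_bind(std::move(bind)) {}

    void bind(unsigned id) {
        if (m_hasCurrent && id == m_current)
            return;              // same state as last time: skip the call
        m_bind(id);
        m_current = id;
        m_hasCurrent = true;
    }

private:
    Bind m_bind;
    unsigned m_current = 0;
    bool m_hasCurrent = false;   // nothing has been bound yet
};
```

Note that such a cache only stays valid if every bind goes through it; unbinding with a raw call after each draw, as the original Texture::draw() does, would defeat it.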

 

Apart from that, there are a lot of optimizations that can still be done. There are some good threads on this topic here on gamedev.net. Basically, you should group draw calls by shader, texture and buffer to prevent state switching, draw from front to back, etc.
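
Grouping draw calls by state can be sketched as a sort over a list of pending draw commands. DrawCommand, sortByState and countStateChanges are hypothetical names used only for illustration, and the ids stand in for GL object handles. Reordering like this is only safe where draw order does not affect the result, which is not the case for overlapping alpha-blended sprites.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical record of one pending draw and the state it needs.
struct DrawCommand {
    unsigned shader;
    unsigned texture;
    unsigned buffer;
};

// Orders commands so that draws sharing shader/texture/buffer are adjacent.
void sortByState(std::vector<DrawCommand>& cmds) {
    std::stable_sort(cmds.begin(), cmds.end(),
        [](const DrawCommand& a, const DrawCommand& b) {
            return std::tie(a.shader, a.texture, a.buffer) <
                   std::tie(b.shader, b.texture, b.buffer);
        });
}

// Counts how many individual state switches a submission order implies.
std::size_t countStateChanges(const std::vector<DrawCommand>& cmds) {
    std::size_t changes = 0;
    for (std::size_t i = 1; i < cmds.size(); ++i) {
        if (cmds[i].shader != cmds[i - 1].shader) ++changes;
        if (cmds[i].texture != cmds[i - 1].texture) ++changes;
        if (cmds[i].buffer != cmds[i - 1].buffer) ++changes;
    }
    return changes;
}
```

Comparing countStateChanges before and after sortByState gives a quick feel for how much switching a given frame would save.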

Edited by Alex Mekhed


I am indeed drawing all the tiles separately, however, I also did that with SFML and it ran much faster. I looked at the code for rendering sprites in the SFML lib, it seems (to me) they are doing the same. I've broken down and analysed SFML's draw function (https://github.com/SFML/SFML/blob/master/src/SFML/Graphics/RenderTarget.cpp):

 

applyTransform(states.transform);   -> applying matrices, transformations
applyBlendMode(states.blendMode);   -> for RGBA
applyTexture(states.texture);       -> binding the texture
    leads to: glCheck(glBindTexture(GL_TEXTURE_2D, texture->m_texture));
applyShader(states.shader);         -> using the correct shader

const char* data = reinterpret_cast<const char*>(vertices);              -> binding the vertex data (these four lines)
glCheck(glVertexPointer(2, GL_FLOAT, sizeof(Vertex), data + 0));
glCheck(glColorPointer(4, GL_UNSIGNED_BYTE, sizeof(Vertex), data + 8));
glCheck(glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), data + 12));

glCheck(glDrawArrays(mode, 0, vertexCount));   -> drawing
applyShader(NULL);                  -> unbinding the shader
applyTexture(NULL);                 -> unbinding the texture

 

I have to admit I am a bit confused since I can't see any sign of buffer objects (VAO, VBO...); I guess they're using the old OpenGL API. The draw() method just takes an array of vertices and a struct called RenderStates, which contains all the info such as shaders and transformations, and follows the steps detailed above.

Edited by Dexario


I understand that keeping track of what texture is currently loaded to avoid redundancy would increase performance, however, if I change my draw() function to:

void Texture::draw()
{
	glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_INT, 0);
} 

The VAO and the texture are bound once at startup and are never unbound or rebound afterwards.

With that, I'm running at ... 39 FPS, which is an increase of only 3 FPS.


GL_TRIANGLE_STRIP instead of GL_TRIANGLES?

 

Another bottleneck might be here: Transformable::getTransformationMatrix(SCREEN_WIDTH, SCREEN_HEIGHT);

SFML caches transform matrices, so they are only recomputed when the transform actually changes.
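
A dirty-flag cache of this kind could look roughly like the sketch below. CachedTransform is a hypothetical class rather than SFML's code, it uses a plain std::array instead of glm::mat4 to stay self-contained, and it only handles translation and scale. The rebuilds() counter exists purely to make the caching visible.

```cpp
#include <array>

// Recomputes the matrix only after a setter has marked it dirty.
class CachedTransform {
public:
    void setPosition(float x, float y) { m_x = x; m_y = y; m_dirty = true; }
    void setScale(float sx, float sy) { m_sx = sx; m_sy = sy; m_dirty = true; }

    // Column-major 4x4 matrix: scale, then translate.
    const std::array<float, 16>& matrix() {
        if (m_dirty) {
            m_matrix = {m_sx, 0.f,  0.f, 0.f,
                        0.f,  m_sy, 0.f, 0.f,
                        0.f,  0.f,  1.f, 0.f,
                        m_x,  m_y,  0.f, 1.f};
            m_dirty = false;
            ++m_rebuilds;
        }
        return m_matrix;
    }

    int rebuilds() const { return m_rebuilds; }

private:
    float m_x = 0.f, m_y = 0.f, m_sx = 1.f, m_sy = 1.f;
    std::array<float, 16> m_matrix{};
    bool m_dirty = true;
    int m_rebuilds = 0;
};
```

In the rendering loop from the first post, one shared sprite gets setPosition() called for every tile, so a cache like this would be invalidated on each draw. It only pays off when positions stay stable between frames, for example with one transform per tile or with a pre-built batch.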

Edited by Alex Mekhed


You should restrict the 2D loop in your rendering section to the tiles that are actually visible and avoid pointless iterations. I'm not sure how much time it will gain you, but it will certainly gain some.

 

Instead of:

/* Rendering */
for (int x = 0; x < map.size(); x++)
{
    for (int y = 0; y < map[x].size(); y++)
    {
        map[x][y]->setPosition(x * 32, y * 32);
        map[x][y]->draw();
    }
}

Try something like:

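/* Rendering: only visit tiles around pos (in tile coordinates),
   clamped to the map bounds; std::max/std::min need <algorithm> */
int x0 = std::max(0, pos.x - somewidth);
int x1 = std::min((int)map.size(), pos.x + somewidth);
for (int x = x0; x < x1; x++)
{
    int y0 = std::max(0, pos.y - someheight);
    int y1 = std::min((int)map[x].size(), pos.y + someheight);
    for (int y = y0; y < y1; y++)
    {
        map[x][y]->setPosition(x * 32, y * 32);
        map[x][y]->draw();
    }
}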
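
If the view is tracked in pixels rather than in tile coordinates, the loop bounds could be derived as in the sketch below. TileRange, floorDiv and visibleTiles are hypothetical helpers, and the code assumes a positive view size and tile size. It computes the half-open range of tile indices overlapping the view on one axis, so it would be called once for x and once for y.

```cpp
#include <algorithm>

struct TileRange {
    int first; // first visible tile index
    int last;  // one past the last visible tile index
};

// Integer division rounding toward negative infinity (b must be > 0).
int floorDiv(int a, int b) {
    int q = a / b;
    if (a % b != 0 && a < 0)
        --q;
    return q;
}

// Tiles overlapping the pixel interval [viewStartPx, viewStartPx + viewSizePx)
// on one axis, clamped to [0, tileCount].
TileRange visibleTiles(int viewStartPx, int viewSizePx,
                       int tileSizePx, int tileCount) {
    const int first = floorDiv(viewStartPx, tileSizePx);
    const int last = floorDiv(viewStartPx + viewSizePx - 1, tileSizePx) + 1;
    return {std::min(std::max(first, 0), tileCount),
            std::min(std::max(last, 0), tileCount)};
}
```

For the 64x48 map with 32-pixel tiles from the first post, an 800-pixel-wide view starting at x = 0 would cover tile columns 0 to 24 inclusive.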
