Display lists not really giving a performance boost

Started by
2 comments, last by BlueSpud 10 years, 6 months ago

Hello,

Recently, I've been stress testing my game engine after implementing my deferred renderer. I loaded the Sponza model that Crytek provides for testing lighting in CryEngine, which many people also use for their own engines. The engine could handle one pass of the model at about 30 fps. So I popped into my 3D modeler, and it was able to render 3 passes in real time just fine. I also tried it in Blender. I started to clock the different parts of the rendering, and the lighting takes little to no time. I'm using FBOs, but I don't think that would affect speed. I looked into ways to optimize mesh rendering and found that display lists are the fastest. After putting them in, I didn't get much of a boost in fps. In the OpenGL settings of my 3D program there is an option to use display lists, and I have it on, so this left me puzzled. I've looked for a while now for a solution (changing state less often, etc.) but found nothing. So here is my code to create the display list:


        list = glGenLists(1);
        glNewList(list, GL_COMPILE);
        for (int i = 0; i < ModelRegistry.models[m].m.obj.size(); i++)
        {
            glBegin(GL_TRIANGLES);
            glNormal3f(ModelRegistry.models[m].m.obj[i].nx1, ModelRegistry.models[m].m.obj[i].ny1, ModelRegistry.models[m].m.obj[i].nz1);
            glTexCoord2f(ModelRegistry.models[m].m.obj[i].tx1, ModelRegistry.models[m].m.obj[i].ty1);
            glVertex3f(ModelRegistry.models[m].m.obj[i].x1,ModelRegistry.models[m].m.obj[i].y1,ModelRegistry.models[m].m.obj[i].z1);
            glNormal3f(ModelRegistry.models[m].m.obj[i].nx2, ModelRegistry.models[m].m.obj[i].ny2, ModelRegistry.models[m].m.obj[i].nz2);
            glTexCoord2f(ModelRegistry.models[m].m.obj[i].tx2, ModelRegistry.models[m].m.obj[i].ty2);
            glVertex3f(ModelRegistry.models[m].m.obj[i].x2,ModelRegistry.models[m].m.obj[i].y2,ModelRegistry.models[m].m.obj[i].z2);
            glNormal3f(ModelRegistry.models[m].m.obj[i].nx3, ModelRegistry.models[m].m.obj[i].ny3, ModelRegistry.models[m].m.obj[i].nz3);
            glTexCoord2f(ModelRegistry.models[m].m.obj[i].tx3, ModelRegistry.models[m].m.obj[i].ty3);
            glVertex3f(ModelRegistry.models[m].m.obj[i].x3,ModelRegistry.models[m].m.obj[i].y3,ModelRegistry.models[m].m.obj[i].z3);
            glEnd();
        }
        glEndList();

And here is my rendering code:


void object::renderObjectWithProgram(GLuint program)
{
    //default values for the textures
    glUniform1i(glGetUniformLocation(program,"texture"),0);
    glUniform1i(glGetUniformLocation(program,"specularTexture"),1);
    glUniform1i(glGetUniformLocation(program,"normalTexture"),2);
    
    renderCamera();
    glUseProgram(program);
    glScalef(scaleFloat, scaleFloat, scaleFloat);
    glTranslatef(-x, -y, -z);

    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, textureID);
    glActiveTexture(GL_TEXTURE1);
    glBindTexture(GL_TEXTURE_2D, specularID);
    glActiveTexture(GL_TEXTURE2);
    glBindTexture(GL_TEXTURE_2D, normalID);

    glDisable(GL_LIGHTING);
    /*
    for (int i = 0; i < ModelRegistry.models[m].m.obj.size(); i++)
    {
        glBegin(GL_TRIANGLES);
        glNormal3f(ModelRegistry.models[m].m.obj[i].nx1, ModelRegistry.models[m].m.obj[i].ny1, ModelRegistry.models[m].m.obj[i].nz1);
        glTexCoord2f(ModelRegistry.models[m].m.obj[i].tx1, ModelRegistry.models[m].m.obj[i].ty1);
        glVertex3f(ModelRegistry.models[m].m.obj[i].x1,ModelRegistry.models[m].m.obj[i].y1,ModelRegistry.models[m].m.obj[i].z1);
        glNormal3f(ModelRegistry.models[m].m.obj[i].nx2, ModelRegistry.models[m].m.obj[i].ny2, ModelRegistry.models[m].m.obj[i].nz2);
        glTexCoord2f(ModelRegistry.models[m].m.obj[i].tx2, ModelRegistry.models[m].m.obj[i].ty2);
        glVertex3f(ModelRegistry.models[m].m.obj[i].x2,ModelRegistry.models[m].m.obj[i].y2,ModelRegistry.models[m].m.obj[i].z2);
        glNormal3f(ModelRegistry.models[m].m.obj[i].nx3, ModelRegistry.models[m].m.obj[i].ny3, ModelRegistry.models[m].m.obj[i].nz3);
        glTexCoord2f(ModelRegistry.models[m].m.obj[i].tx3, ModelRegistry.models[m].m.obj[i].ty3);
        glVertex3f(ModelRegistry.models[m].m.obj[i].x3,ModelRegistry.models[m].m.obj[i].y3,ModelRegistry.models[m].m.obj[i].z3);
        glEnd();

    }
     */
    glCallList(list);

    glLoadIdentity();
    glBindTexture(GL_TEXTURE_2D, 0);
    glActiveTexture(GL_TEXTURE0);
    glUseProgram(0);
}

The data comes from a std::vector. I don't know if that could be the problem; I couldn't find anything about it online. Any help would be great, thanks.

This doesn't look like you are using FBOs.

You should really switch to VBOs and FBOs.

They give the best performance boost.

This depends on how your driver implements display lists. It's perfectly valid for an OpenGL driver to implement display lists as a simple "record and replay" mechanism in system memory, so your glCallList call may be doing nothing much more than just triggering a bunch of glBegin/glEnds. The moral of the story is that OpenGL doesn't promise performance, just functionality.
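
To make the "record and replay" idea concrete, here is a toy model in plain C++. It is only a sketch and not how any particular driver works: RecordedList and recordPerTriangle are hypothetical names, and the lambdas merely stand in for GL calls. It shows that a list which just stores commands replays exactly as many calls as were recorded.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a driver-side display list (an assumption for
// illustration): it stores the calls it is given and runs them again
// later, with no optimization at all.
class RecordedList {
public:
    // Store one command for later replay.
    void record(std::function<void()> command) {
        assert(command && "cannot record an empty command");
        commands_.push_back(std::move(command));
    }

    // Run every stored command in order; returns how many were executed.
    std::size_t replay() const {
        for (const auto& command : commands_) {
            command();
        }
        return commands_.size();
    }

private:
    std::vector<std::function<void()>> commands_;
};

// Hypothetical helper: records the call pattern from the question, one
// begin/end pair around every triangle. `calls` is incremented by each
// command when the list is replayed, so it must outlive the list.
RecordedList recordPerTriangle(std::size_t triangleCount, std::size_t& calls) {
    RecordedList list;
    for (std::size_t i = 0; i < triangleCount; ++i) {
        list.record([&calls] { ++calls; });      // stands in for glBegin
        for (int corner = 0; corner < 3; ++corner) {
            list.record([&calls] { ++calls; });  // stands in for one corner's normal/texcoord/vertex
        }
        list.record([&calls] { ++calls; });      // stands in for glEnd
    }
    return list;
}
```

Under this model, replaying a list recorded for N triangles still executes 5 * N commands, which is why a naive display list may not be any faster than the immediate-mode loop it was recorded from.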

On the other hand, even your non-display-list code could be significantly improved, and without disrupting it too much. Yes, moving to a VBO will ultimately be the preferred option that works most consistently across all drivers, but there are some things you can do that will get you enhanced performance even without that.

You've got a fairly classic OpenGL anti-pattern going on here, and it looks like this:


	for (i = 0; i < somenumber; i++)
	{
		glBegin (GL_TRIANGLES);
                // glVertex/etc calls
		glEnd ();
	}

The simplest performance improvement for you right now would be to move the glBegin and glEnd outside of the loop; you're not changing state inside the loop, and glBegin with GL_TRIANGLES is capable of drawing more than one triangle, so the code becomes:


	glBegin (GL_TRIANGLES);

	for (i = 0; i < somenumber; i++)
	{
                // glVertex/etc calls
	}

	glEnd ();

That should get you quite a sizable jump in your framerate, but be aware that you're going to have bottlenecks elsewhere too, such as fragment shading and/or blending, which no amount of optimization on vertex storage or submission is going to help with.
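
If you want to go one step further than hoisting glBegin/glEnd, the usual next move is to copy the per-triangle data into one contiguous array so that a single draw call can consume it. The following is a minimal sketch of the CPU side only. The Triangle struct is hypothetical: it mirrors the field names visible in the question (x1, nx1, tx1, ...), but the real layout in the poster's engine is unknown. The interleaved position/normal/texcoord order is also just one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical triangle record mirroring the field names in the question.
struct Triangle {
    float x1, y1, z1, nx1, ny1, nz1, tx1, ty1;
    float x2, y2, z2, nx2, ny2, nz2, tx2, ty2;
    float x3, y3, z3, nx3, ny3, nz3, tx3, ty3;
};

// Floats per vertex in the interleaved layout: position(3) + normal(3) + texcoord(2).
constexpr std::size_t kFloatsPerVertex = 8;

// Copy every corner of every triangle into one tightly packed array.
std::vector<float> flattenTriangles(const std::vector<Triangle>& triangles) {
    std::vector<float> out;
    out.reserve(triangles.size() * 3 * kFloatsPerVertex);
    for (const Triangle& t : triangles) {
        const float corners[3][kFloatsPerVertex] = {
            {t.x1, t.y1, t.z1, t.nx1, t.ny1, t.nz1, t.tx1, t.ty1},
            {t.x2, t.y2, t.z2, t.nx2, t.ny2, t.nz2, t.tx2, t.ty2},
            {t.x3, t.y3, t.z3, t.nx3, t.ny3, t.nz3, t.tx3, t.ty3},
        };
        for (const auto& corner : corners) {
            out.insert(out.end(), corner, corner + kFloatsPerVertex);
        }
    }
    // Sanity check: three vertices of eight floats per input triangle.
    assert(out.size() == triangles.size() * 3 * kFloatsPerVertex);
    return out;
}
```

With an array like this, legacy client-side arrays (glEnableClientState, glVertexPointer, glNormalPointer, glTexCoordPointer, then glDrawArrays with GL_TRIANGLES) can draw the whole mesh in one call, using a stride of 8 * sizeof(float). The same array can later be uploaded to a VBO with glBufferData without changing the layout.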

Regarding VBOs: from the looks of your code I'm guessing that you're using the .obj file format, so you're going to run into a problem that everyone who uses .obj eventually encounters: .obj is optimized for storage, not drawing. It's important to realize that optimizing for storage won't necessarily give you the best drawing performance, as they're two different classes of optimization, and in this case your drawing performance is also being quite badly affected by cache-thrashing. When you switch to using VBOs you're going to need to unpack all of your data, so you're going to find that your memory usage will increase quite a lot. That's nothing to worry about (unless you blow your video RAM budget), but it's normal enough to see people expressing concern over it. Just remember that optimization for storage is not the same as optimization for drawing, that sometimes they have conflicting goals, and that the particular case of the .obj format is one of those times.
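
As a rough illustration of what "unpacking" means here: .obj faces index positions, texture coordinates, and normals separately, while a VBO wants one index per complete vertex. A common approach, sketched below under assumptions, is to give every unique (position, texcoord, normal) index combination its own output vertex. ObjCorner, UnpackedMesh, and unpackCorners are hypothetical names, and the indices are taken as zero-based for simplicity (real .obj files are one-based).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// One face corner as .obj stores it: separate indices into the position,
// texture-coordinate, and normal lists (zero-based in this sketch).
using ObjCorner = std::array<std::uint32_t, 3>;

struct UnpackedMesh {
    std::vector<ObjCorner> vertices;     // one entry per unique index combination
    std::vector<std::uint32_t> indices;  // one entry per face corner, pointing into `vertices`
};

// Assign a single index to every unique (position, texcoord, normal) triple.
UnpackedMesh unpackCorners(const std::vector<ObjCorner>& corners) {
    UnpackedMesh mesh;
    std::map<ObjCorner, std::uint32_t> seen;  // triple -> index in mesh.vertices
    for (const ObjCorner& corner : corners) {
        auto found = seen.find(corner);
        if (found == seen.end()) {
            const auto newIndex = static_cast<std::uint32_t>(mesh.vertices.size());
            found = seen.emplace(corner, newIndex).first;
            mesh.vertices.push_back(corner);
        }
        mesh.indices.push_back(found->second);
    }
    // Every input corner produces exactly one output index.
    assert(mesh.indices.size() == corners.size());
    return mesh;
}
```

Note that a position shared by two faces with different normals becomes two output vertices, which is where the extra memory comes from. The actual float data would then be gathered per entry of `vertices` and uploaded alongside `indices` for an indexed draw such as glDrawElements.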

Thanks for the very insightful post. Eventually, I'm going to create an optimized format for the engine. Moving the glBegin() and glEnd() out of the loop gave me a boost of about 800%, even just with the display list. If I could, I'd give you more reputation ;)
