Rendering and CPU usage

This topic is 3847 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

I'm using OpenGL for a renderer, and it seems that a LOT of CPU is eaten up by objects that don't have that many triangles. About 5% is used for an object with 100 triangles and... 100% for a 1600-triangle object! I'm using this rendering code for each object, at 50fps (the "Zone3D" object loops through all the objects in the "zone" and renders them):
void Zone3DClass::Draw()
{
    std::list<RModelClass*>::iterator iter1 = render_models.begin(); // models
    std::list<TextureClass*>::iterator iter2 = texture.begin(); // textures for each model
    std::list<double>::iterator iter4 = posX.begin(); // x for each obj
    std::list<double>::iterator iter5 = posY.begin(); // y for each obj
    std::list<double>::iterator iter6 = posZ.begin(); // z for each obj
    std::list<double>::iterator iter7 = alpha.begin(); // alpha (doesn't work yet)

    for (unsigned int obj = 0; obj < rmodels; obj++)
    {
        RModelClass * model = *(iter1);
        TextureClass * tex = *(iter2);
        double pX = *(iter4);
        double pY = *(iter5);
        double pZ = *(iter6);
        double al = *(iter7);

        glTranslatef( (GLfloat)pX, (GLfloat)pY, (GLfloat)pZ );

        glEnable( GL_REPEAT );
        glColor4f( 1.0f, 1.0f, 1.0f, 1.0f );
        glEnable( GL_TEXTURE_2D );

        tex->SetTexture();
        glBegin( GL_TRIANGLES );
        for (unsigned int ii = 0; ii < model->triNum; ii++) // each triangle
        {
            for (unsigned int jj = 0; jj < 3; jj++) // each corner of the triangle
            {
                glNormal3d( model->triangle[ii]->normal[jj]->vector.x, model->triangle[ii]->normal[jj]->vector.y, model->triangle[ii]->normal[jj]->vector.z );
                glTexCoord2d( model->triangle[ii]->vertex[jj]->tex.x, model->triangle[ii]->vertex[jj]->tex.y );
                glVertex3d( model->triangle[ii]->vertex[jj]->point.x, model->triangle[ii]->vertex[jj]->point.y, model->triangle[ii]->vertex[jj]->point.z );
            }
        }
        glEnd();

        iter1++;
        iter2++;
        iter4++;
        iter5++;
        iter6++;
        iter7++;
    }
}


Is there anything that looks bad in there? I don't know why it would slow down so much.

Share on other sites
There are other things that can be changed, but the first and biggest change you can make is to switch from specifying individual vertices to storing those vertices in vertex buffers. NeHe's 45th tutorial covers the topic. This is *THE* way to render figures with large vertex counts [1600 isn't that large, but certainly large enough to benefit from this method]. This change will reduce the CPU load specifically by storing vertices on the graphics card instead of having to constantly re-construct the vertex arrays. This method is used for static geometry [stuff that doesn't animate] but can be used for animated geometry with a few higher-level tricks [namely use of vertex shaders].

The part of your code that will be replaced by this method is this:
glBegin(GL_TRIANGLES);
for (unsigned int ii = 0; ii < model->triNum; ii++)
{
    for (unsigned int jj = 0; jj < 3; jj++)
    {
        glNormal3d( model->triangle[ii]->normal[jj]->vector.x,
            model->triangle[ii]->normal[jj]->vector.y,
            model->triangle[ii]->normal[jj]->vector.z );
        glTexCoord2d( model->triangle[ii]->vertex[jj]->tex.x,
            model->triangle[ii]->vertex[jj]->tex.y );
        glVertex3d( model->triangle[ii]->vertex[jj]->point.x,
            model->triangle[ii]->vertex[jj]->point.y,
            model->triangle[ii]->vertex[jj]->point.z );
    }
}
glEnd();
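
As a rough sketch of the first step of that change (this is not the tutorial's code), the per-triangle data could be flattened once, at load time, into a single interleaved float array, which is the kind of layout a vertex buffer is usually filled from. The `Vec3`, `Vec2`, `Corner` and `Triangle` structs and the `packInterleaved` function are hypothetical stand-ins for the model classes in the original post. The OpenGL upload is only indicated in a comment because it needs a live context.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the model classes in the original post.
struct Vec3 { double x, y, z; };
struct Vec2 { double x, y; };
struct Corner { Vec3 normal; Vec2 tex; Vec3 point; };
struct Triangle { Corner corner[3]; };

// Floats stored per vertex: 3 normal + 2 texcoord + 3 position.
const std::size_t kFloatsPerVertex = 8;

// Flatten all triangles into one interleaved float array.
// Intended to run once at load time, not once per frame.
std::vector<float> packInterleaved(const std::vector<Triangle>& tris)
{
    std::vector<float> out;
    out.reserve(tris.size() * 3 * kFloatsPerVertex);
    for (std::size_t t = 0; t < tris.size(); ++t) {
        for (int c = 0; c < 3; ++c) {
            const Corner& v = tris[t].corner[c];
            out.push_back(static_cast<float>(v.normal.x));
            out.push_back(static_cast<float>(v.normal.y));
            out.push_back(static_cast<float>(v.normal.z));
            out.push_back(static_cast<float>(v.tex.x));
            out.push_back(static_cast<float>(v.tex.y));
            out.push_back(static_cast<float>(v.point.x));
            out.push_back(static_cast<float>(v.point.y));
            out.push_back(static_cast<float>(v.point.z));
        }
    }
    return out;
}

// With a GL context, the result would be uploaded once, for example with
// glBufferData(GL_ARRAY_BUFFER, data.size() * sizeof(float), &data[0], GL_STATIC_DRAW),
// and then drawn each frame with one glDrawArrays(GL_TRIANGLES, 0, vertexCount).
```

For the 1600-triangle model this yields 1600 * 3 * 8 = 38400 floats. Converting the doubles to floats here is a design choice of the sketch, on the assumption that single precision is enough for rendering.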

Share on other sites
Thanks for the tip, I'll start with doing that. But what about animated objects though? The 1600-triangle object was actually a character. And on top of rendering I'll have collision detection which will take some time.

On a side note, I haven't added in anything like culling yet, so that may explain a little bit.

EDIT: Culling cuts the CPU down to 50% (with the 1.6k tri model).

[Edited by - Gumgo on June 3, 2007 7:48:56 PM]

Share on other sites
You might want to consider switching to VBOs or at least VAs, as you seem to burn a lot of cycles with all the glVertex, glTexCoord, and glNormal function calls. Reducing the number of API calls will most likely lower the CPU usage of your render loop a lot.
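
To get a feel for the scale, here is a back-of-the-envelope count of the calls the posted loop makes for one model: three calls per vertex (glNormal3d, glTexCoord2d, glVertex3d), three vertices per triangle, plus glBegin and glEnd. The function name is made up for illustration, and the count ignores the state-setting calls made once per object.

```cpp
// Rough number of immediate-mode API calls per second for a single model.
unsigned long immediateModeCallsPerSecond(unsigned long triangles, unsigned long fps)
{
    const unsigned long callsPerVertex = 3;                            // normal, texcoord, vertex
    const unsigned long perFrame = triangles * 3 * callsPerVertex + 2; // + glBegin/glEnd
    return perFrame * fps;
}
```

For the 1600-triangle model at 50fps this comes to 720100 calls per second, compared with 45100 for the 100-triangle model. With a vertex array or VBO the same geometry is submitted with a handful of calls per frame.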

Share on other sites
Thanks, I'm following the tutorial on NeHe to switch to VBO (for static geometry, or at least I will be when I finish all this darn homework). I don't think I've heard of VAs though. What are those?

EDIT: Wait, never mind. I remembered what they were.

Share on other sites
Culling, good idea in ALL situations.

VBO = vertex buffer object.
VA = vertex array.

VAs can be thought of as VBOs that are kept in system memory and are not seen by OpenGL until you actually do some drawing. This means they are decent to use with animated meshes and such, but are still nowhere near as efficient as VBOs [ESPECIALLY for geometry that can be kept in static storage]. VAs don't gain you that much in efficiency, though, over the old primitives like glVertex and glNormal. If you can figure out a way to make your geometry expressible in static storage, you are always better off.

Animated things aren't always so easy to express as static storage, obviously, so VAs might be a decent option, but you're still usually best off using VBOs. To use VBOs for drawing animated things, consider using GL_STREAM_DRAW instead of GL_STATIC_DRAW when working with the buffer. This flags the memory as being used once, then replaced [which is what you want with animation]. Keep in mind that a VBO that uses GL_STREAM_DRAW will run [sometimes much] slower than one that uses GL_STATIC_DRAW, so you should still use static draw whenever you can.

Considering what you're working with at the moment, it is likely not a good idea to jump straight into shaders just yet, but for the sake of completeness, shaders are the way that you would be able to express an animating figure through static storage. [just to keep in mind for future learning]

Share on other sites
Hmm... so then, how would I use culling with a vertex array? Right now I'm just doing this before I draw each triangle (I know it gets confusing with all the []->. =P):
vector3d point;
point.x = model->triangle[ii]->vertex[0]->point.x;
point.y = model->triangle[ii]->vertex[0]->point.y;
point.z = model->triangle[ii]->vertex[0]->point.z;
double a = model->triangle[ii]->facenormal.x;
double b = model->triangle[ii]->facenormal.y;
double c = model->triangle[ii]->facenormal.z;
double d = -Dot( model->triangle[ii]->facenormal, point );
if (Camera->pos.x*a + Camera->pos.y*b + Camera->pos.z*c + d > 0)
{
    //draw
}

But if I used vertex arrays then I couldn't do this before each triangle.

Share on other sites
OpenGL will automatically perform frustum culling on a per-triangle basis which means that doing it yourself on a per-triangle basis is slow and redundant. The objective of manual frustum culling is to see if an entire object lies outside your view frustum. Then you don't have to push the object to the GPU which means that OpenGL doesn't either perform the automatic culling or even display the object.

So you have to decide how to represent objects. A common way is to use a center point for each model and a radius because testing spheres against the frustum planes is fast.
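
A minimal sketch of that sphere test, under assumptions that are not in the thread: each frustum plane is stored as a*x + b*y + c*z + d = 0 with a unit-length normal pointing into the frustum, and the `Plane`, `Sphere` and function names are hypothetical.

```cpp
// Plane a*x + b*y + c*z + d = 0; (a, b, c) is assumed to be unit length
// and to point towards the inside of the frustum.
struct Plane { double a, b, c, d; };
struct Sphere { double x, y, z, radius; };

// Signed distance from the sphere's center to the plane
// (positive on the inside, given the convention above).
double signedDistance(const Plane& p, const Sphere& s)
{
    return p.a * s.x + p.b * s.y + p.c * s.z + p.d;
}

// True if the sphere lies completely behind at least one of the six planes,
// in which case the whole object can be skipped without drawing it.
bool sphereOutsideFrustum(const Plane planes[6], const Sphere& s)
{
    for (int i = 0; i < 6; ++i) {
        if (signedDistance(planes[i], s) < -s.radius)
            return true;
    }
    return false;
}
```

The test is conservative: a sphere near a frustum corner can pass all six checks while still being off-screen, which only costs an unnecessary draw, never a missing object.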

Share on other sites
Thanks for the quick response.

Quote:
Original post by CrimsonSun
OpenGL will automatically perform frustum culling on a per-triangle basis which means that doing it yourself on a per-triangle basis is slow and redundant. The objective of manual frustum culling is to see if an entire object lies outside your view frustum. Then you don't have to push the object to the GPU which means that OpenGL doesn't either perform the automatic culling or even display the object.

My code was for backface culling. It actually halved the CPU usage, so is backface culling off by default? If so, how do I turn it on?

Quote:
Original post by CrimsonSun
So you have to decide how to represent objects. A common way is to use a center point for each model and a radius because testing spheres against the frustum planes is fast.

A center point and sphere sounds easy because I already generate the bounding box when loading models. And to test it... you'd just take the point that lies one radius away from the sphere's center, in the direction opposite to the plane's normal, and check whether that point is in front of or behind the plane.
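
One way to get the sphere from an existing bounding box, sketched under the assumption that the box is axis-aligned and stored as a minimum and a maximum corner (the struct and function names are hypothetical): use the box's midpoint as the center and half its diagonal as the radius.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Sphere { Vec3 center; double radius; };

// Sphere that encloses an axis-aligned box given by its min and max corners.
Sphere sphereFromBox(const Vec3& mn, const Vec3& mx)
{
    Sphere s;
    s.center.x = 0.5 * (mn.x + mx.x);
    s.center.y = 0.5 * (mn.y + mx.y);
    s.center.z = 0.5 * (mn.z + mx.z);

    // Half extents of the box along each axis.
    const double hx = 0.5 * (mx.x - mn.x);
    const double hy = 0.5 * (mx.y - mn.y);
    const double hz = 0.5 * (mx.z - mn.z);

    // Radius is the distance from the center to any corner.
    s.radius = std::sqrt(hx * hx + hy * hy + hz * hz);
    return s;
}
```

This sphere can be larger than the tightest sphere around the actual vertices, so it culls a little less often, but it never culls an object that is visible.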

Share on other sites
I'm sorry, I thought you were talking about frustum culling. Backface culling in OpenGL is off by default, but you can turn it on with
glEnable(GL_CULL_FACE);
and one of these two (which will designate which face to cull based on current face winding):
glCullFace(GL_FRONT);
glCullFace(GL_BACK);
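
"Face winding" here means the order in which a triangle's vertices appear on screen. OpenGL's defaults are glFrontFace(GL_CCW) and glCullFace(GL_BACK), so counter-clockwise triangles count as front faces and clockwise ones are culled once GL_CULL_FACE is enabled. The following sketch illustrates the idea with a signed-area test in 2D window coordinates (y pointing up). It is an illustration only, not what any particular driver runs, and the names are hypothetical.

```cpp
struct Point2 { double x, y; };

// Twice the signed area of triangle (a, b, c) in window coordinates.
// Positive means counter-clockwise order, negative means clockwise.
double signedArea2(const Point2& a, const Point2& b, const Point2& c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// Under the default glFrontFace(GL_CCW), counter-clockwise triangles are front-facing.
bool isFrontFacingCCW(const Point2& a, const Point2& b, const Point2& c)
{
    return signedArea2(a, b, c) > 0.0;
}
```

Swapping any two vertices of a triangle flips the sign, which is why a model exported with the opposite winding appears inside out until glFrontFace or glCullFace is changed.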

Share on other sites
Thanks! GL_CULL_FACE and GL_FRONT I believe will solve all my problems.

Share on other sites
Hm... the code I posted earlier (for "manual" backface culling) halves the CPU usage, but GL_CULL_FACE doesn't seem to make any difference... Any idea why? (I haven't added VAs or VBOs yet.)

Share on other sites
Correct me if I'm wrong, but the backface culling that was just posted works on the GPU, not on the CPU. Which would explain why you're not seeing any differences in CPU usage. Since culling the triangles happens after you pass them to the driver, you've already gone through the 'bottleneck' of passing them through.

Share on other sites
Quote:
Original post by Ezbez
Correct me if I'm wrong, but the backface culling that was just posted works on the GPU, not on the CPU. Which would explain why you're not seeing any differences in CPU usage. Since culling the triangles happens after you pass them to the driver, you've already gone through the 'bottleneck' of passing them through.

Sounds like if I use VBOs then it will make it much better.
