Rendering and CPU usage

I'm using OpenGL for a renderer, and it seems that a LOT of CPU is eaten up by objects that don't have that many triangles. About 5% is used for an object with 100 triangles and... 100% for a 1600-triangle object! I'm using this rendering code for each object (at 50 fps). The "Zone3D" object loops through all the objects in the "zone" and renders them:
void Zone3DClass::Draw()
{
	std::list<RModelClass*>::iterator iter1 = render_models.begin(); // models
	std::list<TextureClass*>::iterator iter2 = texture.begin(); // textures for each model
	std::list<TextureClass*>::iterator iter3 = alphamask.begin(); // alpha mask (currently doesn't do anything)
	std::list<double>::iterator iter4 = posX.begin(); // x for each obj
	std::list<double>::iterator iter5 = posY.begin(); // y for each obj
	std::list<double>::iterator iter6 = posZ.begin(); // z for each obj
	std::list<double>::iterator iter7 = alpha.begin(); // alpha (not works yet)

	for (unsigned int ii = 0; ii < rmodels; ii++)
	{
		RModelClass * model = *(iter1);
		TextureClass * tex = *(iter2);
		TextureClass * amask = *(iter3);
		double pX = *(iter4);
		double pY = *(iter5);
		double pZ = *(iter6);
		double al = *(iter7);

		glLoadIdentity();
		glTranslatef( (GLfloat)pX, (GLfloat)pY, (GLfloat)pZ );

		// note: GL_REPEAT is not a valid glEnable() cap (it just raises GL_INVALID_ENUM);
		// texture wrap modes are set per-texture with glTexParameteri
		glColor4f( 1.0f, 1.0f, 1.0f, 1.0f );
		glEnable( GL_TEXTURE_2D );

		tex->SetTexture();
		glBegin( GL_TRIANGLES );
		for (unsigned int ti = 0; ti < model->triNum; ti++) // "ti" - the original "ii" shadowed the outer loop counter
		{
			for (unsigned int jj = 0; jj < 3; jj++)
			{
				glNormal3d( model->triangle[ti]->normal[jj]->vector.x, model->triangle[ti]->normal[jj]->vector.y, model->triangle[ti]->normal[jj]->vector.z );
				glTexCoord2d( model->triangle[ti]->vertex[jj]->tex.x, model->triangle[ti]->vertex[jj]->tex.y );
				glVertex3d( model->triangle[ti]->vertex[jj]->point.x, model->triangle[ti]->vertex[jj]->point.y, model->triangle[ti]->vertex[jj]->point.z );
			}
		}
		glEnd();

		iter1++;
		iter2++;
		iter3++;
		iter4++;
		iter5++;
		iter6++;
		iter7++;
	}
}

Is there anything that looks bad in there? I don't know why it would slow down so much.

There are other things that can be changed, but the first and biggest change you can make is to move from specifying individual vertices to storing them in vertex buffer objects. NeHe's lesson 45 covers the topic. This is *THE* way to render figures with large vertex counts [1600 isn't that large, but certainly large enough to benefit from this method]. It reduces CPU load specifically by storing the vertices on the graphics card instead of having to re-submit them every frame. The method is aimed at static geometry [stuff that doesn't animate], but it can be used for animated geometry with a few higher-level tricks [namely vertex shaders].

The part of your code that this method replaces is this:

	glBegin( GL_TRIANGLES );
	for (unsigned int ii = 0; ii < model->triNum; ii++)
	{
		for (unsigned int jj = 0; jj < 3; jj++)
		{
			glNormal3d( model->triangle[ii]->normal[jj]->vector.x,
			            model->triangle[ii]->normal[jj]->vector.y,
			            model->triangle[ii]->normal[jj]->vector.z );
			glTexCoord2d( model->triangle[ii]->vertex[jj]->tex.x,
			              model->triangle[ii]->vertex[jj]->tex.y );
			glVertex3d( model->triangle[ii]->vertex[jj]->point.x,
			            model->triangle[ii]->vertex[jj]->point.y,
			            model->triangle[ii]->vertex[jj]->point.z );
		}
	}
	glEnd();
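For reference, a rough sketch of what the VBO path could look like. None of this is your code: the interleaved `Vertex` struct, `CreateModelVBO`, and `DrawModelVBO` are made-up names, you'd fill the `vertices` array from your triangle list once at load time, and the gl*Buffer* calls need a context exposing GL 1.5 (NeHe's lesson 45 uses the equivalent ARB-suffixed names like glGenBuffersARB):

```cpp
#include <cstddef>   // offsetof
#include <GL/gl.h>   // assumes GL 1.5 headers; on Windows these entry points
                     // must be fetched with wglGetProcAddress

// Interleaved vertex layout: position, normal, texcoord.
struct Vertex {
    float px, py, pz;
    float nx, ny, nz;
    float u, v;
};

// Done ONCE at load time: upload the flattened triangle list to the card.
GLuint CreateModelVBO( const Vertex * vertices, unsigned int vertexCount )
{
    GLuint vbo = 0;
    glGenBuffers( 1, &vbo );
    glBindBuffer( GL_ARRAY_BUFFER, vbo );
    glBufferData( GL_ARRAY_BUFFER, vertexCount * sizeof(Vertex),
                  vertices, GL_STATIC_DRAW );
    return vbo;
}

// Done every frame: point GL at the buffer and draw the whole model in one call.
void DrawModelVBO( GLuint vbo, unsigned int vertexCount )
{
    glBindBuffer( GL_ARRAY_BUFFER, vbo );
    glEnableClientState( GL_VERTEX_ARRAY );
    glEnableClientState( GL_NORMAL_ARRAY );
    glEnableClientState( GL_TEXTURE_COORD_ARRAY );
    glVertexPointer( 3, GL_FLOAT, sizeof(Vertex), (const void*)offsetof(Vertex, px) );
    glNormalPointer( GL_FLOAT, sizeof(Vertex), (const void*)offsetof(Vertex, nx) );
    glTexCoordPointer( 2, GL_FLOAT, sizeof(Vertex), (const void*)offsetof(Vertex, u) );
    glDrawArrays( GL_TRIANGLES, 0, (GLsizei)vertexCount );
}
```

In Draw() you'd then make one DrawModelVBO call per model instead of triNum*9 glNormal/glTexCoord/glVertex calls.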

Thanks for the tip, I'll start with that. What about animated objects though? The 1600-triangle object was actually a character. And on top of rendering I'll have collision detection, which will take some time as well.

On a side note, I haven't added anything like culling yet, so that may explain part of it.

EDIT: Culling cuts the CPU down to 50% (with the 1.6k tri model).

[Edited by - Gumgo on June 3, 2007 7:48:56 PM]

You might want to consider switching to VBOs, or at least VAs, as you seem to burn a lot of cycles on all the glVertex, glTexCoord, and glNormal function calls. Reducing the number of API calls will most likely lower the CPU usage of your render loop a lot.

Thanks, I'm following the NeHe tutorial to switch to VBOs (for static geometry, or at least I will be when I finish all this darn homework). I don't think I've heard of VAs though. What are those?

EDIT: Wait, nevermind. I remembered what they were.

Culling: a good idea in ALL situations.

VBO = vertex buffer object.
VA = vertex array.

VAs can be thought of as VBOs that are kept in system memory and are not seen by OpenGL until you actually do some drawing. This makes them decent for animated meshes and such, but they're still nowhere near as efficient as VBOs [ESPECIALLY for geometry that can live in static storage]. VAs don't gain you that much in efficiency over the old immediate-mode calls like glVertex and glNormal, though. If you can figure out a way to make your geometry expressible in static storage, you are always better off.

Animated things aren't always so easy to express as static storage, obviously, so VAs might be a decent option, but you're still usually best off using VBOs. To use VBOs for drawing animated things, consider GL_STREAM_DRAW instead of GL_STATIC_DRAW when creating the buffer. This flags the memory as written once and then replaced [which is what you want for animation]. Keep in mind that a VBO using GL_STREAM_DRAW will run [sometimes much] slower than one using GL_STATIC_DRAW, so you should still use static draw whenever you can.

Considering what you're working with at the moment, it's probably not a good idea to jump straight into shaders just yet, but for the sake of completeness: shaders are how you'd express an animated figure through static storage. [Just something to keep in mind for future learning.]
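To make the streaming idea concrete, a hypothetical per-frame update might look like this. This is only a sketch with made-up names: `animatedVertices` is assumed to hold the current pose, packed the same way as the static case, and `vbo` is assumed to have been created once at the mesh's full size:

```cpp
#include <cstddef>
#include <GL/gl.h>   // assumes GL 1.5 entry points are available

// Assumed: vbo was created once with
//   glBufferData( GL_ARRAY_BUFFER, bytes, NULL, GL_STREAM_DRAW );
// and bytes == vertexCount * sizeof(Vertex).
void UploadAnimatedPose( GLuint vbo, const void * animatedVertices, ptrdiff_t bytes )
{
    glBindBuffer( GL_ARRAY_BUFFER, vbo );
    // Re-specifying the store "orphans" the old contents, so the driver
    // doesn't stall waiting for the previous frame's draw to finish.
    glBufferData( GL_ARRAY_BUFFER, bytes, NULL, GL_STREAM_DRAW );
    glBufferSubData( GL_ARRAY_BUFFER, 0, bytes, animatedVertices );
    // ...then draw with the same glVertexPointer/glDrawArrays setup
    // you'd use for a static VBO.
}
```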

Hmm... so then, how would I use culling with a vertex array? Right now I'm just doing this before I draw each triangle (I know it gets confusing with all the []->. =P):

	vector3d point;
	point.x = model->triangle[ii]->vertex[0]->point.x;
	point.y = model->triangle[ii]->vertex[0]->point.y;
	point.z = model->triangle[ii]->vertex[0]->point.z;

	// plane through the triangle: ax + by + cz + d = 0
	double a = model->triangle[ii]->facenormal.x;
	double b = model->triangle[ii]->facenormal.y;
	double c = model->triangle[ii]->facenormal.z;
	double d = -Dot( model->triangle[ii]->facenormal, point );
	if (Camera->pos.x*a + Camera->pos.y*b + Camera->pos.z*c + d > 0)
	{
		// camera is on the front side of the triangle's plane: draw it
	}

But if I used vertex arrays then I couldn't do this test before each triangle.

OpenGL automatically clips geometry against the view frustum on a per-triangle basis, which means that doing frustum culling yourself per triangle is slow and redundant. The objective of manual frustum culling is to test whether an entire object lies outside your view frustum. If it does, you don't have to push the object to the GPU at all, so OpenGL doesn't have to clip or rasterize it either.

So you have to decide how to represent objects. A common way is to give each model a center point and a bounding radius, because testing spheres against the frustum planes is fast.

Thanks for the quick response.

Quote:
Original post by CrimsonSun
OpenGL automatically clips geometry against the view frustum on a per-triangle basis, which means that doing frustum culling yourself per triangle is slow and redundant. The objective of manual frustum culling is to test whether an entire object lies outside your view frustum. If it does, you don't have to push the object to the GPU at all, so OpenGL doesn't have to clip or rasterize it either.


My code was for backface culling. It actually halved the CPU usage, so is backface culling off by default? If so, how do I turn it on?

Quote:
Original post by CrimsonSun
So you have to decide how to represent objects. A common way is to give each model a center point and a bounding radius, because testing spheres against the frustum planes is fast.


A center point and sphere sound easy, because I already generate a bounding box when loading models. And to test it... you'd just need to check whether the point at the sphere's center, moved out by the radius in the direction opposite the plane's normal, is in front of or behind the plane.

Quote:
Original post by Gumgo
My code was for backface culling. It actually halved the CPU usage, so is backface culling by default off? If so how do I turn it on?

I'm sorry, I thought you were talking about frustum culling. Backface culling in OpenGL is off by default, but you can turn it on with

	glEnable( GL_CULL_FACE );

and one of these two (which designates which face to cull, based on the current winding set with glFrontFace):

	glCullFace( GL_FRONT );
	glCullFace( GL_BACK );

Hm... the code I posted earlier (for "manual" backface culling) halves the CPU usage, but GL_CULL_FACE doesn't seem to make any difference... Any idea why? (I haven't added VAs or VBOs yet.)

Correct me if I'm wrong, but the backface culling that was just posted happens on the GPU, not on the CPU, which would explain why you're not seeing any difference in CPU usage. Since the triangles are culled after you pass them to the driver, you've already gone through the 'bottleneck' of submitting them.

Quote:
Original post by Ezbez
Correct me if I'm wrong, but the backface culling that was just posted happens on the GPU, not on the CPU, which would explain why you're not seeing any difference in CPU usage. Since the triangles are culled after you pass them to the driver, you've already gone through the 'bottleneck' of submitting them.


Sounds like using VBOs will make that much better, then.
