Thor82

slow with 1000 polygons???

Hi all, I made a simple particle system (yes, the same one from the topic a while ago), but I start losing fps at ~500 particles, which seems really strange to me on a GF4 Ti4200... here is the "drawing" code:

void clDrawer::Draw(clParticleSys& P)
{
    // Nothing to draw for an empty system.
    if (P.Particle_Count < 1)
        return;

    glMatrixMode(GL_MODELVIEW);
    glPushMatrix();
    // Move to the emitter; particle positions are relative to it.
    glTranslatef(P.Position.x, P.Position.y, P.Position.z);
    for (int i = 0; i < P.Particle_Drawn; ++i) {
        if ((P.ParticleList[i] != NULL) && (P.ParticleList[i]->Active == true)) {
            glPushMatrix();

            glTranslatef(P.ParticleList[i]->Position.x, P.ParticleList[i]->Position.y, P.ParticleList[i]->Position.z);
            // Alpha shrinks as the particle ages.
            glColor4f(P.ParticleList[i]->Color.x, P.ParticleList[i]->Color.y,
                      P.ParticleList[i]->Color.z, (P.MaxAge - P.ParticleList[i]->Age) / P.MaxAge);
            glCallList(P.DisplayList);  // one display list call per particle
            glPopMatrix();
        }
    }

    glPopMatrix();
}

 
The profiler says that most of the time is spent in this section of the program... I don't know how to optimize it =( EDIT: I forgot: I create a display list ONCE at load time with a triangle strip, and I call that list every time.
There aren't problems that can't be solved with a gun...

Well, particles do tend to cause a slowdown, but I'm guessing yours are causing a significant one. It might be all the glTranslate-ing, I dunno.

Are you using billboarding? Because if you aren't, using it would probably mean you need fewer particles to get the same effect.

I don't know if it will help, but here's the particle drawing code from my engine...

 
float viewMatrix[16];
glGetFloatv(GL_MODELVIEW_MATRIX, viewMatrix);

// Camera right and up axes, read from the rotation part of the
// column-major modelview matrix.
CVector3 right(viewMatrix[0], viewMatrix[4], viewMatrix[8]);
CVector3 up(viewMatrix[1], viewMatrix[5], viewMatrix[9]);
CVector3 v;
glBegin(GL_QUADS);
for (i = 0; i < numParticles; i++)
{
    pos = particles[i].pos;
    size = particles[i].radius;

    glColor3fv(particles[i].colour);

    // Bottom-left corner
    v = (pos + (right + up) * -size);
    glTexCoord2f(0.0f, 0.0f);
    glVertex3f(v.x, v.y, v.z);

    // Bottom-right corner
    v = (pos + (right - up) * size);
    glTexCoord2f(1.0f, 0.0f);
    glVertex3f(v.x, v.y, v.z);

    // Top-right corner
    v = (pos + (right + up) * size);
    glTexCoord2f(1.0f, 1.0f);
    glVertex3f(v.x, v.y, v.z);

    // Top-left corner
    v = (pos + (up - right) * size);
    glTexCoord2f(0.0f, 1.0f);
    glVertex3f(v.x, v.y, v.z);
}
glEnd();
// Restore state (presumably changed before this snippet, not shown here).
glDisable(GL_BLEND);
glDepthMask(GL_TRUE);


Lukerd.

P.S. Thank you for trying my engine.

Hyperdev

"To err is human, to really mess up requires a computer"


[edited by - Lukerd on July 4, 2003 7:48:50 PM]

1000 particles should be about 2000 polygons per frame, which should not be a problem at all.

But if you are doing blending with all of them, that might be a cause.

But considering you are using display lists and a GF4, I don't know what it could be. Perhaps you could post a screenshot of what your particles look like.

Yo !

You can create a small (2000 items) array of struct { float x,y,z; float u,v; }. Then, instead of calling glTexCoord, glVertex or glCallList, you can FILL that array with data (counting how many vertices you have written). When the vertex count reaches the array size, you can FLUSH the whole array using glDrawArrays or glDrawElements. It will be a LOT faster.
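
A minimal sketch of that fill-and-flush idea, kept free of GL headers so it stands on its own. `ParticleVertex`, `VertexBatcher` and the `onFlush` callback are names made up for illustration; in a real renderer the callback would presumably be the place for the vertex array calls shown in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Same layout as suggested above: position first, then texture coordinates.
struct ParticleVertex { float x, y, z; float u, v; };

// Hypothetical helper: collects vertices and hands them off in large batches.
class VertexBatcher {
public:
    using FlushFn = std::function<void(const ParticleVertex*, std::size_t)>;

    // For GL_QUADS the capacity should be a multiple of 4 (2000 is), so an
    // automatic flush never splits a quad across two batches.
    VertexBatcher(std::size_t capacity, FlushFn onFlush)
        : capacity_(capacity), onFlush_(std::move(onFlush)) {
        assert(capacity_ > 0);
        buffer_.reserve(capacity_);
    }

    // Append one vertex; flush automatically when the array is full.
    void push(const ParticleVertex& v) {
        buffer_.push_back(v);
        if (buffer_.size() == capacity_) flush();
    }

    // Hand the pending vertices to the callback (call once more at end of frame).
    void flush() {
        if (buffer_.empty()) return;
        // In a GL renderer the callback could do something like:
        //   glEnableClientState(GL_VERTEX_ARRAY);
        //   glEnableClientState(GL_TEXTURE_COORD_ARRAY);
        //   glVertexPointer(3, GL_FLOAT, sizeof(ParticleVertex), &data[0].x);
        //   glTexCoordPointer(2, GL_FLOAT, sizeof(ParticleVertex), &data[0].u);
        //   glDrawArrays(GL_QUADS, 0, (GLsizei)count);
        onFlush_(buffer_.data(), buffer_.size());
        buffer_.clear();
    }

private:
    std::size_t capacity_;
    FlushFn onFlush_;
    std::vector<ParticleVertex> buffer_;
};
```

The point of the design is that the per-particle work becomes plain memory writes, and the driver only gets called once per full array instead of several times per particle.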

Why do you use a display list? Do you use it to draw your particles?

I actually think this display list draws a single particle.
I'd point my finger at a CPU bottleneck, beginning with the fact that you are using an immediate-mode function to supply the particle color. This means the CPU has to process 1k calls each frame. It gets a bit worse when you also call the display list once per particle.

To top it off, you have a division (../P.MaxAge...) which is probably done in float precision. Try to see if you can get a speedup by storing 1/MaxAge instead and multiplying. If you get more frames, you're CPU limited.
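
A small sketch of the 1/MaxAge idea, assuming MaxAge and Age are floats (the original post does not show their types). `FadeParams` and `alphaForAge` are hypothetical names, not part of the posted code.

```cpp
#include <cassert>

// Hypothetical stand-in for the fade-related fields of the particle system.
struct FadeParams {
    float maxAge;
    float invMaxAge;  // cached 1 / maxAge, recomputed only when maxAge changes

    explicit FadeParams(float m) : maxAge(m), invMaxAge(0.0f) {
        assert(m > 0.0f);      // a zero or negative lifetime has no reciprocal to use
        invMaxAge = 1.0f / m;  // the only division, done once instead of per particle
    }
};

// Same value as (maxAge - age) / maxAge, but with a multiply in the loop.
inline float alphaForAge(const FadeParams& f, float age) {
    return (f.maxAge - age) * f.invMaxAge;
}
```

Inside the drawing loop the last glColor4f argument would then become alphaForAge(fade, particle age). If the real fields turn out to be integers, note that the original expression would truncate to 0 or 1, which is a separate problem from speed.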

BTW, giving more info would be very nice. If you get the same performance at 640x480x16 and 1600x1200x32, then you're not fill limited.
If you edit the display list so it contains twice the polygons and you still get the same frame rate, you're not transform limited.
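
Those two experiments can be written down as a tiny decision helper. This is only a sketch of the reasoning: the function names, the 5% tolerance and the final "cpu" fallback are assumptions for illustration, and the frame rates have to be measured by hand.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>

// Two frame rates count as "the same" if they differ by at most `tolerance`
// (a fraction of the larger one; 5% is an arbitrary assumption).
bool roughlySame(double fpsA, double fpsB, double tolerance = 0.05) {
    assert(fpsA > 0.0 && fpsB > 0.0);
    return std::fabs(fpsA - fpsB) <= tolerance * std::max(fpsA, fpsB);
}

// fpsLowRes / fpsHighRes: same scene at e.g. 640x480x16 and 1600x1200x32.
// fpsNormalPolys / fpsDoublePolys: same scene with the polygon count doubled.
std::string likelyBottleneck(double fpsLowRes, double fpsHighRes,
                             double fpsNormalPolys, double fpsDoublePolys) {
    // Frame rate changes with resolution: pixel fill is the suspect.
    if (!roughlySame(fpsLowRes, fpsHighRes)) return "fill";
    // Frame rate changes with polygon count: vertex transform is the suspect.
    if (!roughlySame(fpsNormalPolys, fpsDoublePolys)) return "transform";
    // Neither test moved the frame rate: look at CPU and per-call overhead.
    return "cpu";
}
```

For example, likelyBottleneck(60, 59, 60, 60) falls through both checks and returns "cpu", which is the case where batching the draw calls is most likely to pay off.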

If you pull out the Color call, do you get a performance increase? That call costs a bit of CPU time.
Suggested read: GDC2003 - "Batch, Batch, Batch: What Does It Really Mean?" (DirectX oriented, but it also applies to GL)

Now, what you are doing per particle is a division, a translate, a color, a callList... test your app's performance and examine the results.

EDIT: at least tell us what CPU you are running!

[edited by - Krohm on July 5, 2003 6:11:18 AM]

duh... I'm really dumb, I found my problem...

all my code up there is correct...
the problem was in the creation of the display list...

I uploaded the bmp inside the display list...
so every time I called the list, it uploaded the bmp again...


EDIT: Answer to Krohm... no, the problem is that I'm stupid...


There aren't problems that can't be solved with a gun...

[edited by - Thor82 on July 5, 2003 6:33:20 AM]
