Instancing a good idea for speed in this case?

Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

6 replies to this topic

#1 Oni   Members   


Posted 26 April 2011 - 04:23 PM

Hi guys. I'm trying to render a pinboard over two heads. I've decided to render both the rear and front views to a 2048 x 768 window, which I'll split over the two screens. The problem I'm having is that, according to gDebugger, I'm moving around 1,815,000 triangles! I'm only on a little GeForce 330M on OS X, so I'm guessing some optimisation is needed here.
Posted Image

I've run this through OpenGL Profiler and indeed, the calls to glDrawArrays (as I remember) take up the majority of the time.

Each pin is loaded into a VBO and then drawn. There are 60 x 49 pins in that image, and each one has 180 faces (MeshLab doesn't tell me the exact triangle count), but adding it all up comes pretty close to the figure given by gDebugger.
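
As a rough cross-check of the counts above, here is a minimal sketch of the arithmetic. The helper name `trianglesPerFrame` is hypothetical, and the number of triangles per face is an assumption (2 if the 180 faces are quads, 1 if they are already triangles), since the exact triangle count was not reported.

```cpp
#include <cstdint>

// Hypothetical helper: estimates how many triangles are submitted per frame.
// trianglesPerFace is an assumption (2 for quads, 1 for triangles).
// passes counts every time the whole board is drawn (views x render passes).
std::uint64_t trianglesPerFrame(std::uint64_t pinsX, std::uint64_t pinsY,
                                std::uint64_t facesPerPin,
                                std::uint64_t trianglesPerFace,
                                std::uint64_t passes)
{
    return pinsX * pinsY * facesPerPin * trianglesPerFace * passes;
}
```

For the 60 x 49 board with 180 faces per pin, `trianglesPerFrame(60, 49, 180, 2, 1)` comes to 1,058,400 triangles for one view and one pass, assuming quads. Extra views and passes multiply that figure.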

In addition to the colour pass, there is also a pass for linear depth (in order to set up some SSAO). At the moment, I'm getting around 12-15 fps. I'd like to get it to 30.

I thought about switching to a non-linear depth buffer and reading both the depth buffer and the colour buffer from the FBO in one go to save a pass, but commenting out the depth pass for now seemed to make little difference (oddly).

I tried 'pseudo instancing' (I think) by passing the transformation matrices to my vertex shader in a texture. This didn't give much of a speedup.

As OS X has limited support (annoyingly), the only method I can see to get more speed is to use GL_ARB_instanced_arrays somehow, but I'm not exactly sure whether this will help. There may be something else I can do to get things a little faster, but I'm not sure what. I've gotten the triangles per pin down about as far as I can. Any thoughts, chaps? Cheers
If its your first night, you have to fight!

#2 Flimflam   Members   


Posted 26 April 2011 - 09:30 PM

This doesn't really answer your question, but is there a reason why each individual pin is ~180 faces? That's a lot for something so primitive! You should be able to get that down quite substantially while retaining a decent amount of quality, and that would reduce the load by a large amount.

#3 Danny02   Members   


Posted 27 April 2011 - 03:15 AM

Don't load each pin into its own BO; aren't all the pins the exact same model?
As long as all the pins are more or less static, there is no reason for instancing, so just copy all the pins into a single BO so that only 1 draw call is needed.
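
The single-buffer suggestion above could be sketched roughly as follows. This assumes position-only vertices and translation-only placement; `Vec3` and `bakePins` are hypothetical names, not part of any library.

```cpp
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical helper: copies one pin mesh to every pin position, producing a
// single vertex array that can be uploaded once into one static buffer object.
// Assumes the pins differ only by a translation.
std::vector<Vec3> bakePins(const std::vector<Vec3>& pinVertices,
                           const std::vector<Vec3>& pinOffsets)
{
    std::vector<Vec3> merged;
    merged.reserve(pinVertices.size() * pinOffsets.size());
    for (const Vec3& o : pinOffsets) {
        for (const Vec3& v : pinVertices) {
            // Pre-transform on the CPU so no per-pin state is needed at draw time.
            merged.push_back({v.x + o.x, v.y + o.y, v.z + o.z});
        }
    }
    return merged;
}
```

The merged array would then be uploaded once (for example with `glBufferData` and `GL_STATIC_DRAW`) and drawn with a single call. The trade-off is memory: the buffer holds one full copy of the pin mesh per pin.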

#4 Ingenu   Members   


Posted 27 April 2011 - 05:40 AM

Basic things to try:
-1 STATIC VBO for 1 pin, draw with a different transformation 2940 times
-1 STATIC VBO for 2940 pins, draw once
-1 STATIC VBO for 1 pin, plus an instance stream of 2940 4x3 matrices (alternatively only a position, if you know you'll never need to rotate them)

If each face is 2 triangles, that's already > 1M triangles drawn.
Given your geometry layout and point of view, you are hitting a pathologically bad case for the GPU (long, thin triangles that each contribute to only a few samples/pixels).
-* So many things to do, so little time to spend. *-
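
For the third option in the list above, the per-instance stream could be built on the CPU along these lines. This is only a sketch under assumptions: the row-major 4x3 layout (three rows of four floats), the grid spacing parameter, and the names `Mat4x3` and `makeInstanceStream` are all hypothetical choices, not something the thread specifies.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// One affine transform per pin: three rows of four floats, with the
// translation in the last column (an assumed layout).
using Mat4x3 = std::array<float, 12>;

// Hypothetical helper: translation-only instance matrices for a cols x rows grid.
std::vector<Mat4x3> makeInstanceStream(int cols, int rows, float spacing)
{
    std::vector<Mat4x3> stream;
    stream.reserve(static_cast<std::size_t>(cols) * static_cast<std::size_t>(rows));
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            const float x = static_cast<float>(c) * spacing;
            const float y = static_cast<float>(r) * spacing;
            const Mat4x3 m = {1.f, 0.f, 0.f, x,
                              0.f, 1.f, 0.f, y,
                              0.f, 0.f, 1.f, 0.f};
            stream.push_back(m);
        }
    }
    return stream;
}
```

One possible way to feed this to the GPU is as three vec4 vertex attributes whose divisor is set to 1 with `glVertexAttribDivisorARB` (GL_ARB_instanced_arrays), followed by one `glDrawArraysInstancedARB` call (GL_ARB_draw_instanced). If the pins never rotate, a single position per instance would shrink the stream to a quarter of the size.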

#5 Oni   Members   


Posted 28 April 2011 - 07:07 AM

Thanks for the input guys!

So far I'm using option 1: it's 1 VBO being drawn X number of times. I should have said that :S

I shall try the other two and see what we get.
I also agree that this is indeed a bad case for the GPU, as we are getting to the point where the triangles aren't really adding much to the scene. It may be time to rethink the approach, possibly with some kind of sprite or impostor. I still think, though, that we should be able to get more out of this.

Well, there is a possibility I could reduce the poly count, but it doesn't look great:


This is with 100 faces as opposed to 180. You can begin to see the polygon outlines, which is not nice. Also, you can see I've reduced the overall number of pins. This double view runs at 30 fps. With the original number of pins, 180 faces vs 100 faces makes almost no difference in speed: both run at around 10 fps. It's almost as if there is a cutoff point beyond which you get no speedup or change at all.
If its your first night, you have to fight!

#6 Hodgman   Moderators   


Posted 28 April 2011 - 07:30 AM

Do the pins animate at all, or are their positions static?
If there's no animation, I'd try Danny02's suggestion / Ingenu's #2 option.

#7 V-man   Members   


Posted 28 April 2011 - 07:59 AM

Also, get rid of glDrawArrays and use either glDrawElements or glDrawRangeElements. I don't know why people use glDrawArrays. glDrawArrays is fine for unconnected polygons or GL_POINTS.
Sig: http://glhlib.sourceforge.net
an open source GLU replacement library. Much more modern than GLU.
float matrix[16], inverse_matrix[16];
glhTranslatef2(matrix, 0.0, 0.0, 5.0);
glhRotateAboutXf2(matrix, angleInRadians);
glhScalef2(matrix, 1.0, 1.0, -1.0);
glhQuickInvertMatrixf2(matrix, inverse_matrix);
glUniformMatrix4fv(uniformLocation1, 1, FALSE, matrix);
glUniformMatrix4fv(uniformLocation2, 1, FALSE, inverse_matrix);
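
To move from glDrawArrays to glDrawElements as suggested above, the mesh needs an index buffer. A minimal sketch of building one from an unindexed triangle list follows. It assumes position-only vertices and treats only bit-for-bit equal positions as shared; `Vertex`, `IndexedMesh`, and `buildIndexedMesh` are hypothetical names.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct Vertex { float x, y, z; };

struct IndexedMesh {
    std::vector<Vertex> vertices;        // unique vertices
    std::vector<std::uint32_t> indices;  // three entries per triangle
};

// Hypothetical helper: collapses exactly equal positions into shared vertices
// and records an index for every corner of the original triangle list.
IndexedMesh buildIndexedMesh(const std::vector<Vertex>& triangleSoup)
{
    IndexedMesh mesh;
    std::map<std::tuple<float, float, float>, std::uint32_t> seen;
    for (const Vertex& v : triangleSoup) {
        const auto key = std::make_tuple(v.x, v.y, v.z);
        auto it = seen.find(key);
        if (it == seen.end()) {
            // First time this position appears: append it and remember its index.
            const auto index = static_cast<std::uint32_t>(mesh.vertices.size());
            it = seen.emplace(key, index).first;
            mesh.vertices.push_back(v);
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

The resulting indices could be drawn with `glDrawElements(GL_TRIANGLES, ...)` using `GL_UNSIGNED_INT`. A real mesh with normals or texture coordinates would need to compare every attribute, not just the position, before sharing a vertex.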
