Making stuff go fast



Hi, I'm in the middle of creating a graphics system that's being used as the core engine for a piece of demo technology for a medium-sized company. Without being able to say too much about what's going on, I'm attempting to display about 25 meshes whose triangle count totals about 100,000. These meshes are, for the most part, visible in their entirety (backfaces aside). At present I'm getting 7.5fps on an Athlon 700, 256Mb RAM, GeForce 256 DDR, Win2k. All triangles are drawn lit and z-buffered, with no texture mapping. Although I don't expect order-of-magnitude improvements, I would like to make it go faster.

I'm using OpenGL and rendering the triangles through a glDrawElements call. The geometry isn't static, so display lists are a no-go. The bottleneck is certainly the rendering: commenting out the glDrawElements calls bumps us up to 28fps, which is reasonable considering some of the CPU-intensive stuff we're doing (although optimisation, when we get to it, could easily double that).

If I could outline some of my ideas and get some feedback as to which way to go, this would be handy (especially if any NVidia engineers are reading!).

1. Do backface culling on the CPU. Tried it already; it seemed to shift the bottleneck onto the vertex buffer creation. Halved the polygon count though.

2. Break up the vertex arrays into more glDrawElements calls, but with about 100 triangles in each one (see the sketch after this post). At the moment we get up to about 6,000 vertices and 9,000 triangles in one mesh, and I'm not sure sending all that data down to the card at once is a good idea. NVidia's site is a bit vague as to what a near-optimal vertex array size is.

3. Persuade the company's art department to get rid of some triangles, which wouldn't actually hurt visual quality. However, since we're contracted to them, I'd rather have a can-do attitude. Plus these are people who are used to scenes taking an afternoon to render.

4. Put a lot of effort into strip finding. It's not an ideal way to do it, since there's quite a lot of work involved in maintaining strip lengths, and due to the way some of our modellers work it would be non-trivial to put in at this point.

Anyway, any suggestions would be very gratefully received.

Thanks a lot,
Henry
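PS: to make idea 2 concrete, this is roughly the kind of batching loop I mean. It's only a sketch; the Mesh layout, the field names and the 100-triangle batch size are placeholders for illustration, not our actual code.

    // Sketch: split one big indexed mesh into fixed-size glDrawElements batches.
    // The Mesh struct and batch size are illustrative assumptions.
    #include <GL/gl.h>
    #include <vector>

    struct Mesh {
        std::vector<float>          positions; // xyz per vertex
        std::vector<float>          normals;   // xyz per vertex
        std::vector<unsigned short> indices;   // 3 indices per triangle
    };

    void DrawMeshInBatches(const Mesh& mesh, size_t trianglesPerBatch = 100)
    {
        glEnableClientState(GL_VERTEX_ARRAY);
        glEnableClientState(GL_NORMAL_ARRAY);
        glVertexPointer(3, GL_FLOAT, 0, &mesh.positions[0]);
        glNormalPointer(GL_FLOAT, 0, &mesh.normals[0]);

        const size_t indicesPerBatch = trianglesPerBatch * 3;
        for (size_t first = 0; first < mesh.indices.size(); first += indicesPerBatch)
        {
            // Clamp the last batch to whatever indices remain.
            size_t count = mesh.indices.size() - first;
            if (count > indicesPerBatch)
                count = indicesPerBatch;
            glDrawElements(GL_TRIANGLES, (GLsizei)count,
                           GL_UNSIGNED_SHORT, &mesh.indices[first]);
        }

        glDisableClientState(GL_NORMAL_ARRAY);
        glDisableClientState(GL_VERTEX_ARRAY);
    }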

Hi ginko,

I was just thinking: do any of the meshes get quite far away from the view? If so, you could have a low-detail version of the mesh that you use when it gets far away. It's a shame to render 10,000 triangles, for example, if it's only going to take up 10 pixels on the screen.
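Very roughly, something like this is what I mean. It's only a sketch; the Mesh type, the positions and the switch distance are made up for illustration.

    // Sketch: pick a low- or high-detail mesh based on distance from the camera.
    // Mesh is whatever mesh type the renderer already uses; the 50-unit
    // threshold is an arbitrary example value.
    #include <cmath>

    struct Mesh; // existing mesh type, only referenced by pointer here

    struct LodSet {
        const Mesh* high; // full-detail mesh
        const Mesh* low;  // reduced-triangle version
    };

    const Mesh* PickLod(const LodSet& lods,
                        const float meshPos[3], const float cameraPos[3],
                        float switchDistance = 50.0f)
    {
        float dx = meshPos[0] - cameraPos[0];
        float dy = meshPos[1] - cameraPos[1];
        float dz = meshPos[2] - cameraPos[2];
        float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
        return (dist > switchDistance) ? lods.low : lods.high;
    }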

Just a thought.


Regards

Gaz

Hi,

Thanks for the replies.

An LOD solution would be a pretty good thing. But at some points all these meshes will be close enough to the camera (assuming distance as the metric) to require high detail levels. The frame rate drops accordingly, and we don't have enough FPS for that not to matter.

Triangle strips may be the way to go; I'll run some simple tests. I'm concerned in general about optimizing the vertex arrays themselves, both in size and in cache coherency.

The NV_VERTEX_ARRAY_RANGE extension, and the associated fence extension, will be useful to put in, but first I'd like to achieve frame rates close-ish to some other demos I've seen. (In particular, the demo of the NV_VERTEX_ARRAY_RANGE extension, with the extension turned off, gets roughly three times the performance we do in similar circumstances: building the vertex array each frame, using glDrawElements, similar triangle count. The difference is the use of triangle strips, and breaking the large calls down into a few glDrawElements calls.) Obviously comparing against demos is a silly thing to do, but it gives me something to aim at.
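For the strip route, the sort of submission loop I have in mind is roughly the following. It's just a sketch; the Strip type and data layout are assumptions, and the stripifier that builds the index lists isn't shown. The point is that a strip of N triangles only needs N + 2 indices, and each strip still goes down in a single glDrawElements call.

    // Sketch: submit pre-built triangle strips with glDrawElements.
    // The Strip struct is an illustrative assumption; strip generation
    // (the hard part) is assumed to happen offline.
    #include <GL/gl.h>
    #include <vector>

    struct Strip {
        std::vector<unsigned short> indices; // strip-ordered vertex indices
    };

    void DrawStrips(const std::vector<Strip>& strips,
                    const float* positions, const float* normals)
    {
        glEnableClientState(GL_VERTEX_ARRAY);
        glEnableClientState(GL_NORMAL_ARRAY);
        glVertexPointer(3, GL_FLOAT, 0, positions);
        glNormalPointer(GL_FLOAT, 0, normals);

        // One call per strip; longer strips amortise the per-call overhead
        // and reuse two vertices from each preceding triangle.
        for (size_t i = 0; i < strips.size(); ++i)
            glDrawElements(GL_TRIANGLE_STRIP,
                           (GLsizei)strips[i].indices.size(),
                           GL_UNSIGNED_SHORT, &strips[i].indices[0]);

        glDisableClientState(GL_NORMAL_ARRAY);
        glDisableClientState(GL_VERTEX_ARRAY);
    }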

Any more suggestions, particularly on optimal vertex array sizes, will be gratefully received.

Cheers,

Henry

On FlipCode they were discussing this a while back, and apparently nVidia's site stated that the optimal VB size was 2K. But then a guy from nVidia posted and said that it is definitely more optimal to use one large VB than to make multiple rendering calls. Ideally, you would have one VB for each different set of renderstates (since that's the minimum number of rendering calls you could make).
FYI =)
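In OpenGL terms, that batching looks roughly like the following. Purely illustrative: StateId, ApplyRenderState and the Batch layout are invented for the example; the idea is just to group geometry so there's one draw call per set of renderstates.

    // Sketch: one batch of geometry per render-state key, one draw call each.
    // StateId, Batch and ApplyRenderState are hypothetical stand-ins.
    #include <GL/gl.h>
    #include <map>
    #include <vector>

    typedef int StateId; // stand-in for a material / render-state key

    struct Batch {
        std::vector<float>          positions; // xyz per vertex
        std::vector<unsigned short> indices;
    };

    void ApplyRenderState(StateId state); // hypothetical: sets lighting, blending, etc.

    void DrawByState(const std::map<StateId, Batch>& batches)
    {
        glEnableClientState(GL_VERTEX_ARRAY);
        for (std::map<StateId, Batch>::const_iterator it = batches.begin();
             it != batches.end(); ++it)
        {
            ApplyRenderState(it->first);
            const Batch& b = it->second;
            glVertexPointer(3, GL_FLOAT, 0, &b.positions[0]);
            glDrawElements(GL_TRIANGLES, (GLsizei)b.indices.size(),
                           GL_UNSIGNED_SHORT, &b.indices[0]);
        }
        glDisableClientState(GL_VERTEX_ARRAY);
    }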


Zeus Interactive
