How to improve performance

Started by
24 comments, last by RPTD 18 years, 7 months ago
I'm working on a molecular modeling application. I have generated some views based on cylinders and spheres, drawing around 2000 atoms, with spheres for atoms and cylinders for bonds. The problem is that rendering is very slow, and transformations are slow as well. What should I do to improve the performance?
Display lists in combination with frustum culling should do the trick
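For the culling half of that suggestion, here is a minimal C++ sketch of a per-sphere frustum test. It assumes the six frustum planes have already been extracted and normalized elsewhere (not shown), with normals pointing into the visible volume. The names `Plane`, `AtomSphere`, `sphereInFrustum` and `visibleAtoms` are made up for illustration and are not from any library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Plane nx*x + ny*y + nz*z + d = 0 with a unit-length normal that points
// into the visible volume (assumption of this sketch).
struct Plane { float nx, ny, nz, d; };

// Hypothetical atom record: centre and radius of its bounding sphere.
struct AtomSphere { float x, y, z, radius; };

// Returns false only when the sphere lies completely behind one plane.
// The test is conservative: a sphere just outside a frustum corner may
// still be reported as visible.
bool sphereInFrustum(const std::array<Plane, 6>& frustum, const AtomSphere& a) {
    assert(a.radius >= 0.0f);
    for (const Plane& p : frustum) {
        const float dist = p.nx * a.x + p.ny * a.y + p.nz * a.z + p.d;
        if (dist < -a.radius) return false;
    }
    return true;
}

// Collects the indices of the atoms that survive culling, in array order.
std::vector<std::size_t> visibleAtoms(const std::array<Plane, 6>& frustum,
                                      const std::vector<AtomSphere>& atoms) {
    std::vector<std::size_t> out;
    out.reserve(atoms.size());
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        if (sphereInFrustum(frustum, atoms[i])) out.push_back(i);
    }
    return out;
}
```

Only the atoms whose indices come back would then be drawn, for example by calling the sphere display list once per visible atom.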

You could also cache the transformation and load it with glLoadMatrix
What the previous poster mentioned about display lists and culling is true, but I found several other important things:
- Store your atoms and bonds in a linear data-structure (array or list) and draw them with a for loop (as opposed to using a hierarchical structure like a scene graph, which is horrible for this kind of thing).
- Try reducing the vertex count for atoms and bonds: use a triangle-stripped, 20-triangle sphere for atoms and boxes instead of cylinders for bonds. I deliberately said vertex count instead of triangle count. Rendering performance in this case is about the number of glVertex/glNormal calls per display list.
- Prevent state changes between atoms (making some assumptions about your coloring here): set the color to white and draw all bonds, set the color to green and draw all carbon atoms, set the color to red and draw all oxygen atoms, etc.
- And I'm not sure about the glLoadMatrix thing. Your atoms probably need only a translation (no use rotating a sphere around its axis) and I doubt a single glLoadMatrix is faster than a single glTranslatef.
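The first and third points can be sketched together in C++: keep the atoms in one flat array and sort it by element, so the color only has to be set once per element. This is just an illustration. The `Element` enum, the `Atom` struct and both function names are hypothetical, and the actual GL calls are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical element tag; one draw color per element is assumed.
enum class Element { Carbon, Oxygen, Nitrogen, Hydrogen };

struct Atom {
    Element element;
    float x, y, z;
};

// Reorders the flat array so atoms of the same element are contiguous.
// stable_sort keeps the original relative order within each element.
void sortByElement(std::vector<Atom>& atoms) {
    std::stable_sort(atoms.begin(), atoms.end(),
                     [](const Atom& a, const Atom& b) {
                         return a.element < b.element;
                     });
}

// Counts how often the color would have to be set if the array were drawn
// front to back with a plain for loop, setting the color only when the
// element differs from the previous atom's.
std::size_t countColorChanges(const std::vector<Atom>& atoms) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        if (i == 0 || atoms[i].element != atoms[i - 1].element) ++changes;
    }
    return changes;
}
```

If the atom list does not change between frames, the sort only needs to run once, when the molecule is loaded.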

Tom
Can you tell me why a hierarchical structure would be horrible for this kind of thing? If the dimensions of the frustum and maybe an octree are chosen properly, you could cull most of it away

without doing the frustum check for each object.


Use display lists as already suggested, and maybe store lists of differently colored molecules in the leaves of an octree.

Also, if possible, you could render the near leaves of the octree first and save some pixel rasterization via the depth buffer,
although I doubt this will give you a big performance boost as long as you don't use any complex shader operations.
You could store the transformations in display lists and the drawing calls for the spheres and cylinders separately.
http://www.8ung.at/basiror/theironcross.html
Good tips. I'll add these.

First, with 2000 atoms each rendered as a sphere, it is not impossible that the animation is 'slow' (of course, it depends mainly on your card).
However, you can try to speed some things up:


  • You use spheres. How many triangles per sphere? You can reduce the number of stacks/slices until you reach an acceptable result. This is a uniform level of detail (LOD) reduction.


  • Put the geometry in a display list. In this case you can create a display list for a single sphere and then one for the entire geometry (nested display lists).
    Using immediate mode (glBegin(), glVertex(), glNormal()) is too slow because you send each float to the implementation, which cannot cache the data.


  • You can use level of detail (LOD) reduction. Create different display lists, each with a sphere/cylinder at a different resolution. Then use an algorithm to choose the right resolution based on the error you introduce (a real sphere would need an infinite number of triangles, but a few triangles, though imperfect, can still look nice).
    For simplicity: if the sphere is far away, use the display list with fewer triangles; if you are near it, use the display list with more triangles.
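One possible way to "choose the right resolution based on the error" is sketched below in C++. It is only an assumption about what such an algorithm could look like: the silhouette of a sphere with n slices is treated as a regular n-gon, whose largest gap to the true circle is r(1 - cos(pi/n)), and that gap is converted to pixels for a perspective projection. All function and parameter names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest gap between a circle of the given radius and an inscribed
// regular polygon with `segments` sides.
double chordError(double radius, int segments) {
    assert(segments >= 3);
    const double pi = std::acos(-1.0);
    return radius * (1.0 - std::cos(pi / segments));
}

// Approximate on-screen size, in pixels, of a world-space length seen at
// `distance` under a perspective projection with vertical field of view
// `fovYRadians` and a viewport `viewportHeight` pixels tall.
double projectedPixels(double length, double distance,
                       double fovYRadians, double viewportHeight) {
    assert(distance > 0.0);
    return length * viewportHeight /
           (2.0 * distance * std::tan(fovYRadians / 2.0));
}

// `segmentsPerLevel` lists the slice count of each LOD, coarsest first.
// Returns the index of the coarsest level whose projected error stays at or
// below `maxPixelError`, or the finest level if none qualifies.
std::size_t chooseLod(const std::vector<int>& segmentsPerLevel,
                      double radius, double distance,
                      double fovYRadians, double viewportHeight,
                      double maxPixelError) {
    assert(!segmentsPerLevel.empty());
    for (std::size_t i = 0; i < segmentsPerLevel.size(); ++i) {
        const double err = projectedPixels(
            chordError(radius, segmentsPerLevel[i]),
            distance, fovYRadians, viewportHeight);
        if (err <= maxPixelError) return i;
    }
    return segmentsPerLevel.size() - 1;
}
```

The returned index would pick which of the pre-built display lists to call for that atom. A plain distance threshold, as in the simplified rule above, is cheaper and may be good enough.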



I don't know your code, so I cannot suggest other optimizations.
OK, perhaps my first tip was a bit too strong. Using a structure like an octree will be beneficial if you just display static molecules. In my case I actually used an octree structure in my application (a virus containing ~300000 atoms, 3 FPS on a GeForce 6), but it didn't really work because all my particles are moving (which means massive updates to the octree each frame). I found that culling each atom separately performed better and was also much easier to implement. Anyway, my octree implementation may have been severely flawed, but I never looked into that because the simple method performed well enough.

What I really meant to say with that first point was that you shouldn't use a naive scene graph, where a 'molecule' node has all the atoms and bonds as children.

Tom

Quote:Original post by Basiror
Can you tell me why a hierarchical structure would be horrible for this kind of thing? If the dimensions of the frustum and maybe an octree are chosen properly, you could cull most of it away

without doing the frustum check for each object.


Octrees are slow by nature. It seems strange, but it is so [smile].
And culling rarely helps in this scenario, because a molecular model is viewed from a distance that lets you see the entire model (or a big part of it).
In this case the culling costs more than the benefit it gives. In practice, you risk iterating over the octree nodes only to end up with the same set of objects.

About octrees: the case is different when you fly a spaceship through a solar system [smile]

Another tip for soofian: you can use a very low LOD (or wireframe) while the model is moving, then switch to a high LOD when there is no need to animate it. The result is a visible pop, which is not nice to see, but it could be an efficient tradeoff.



If quality is an issue (you want the near spheres to look 'spherical', or normals are used to shade them nicely), you can use what's called a LOD (level of detail) mechanism to vary the number of polygons on your spheres (and cylinders) depending on the distance from the viewer. Objects in the foreground would be drawn with the maximum polygon count and those farther away with fewer and fewer.
You would sort the objects by distance (maybe into 3 sets) and draw each set with different meshes for the shapes. Beyond a certain distance you might draw only points (and no bond cylinders), as no detail would be visible for that set.
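A rough C++ sketch of the "3 sets" idea follows. Instead of a full sort it simply buckets atom indices by squared distance from the eye, which avoids the square root. The two distance limits and all names (`Vec3`, `DrawSets`, `partitionByDistance`) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// The three draw sets: detailed mesh, coarse mesh, and plain points.
struct DrawSets {
    std::vector<std::size_t> high;
    std::vector<std::size_t> low;
    std::vector<std::size_t> points;
};

// Buckets each centre by its distance from the eye. Squared distances are
// compared against squared limits, so no sqrt is needed per atom.
DrawSets partitionByDistance(const Vec3& eye,
                             const std::vector<Vec3>& centers,
                             float nearLimit, float farLimit) {
    assert(0.0f <= nearLimit && nearLimit <= farLimit);
    const float near2 = nearLimit * nearLimit;
    const float far2 = farLimit * farLimit;
    DrawSets sets;
    for (std::size_t i = 0; i < centers.size(); ++i) {
        const float dx = centers[i].x - eye.x;
        const float dy = centers[i].y - eye.y;
        const float dz = centers[i].z - eye.z;
        const float d2 = dx * dx + dy * dy + dz * dz;
        if (d2 <= near2) {
            sets.high.push_back(i);
        } else if (d2 <= far2) {
            sets.low.push_back(i);
        } else {
            sets.points.push_back(i);
        }
    }
    return sets;
}
```

Each set would then be drawn in one pass with its own mesh, and the last set as points only.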
How about textured billboards as a sphere replacement?
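In case it helps, a billboard here would be a camera-facing quad per atom with a sphere image on it. A minimal C++ sketch of computing the quad corners is below. It assumes the caller already has the camera's unit-length right and up vectors in world space, and the names `Vec3`, `offset` and `billboardCorners` are invented for the example.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

// Returns c + right * sr + up * su.
inline Vec3 offset(const Vec3& c, const Vec3& right, float sr,
                   const Vec3& up, float su) {
    return Vec3{c.x + right.x * sr + up.x * su,
                c.y + right.y * sr + up.y * su,
                c.z + right.z * sr + up.z * su};
}

// Corners of a square quad of half-size `radius` centred on the atom and
// spanned by the camera's right and up vectors, so it always faces the
// viewer. Order: bottom-left, bottom-right, top-right, top-left.
std::array<Vec3, 4> billboardCorners(const Vec3& center,
                                     const Vec3& camRight,
                                     const Vec3& camUp,
                                     float radius) {
    assert(radius > 0.0f);
    return {{offset(center, camRight, -radius, camUp, -radius),
             offset(center, camRight, radius, camUp, -radius),
             offset(center, camRight, radius, camUp, radius),
             offset(center, camRight, -radius, camUp, radius)}};
}
```

That is 4 vertices per atom instead of a full sphere mesh, at the cost of needing a texture and of the shading no longer reacting to the light direction unless the texture or a shader accounts for it.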
http://www.8ung.at/basiror/theironcross.html
Quote:Original post by dimebolt
Using a structure like an octree will be beneficial if you just display static molecules. In my case I actually used an octree structure in my application (a virus containing ~300000 atoms, 3 FPS on a GeForce 6), but it didn't really work because all my particles are moving (which means massive updates to the octree each frame). I found that culling each atom separately performed better and was also much easier to implement.


Just FYI, you would have been much better served by a dynamic AABB tree.
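For readers unfamiliar with the term: one common way dynamic AABB trees stay cheap for moving objects is to store a slightly enlarged ("fat") box per leaf and touch the tree only when the object's tight box leaves it. Whether that variant is what was meant here is an assumption. The C++ sketch below shows only that check, with hypothetical names, and leaves out the tree insertion and rebalancing entirely.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Axis-aligned bounding box given by its minimum and maximum corners.
struct Aabb { Vec3 lo, hi; };

// Grows a tight box by `margin` on every side.
Aabb fatten(const Aabb& tight, float margin) {
    assert(margin >= 0.0f);
    return Aabb{{tight.lo.x - margin, tight.lo.y - margin, tight.lo.z - margin},
                {tight.hi.x + margin, tight.hi.y + margin, tight.hi.z + margin}};
}

// True when `inner` lies completely inside `outer`.
bool contains(const Aabb& outer, const Aabb& inner) {
    return outer.lo.x <= inner.lo.x && outer.lo.y <= inner.lo.y &&
           outer.lo.z <= inner.lo.z && inner.hi.x <= outer.hi.x &&
           inner.hi.y <= outer.hi.y && inner.hi.z <= outer.hi.z;
}

// The leaf stored in the tree keeps the fat box. The tree only has to be
// updated when the object's new tight box has moved outside of it.
bool needsReinsert(const Aabb& fatBoxInTree, const Aabb& newTightBox) {
    return !contains(fatBoxInTree, newTightBox);
}
```

With a margin chosen relative to how far atoms move per frame, most frames would then skip the tree update for most atoms.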

[Edited by - Promit on August 23, 2005 12:02:07 PM]
