
OpenGL Multiple VBOs for large CAD models



I have programmed a working CAD visualisation engine for my work. The goal is to load very large models and display them at interactive rates.
The models have the following traits:
   - vertex counts range from small (150 vertices) to very large (16 million or more); 2 to 4 million vertices is very common
   - a large number of distinct models (in the thousands, up to 100,000)

The engine I have made so far has the following characteristics:
   - standard culling possibility (Kd-Tree + Frustum + pixel culling)
   - unified vertex format to limit state change
        28 bytes per vertex : 3 floats for position, 3 floats for the normal, 4 bytes for RGBA colour.
     More vertex formats are supported, but the actual scenarios use this one
   - state caching already implemented (no call to glBindBuffer if the buffer to bind is already the current one)
   - done in OpenGL & C++
   - Fixed Function Pipeline or Shader + VBO + IBO are used
   - Mesh optimisation : all duplicate vertices are removed, and each mesh is drawn with a single glDrawElements call
   - each Mesh has only 1 VBO and only 1 IBO (both static)
   - geometry is mostly static
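
The 28-byte unified format above can be sketched as a plain interleaved struct (assuming 3-float positions, which the 28-byte total implies; field names are illustrative, not the poster's):

```cpp
#include <cstdint>

// Sketch of the unified vertex format: 3 floats position, 3 floats normal,
// 4 bytes RGBA colour. 12 + 12 + 4 = 28 bytes, naturally aligned, so no
// packing pragma is needed.
struct CadVertex {
    float   position[3];  // 12 bytes
    float   normal[3];    // 12 bytes
    uint8_t rgba[4];      //  4 bytes
};

static_assert(sizeof(CadVertex) == 28, "vertex must stay 28 bytes");
```

Keeping every mesh in one interleaved layout like this is what makes the single-format / minimal-state-change approach work: the glVertexPointer / glVertexAttribPointer setup never has to change between meshes.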

Now on NVidia Quadro cards (1 GB VRAM) I have:
   - the recommended limit for vertex buffer size is 1 million vertices (result from glGetIntegerv(GL_MAX_ELEMENTS_VERTICES, &iMaxVertex))
   - some VBOs are already over this limit, but it seems to be fine (some are 16 times this limit). I typically get 20 to 60 FPS with more than 5 million vertices
   - I have already filled 1 GB of the graphics card with VBOs and IBOs (yes, the models are THAT big).

The problems I face are :
   - when the number of models increases, the FPS drops to 10 => probably due to too much VBO & IBO switching
   - the worst problem : after loading many models OR a few very large models, the FPS behaves strangely : it FREEZES for 3 seconds, then shoots back up to 15 FPS or more. Fifteen seconds later, it freezes again. So in summary, "it is sometimes slow, sometimes fast".

Now my 2 questions are:
  Question 1 : what can be the cause of the FPS freezes? It does not happen on an ATI Radeon HD 6550M

  Question 2 : for optimization, I have a dilemma. I can either:
                (a) pack everything into 1 VBO and 1 IBO => I save on glBindBuffer calls, but I get 2 monstrous buffers on the GPU, clearly over the maximum size returned by glGetIntegerv(GL_MAX_ELEMENTS_VERTICES, &iMaxVertex)
                (b) pack the data into several VBOs and IBOs, with each VBO and IBO at 95% of the max size returned by glGetIntegerv. I will have more state switching than with option (a), but less than currently.

Since both options are rather heavy to implement, I would like to know whether some of you have experience with this situation. I am leaning towards option (b), but it is not so convenient to split big meshes.
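
For either option, the bookkeeping is the same: sub-allocate each mesh into a shared buffer pair and rebase its indices by the mesh's base vertex, so one glDrawElements per mesh still works with an index offset. A minimal sketch of that packing step (names are illustrative; the real code would then upload the two arrays with glBufferData):

```cpp
#include <cstdint>
#include <vector>

// Where a mesh lives inside the shared VBO/IBO pair.
struct MeshRange {
    std::size_t baseVertex;  // first vertex of this mesh in the shared VBO
    std::size_t firstIndex;  // first index of this mesh in the shared IBO
    std::size_t indexCount;
};

struct SharedBuffers {
    std::vector<float>    vertices; // interleaved vertex data
    std::vector<uint32_t> indices;  // indices rebased into the shared VBO
};

// Appends one mesh to the shared buffers, rebasing its indices so they
// point at the mesh's vertices inside the big VBO.
MeshRange AppendMesh(SharedBuffers& shared,
                     const std::vector<float>& verts, std::size_t floatsPerVertex,
                     const std::vector<uint32_t>& inds)
{
    MeshRange r;
    r.baseVertex = shared.vertices.size() / floatsPerVertex;
    r.firstIndex = shared.indices.size();
    r.indexCount = inds.size();
    shared.vertices.insert(shared.vertices.end(), verts.begin(), verts.end());
    for (uint32_t i : inds)
        shared.indices.push_back(static_cast<uint32_t>(i + r.baseVertex));
    return r;
}

// Per-mesh draw call in the engine would then be:
// glDrawElements(GL_TRIANGLES, r.indexCount, GL_UNSIGNED_INT,
//                (const void*)(r.firstIndex * sizeof(uint32_t)));
```

For option (b) you would start a fresh SharedBuffers whenever the next mesh would push the vertex count past your chosen cap (e.g. 95% of GL_MAX_ELEMENTS_VERTICES).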

Thanks in advance for the help !


You say you've implemented a KD-tree, but what are the constraints on your view? I mean, give us a bit of a clue about the use case for your engine.

OK, so here is one use case (where freezing occurs):
- largest model : 16,051,900 vertices, 17,081,685 triangles (drawn with GL_TRIANGLES => 51,245,055 indices)
  => largest VBO : 428 MB (28 bytes per vertex)
     largest IBO : 195 MB (4 bytes per index)
You can see that the model is optimized : there are more triangles than vertices => most vertices are shared.
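
The two buffer sizes quoted above follow directly from the counts; a back-of-the-envelope check (MiB = 1024*1024 bytes):

```cpp
#include <cstdint>

// Sizes of the largest model's buffers: 28-byte vertices, 32-bit indices.
const uint64_t kVertexCount = 16051900ULL;
const uint64_t kIndexCount  = 51245055ULL;

const uint64_t kVboBytes = kVertexCount * 28;          // ~449.5 million bytes
const uint64_t kIboBytes = kIndexCount  * 4;           // ~205.0 million bytes

const uint64_t kVboMiB = kVboBytes / (1024 * 1024);    // 428 MiB
const uint64_t kIboMiB = kIboBytes / (1024 * 1024);    // 195 MiB
```

So this single model already accounts for roughly 623 MiB of the 1 GB of VRAM, which makes driver-side buffer eviction a plausible suspect for the periodic freezes.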

- 410 models loaded (410 VBOs + 410 IBOs) : see the attached gDEBugger GL extract for a list. You will see there are very big models and small ones
- 1 GB of memory used...

For the engine, here is how it works:
- draw loop, for each frame :
    - compute the frustum planes
    - frustum culling (with the KdTree), using bounding-sphere tests against the planes (very fast)
    - pixel culling (with the bounding sphere) => 1 distance calculation (ouch : sqrt), 2 multiplications, 1 division, 1 comparison
    - draw the remaining meshes : draw opaques, draw transparents, draw lines

- For frustum culling : I start at the KdTree root node and recursively traverse the tree until a leaf node is found inside the viewing frustum. I use plane / sphere distance calculations.
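
The plane/sphere test used during traversal can be sketched like this (a minimal reconstruction, assuming normalized plane equations; names are illustrative):

```cpp
struct Plane  { float nx, ny, nz, d; };   // n.p + d = 0, |n| = 1
struct Sphere { float cx, cy, cz, radius; };

// Signed distance from the sphere centre to the plane.
float SignedDistance(const Plane& pl, const Sphere& s) {
    return pl.nx * s.cx + pl.ny * s.cy + pl.nz * s.cz + pl.d;
}

// True if the sphere lies entirely on the negative (outside) half-space.
bool OutsidePlane(const Plane& pl, const Sphere& s) {
    return SignedDistance(pl, s) < -s.radius;
}

// A node's bounding sphere is culled only if it is fully outside at least
// one of the six frustum planes; otherwise the node is (conservatively) kept.
bool CulledByFrustum(const Plane planes[6], const Sphere& s) {
    for (int i = 0; i < 6; ++i)
        if (OutsidePlane(planes[i], s))
            return true;
    return false;
}
```

Note the test is conservative: a sphere outside the frustum but not fully outside any single plane is still kept, which is the usual trade-off for this cheap test.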

- For KdTree generation : I use the Surface Area Heuristic. It is done at model level (the meshes are not split)

- performance : at first, my models were composed of 1 or more VBOs each. I quickly saw this was leading to a performance bottleneck => I solved the problem by merging all of a model's VBOs into 1 and removing duplicated & unused vertices. So I know for sure that too many VBOs cause a performance bottleneck

- pixel culling : it is clearly a performance boost, especially when many small models are loaded (screws, nuts, ...)
For pixel culling, the minimal on-screen size is different depending on whether the camera is moving or static.
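
The pixel-culling test described above (one sqrt, a couple of multiplications, one division, one comparison) can be reconstructed roughly like this. The exact projection formula is not given in the thread, so this is an assumption: screenScale would typically be screenHeight / (2 * tan(fovY / 2)).

```cpp
#include <cmath>

// Skip a mesh when its bounding sphere projects to fewer pixels than a
// threshold (the threshold being larger while the camera is moving).
// dx,dy,dz is the camera-to-sphere-centre vector; all names are illustrative.
bool PixelCulled(float dx, float dy, float dz,
                 float radius, float screenScale, float minPixels)
{
    float dist = std::sqrt(dx * dx + dy * dy + dz * dz); // the one sqrt
    float projectedPixels = radius * screenScale / dist; // approx. on-screen radius
    return projectedPixels < minPixels;                  // the one comparison
}
```

With, say, a 1-unit screw 100 units away and screenScale = 500, the sphere projects to about 5 pixels, so a 10-pixel moving-camera threshold would cull it while a 3-pixel static threshold would keep it.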

- multipass rendering : after culling, I draw in this order: opaque triangles, transparent triangles, mesh edges. When the camera is moving, edges are not drawn.

I will try packing everything into 1 VBO and 1 IBO first anyway : it is the simplest to implement, and it might just work.
