
OpenGL Multiples VBO for large CAD models


seb_seb0

I have programmed a working CAD visualisation engine for my work. The goal is to load very large models and display them at interactive rates.
The models have the following traits:
   - vertex counts range from small (150 vertices) to very large (16 million or more); 2 to 4 million vertices is very common
   - a large number of different models (in the thousands, up to 100000...)

The engine I have made so far has the following characteristics:
   - standard culling (Kd-tree + frustum + pixel culling)
   - unified vertex format to limit state changes:
        28 bytes per vertex: 3 floats for position, 3 floats for normals, 4 bytes for colour RGBA
     More vertex formats are supported, but the actual scenarios use this one
   - state caching already implemented (no call to glBindBuffer if the buffer to bind is already the current one)
   - done in OpenGL & C++
   - Fixed Function Pipeline or shaders + VBO + IBO are used
   - mesh optimisation: all duplicate vertices are removed, and each mesh is drawn with a single call to glDrawElements
   - each mesh has only 1 VBO and only 1 IBO (both static)
   - geometry is mostly static
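For reference, the unified 28-byte format can be written as an interleaved struct (member names are illustrative, not from my actual code):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Interleaved layout for the unified 28-byte vertex format:
// 3 floats position, 3 floats normal, 4 bytes RGBA colour.
struct Vertex
{
    float   pos[3];     // offset 0
    float   normal[3];  // offset 12
    uint8_t rgba[4];    // offset 24
};

static_assert(sizeof(Vertex) == 28, "unified format is 28 bytes per vertex");

// The matching attribute setup would look like this (sketch, not compiled here):
//   glVertexAttribPointer(0, 3, GL_FLOAT,         GL_FALSE, sizeof(Vertex), (void*)offsetof(Vertex, pos));
//   glVertexAttribPointer(1, 3, GL_FLOAT,         GL_FALSE, sizeof(Vertex), (void*)offsetof(Vertex, normal));
//   glVertexAttribPointer(2, 4, GL_UNSIGNED_BYTE, GL_TRUE,  sizeof(Vertex), (void*)offsetof(Vertex, rgba));
```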

Now on NVidia Quadro cards (1 GB VRAM) I have:
   - the recommended limit for vertex buffer size is 1 million vertices (result from glGetIntegerv(GL_MAX_ELEMENTS_VERTICES, &iMaxVertex))
   - some VBOs are already over this limit, but it seems to be fine (some are 16 times this limit); I typically get 20 to 60 FPS with more than 5 million vertices
   - I have already filled up 1 GB with VBOs and IBOs on the graphics card (yes, the models are THAT big).

The problems I face are:
   - when the number of models increases, the FPS drops to 10 => probably due to too much VBO & IBO switching
   - the worst problem: after loading many models OR a few very large models, the FPS behaves strangely: it FREEZES for 3 seconds, then shoots back up to 15 or more FPS. And 15 seconds later, it freezes again. So in summary, the problem is "it is sometimes slow, sometimes fast".

Now my 2 questions are:
  Question 1: what can be the cause of the FPS freezes? It does not happen on an ATI Radeon HD 6550M card.

  Question 2: for optimisation, I have a dilemma. I can either:
                a. pack everything into 1 VBO and 1 IBO => I will save on glBindBuffer calls, but I will have 2 monstrous buffers on the GPU, clearly over the maximal size returned by glGetIntegerv(GL_MAX_ELEMENTS_VERTICES, &iMaxVertex)
                b. pack all data into several VBOs and IBOs, with each VBO and IBO at 95% of the max size returned by glGetIntegerv. I will have more state switching than with the 1st option, but less than currently.

Since both options are rather heavy to implement, I would like to know if some among you have experience with this situation. I am leaning more towards option b, but it is not so convenient to split big meshes.
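For option b, the bookkeeping could look like the greedy packer below: each mesh gets a slot (shared buffer index, base vertex, first index) and a new shared buffer is opened when the cap would be exceeded. Names and the cap value are illustrative; drawing would then use the recorded offsets, e.g. with glDrawElementsBaseVertex.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Where a mesh landed inside the shared buffers.
struct MeshSlot
{
    uint32_t bufferIndex; // which shared VBO/IBO pair
    uint32_t baseVertex;  // first vertex of this mesh in the shared VBO
    uint32_t firstIndex;  // first index of this mesh in the shared IBO
};

// Greedy packer: fill each shared buffer up to maxVerticesPerBuffer, then
// start a new one. A mesh larger than the cap gets a buffer of its own.
class BufferPacker
{
public:
    explicit BufferPacker(uint32_t maxVerticesPerBuffer)
        : m_cap(maxVerticesPerBuffer) {}

    MeshSlot Add(uint32_t vertexCount, uint32_t indexCount)
    {
        // Open a new shared buffer if none exists yet, or if the current
        // (non-empty) one cannot hold this mesh.
        if (m_usedVertices.empty() ||
            (m_usedVertices.back() != 0 &&
             m_usedVertices.back() + vertexCount > m_cap))
        {
            m_usedVertices.push_back(0);
            m_usedIndices.push_back(0);
        }
        MeshSlot slot;
        slot.bufferIndex = static_cast<uint32_t>(m_usedVertices.size() - 1);
        slot.baseVertex  = m_usedVertices.back();
        slot.firstIndex  = m_usedIndices.back();
        m_usedVertices.back() += vertexCount;
        m_usedIndices.back()  += indexCount;
        return slot;
    }

    std::size_t BufferCount() const { return m_usedVertices.size(); }

private:
    uint32_t m_cap;
    std::vector<uint32_t> m_usedVertices; // vertices used per shared buffer
    std::vector<uint32_t> m_usedIndices;  // indices used per shared buffer
};
```

With the slots recorded, each mesh is still one draw call, but many meshes share one glBindBuffer.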

Thanks in advance for the help !


RobinsonUK
You say you've implemented a KD-tree, but what are the constraints on your view? I mean, give us a bit of a clue about the use case for your engine.

seb_seb0
OK, so here is one use case (where freezing occurs):
- largest model: 16,051,900 vertices, 17,081,685 triangles (drawn with GL_TRIANGLES => 51,245,055 indices)
  => largest VBO: 428 MB (28 bytes per vertex)
     largest IBO: 195 MB (4 bytes per index)
  You can see that the model is optimised: there are more triangles than vertices => most vertices are shared.

- 410 models loaded (410 VBOs + 410 IBOs): see the gDEBugger GL extract attached for a list. You will see there are very big models, and small ones
- 1 GB of memory used...
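The buffer sizes above follow directly from the element counts and the per-element sizes:

```cpp
#include <cassert>
#include <cstdint>

// Buffer size in MB (MiB) for a given element count and per-element byte size.
inline double BufferSizeMB(uint64_t elementCount, uint64_t bytesPerElement)
{
    return double(elementCount * bytesPerElement) / (1024.0 * 1024.0);
}

// 16,051,900 vertices * 28 bytes -> ~428 MB VBO
// 51,245,055 indices  *  4 bytes -> ~195 MB IBO
```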

For the engine, here is how it works:
- draw loop:
  For each frame:
    compute the frustum planes
    do frustum culling (with the Kd-tree), using bounding-sphere tests against the planes (very fast)
    do pixel culling (with the bounding sphere) => 1 distance calculation (ouch: sqrt), 2 multiplications, 1 division, 1 comparison
    draw the remaining meshes: draw opaques, draw transparents, draw lines

- For frustum culling: I start at the Kd-tree root node and recursively traverse the tree until a leaf node is found in the viewing frustum. I use plane/sphere distance calculations.
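The sphere/plane rejection boils down to one signed-distance test per frustum plane. A minimal sketch, assuming the plane normals are normalised and point into the frustum:

```cpp
struct Vec3  { float x, y, z; };
struct Plane { Vec3 n; float d; }; // n normalised, pointing inside the frustum

// Signed distance from a point to a plane.
inline float Distance(const Plane& p, const Vec3& v)
{
    return p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d;
}

// A bounding sphere is culled when it lies entirely behind any plane.
bool SphereOutsideFrustum(const Plane planes[6], const Vec3& centre, float radius)
{
    for (int i = 0; i < 6; ++i)
        if (Distance(planes[i], centre) < -radius)
            return true; // fully outside this plane => cull
    return false;        // intersecting or inside => draw
}
```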

- For Kd-tree generation: I use the Surface Area Heuristic. It is done at model level (the meshes are not split).

- Performance: at first, my models were composed of 1 or more VBOs each. I quickly saw that this led to a performance bottleneck => I solved the problem by merging all VBOs into 1, removing duplicated & unused vertices. So I know for sure that too many VBOs cause a performance bottleneck.

- Pixel culling: it is clearly a performance boost, especially when many small models are loaded (screws, nuts, ...).
For pixel culling, the minimal size is different depending on whether the camera is moving or static.
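The pixel test above can be sketched as follows; projScale, the eye-space inputs and the thresholds are illustrative, not my exact code:

```cpp
#include <cmath>

// Cull a mesh whose bounding sphere projects to fewer than minPixels pixels.
// projScale = viewportHeight / (2 * tan(fovY / 2)), precomputed once per frame.
bool PixelCulled(float cx, float cy, float cz,   // sphere centre, in eye space
                 float radius, float projScale, float minPixels)
{
    const float dist = std::sqrt(cx * cx + cy * cy + cz * cz);
    const float projectedPixels = (radius * projScale) / dist;
    return projectedPixels < minPixels;
}
```

Using a larger minPixels while the camera moves (and a smaller one when it is static) gives the two thresholds described above.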

- Multipass rendering: after culling, I draw in this order: opaque triangles, transparent triangles, edges of the meshes. When moving, edges are not drawn.

I will try packing everything into 1 VBO and 1 IBO first anyway: it is the simplest to implement, and it might just work.

