Defend

OpenGL Rule of thumb for VAO & VBO usage?


I've had a Google around for this but haven't yet found any solid advice. There's a lot of "it depends", but I'm not sure what it depends on.

My question is: what's a good rule of thumb to follow when it comes to creating/using VBOs & VAOs? As in, when should I use multiple and when should I not? My understanding so far is that if I need a new VBO, then I need a new VAO. So when it comes to rendering multiple objects I can either:

* make lots of VAO/VBO pairs and flip through them to render different objects, or

* make one big VBO and jump around its memory to render different objects. 

I also understand that if I need to render objects with different vertex attribute layouts, then a new VAO is necessary.

If that "it depends" really is quite variable, what's best for a beginner with OpenGL, assuming that better approaches can be learnt later with better understanding?

 

Edited by Defend


I've seen comments about storing all UI vertex data in one big VBO. This is unnecessary since you can just store 4 vertices that form a 1 by 1 rectangle, and then translate and scale it to the correct position and dimensions. You can also use this 4-vertex VBO for billboards and particle effects (assuming you don't use a point sprite instead). This will save space on the GPU, but sending data from the CPU (after translating and scaling it) to the shader on the GPU will slow the game down, so it may be more efficient to store all billboards and particle effects together in one separate VAO & VBO.
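Roughly, that approach looks like this - just a sketch, where the attribute layout, the uRect uniform and the x/y/width/height values are made up for illustration, and the vertex shader is assumed to compute pos = corner * uRect.zw + uRect.xy:

// One shared unit quad (0,0)-(1,1), created once and reused for every UI element.
float quad[] = { 0,0,  1,0,  0,1,  1,1 };   // 4 corners in GL_TRIANGLE_STRIP order

GLuint vao, vbo;
glGenVertexArrays(1, &vao);
glGenBuffers(1, &vbo);
glBindVertexArray(vao);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, sizeof(quad), quad, GL_STATIC_DRAW);
glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 2 * sizeof(float), (void*)0);
glEnableVertexAttribArray(0);

// Per UI element: upload its rectangle, then draw the same 4 vertices again.
glUseProgram(uiProgram);
glUniform4f(glGetUniformLocation(uiProgram, "uRect"), x, y, width, height);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);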

As far as I know, one huge buffer is slower to work with than multiple smaller buffers that add up to the same size, but you probably won't have to worry about this if each model (not just a rectangle, but a complex model) is stored in its own VAO & VBO.

A short summary: it depends on how much data you store in the buffer, what you're going to do with it, and the game's performance. Also, any identical models can share the same VAO & VBO; just translate it to the correct position each time, for example boxes or a few chests scattered around a room or an area. Also, if your environment is not too huge you can combine the static environment models into one VBO. I would only do this for small areas, not an open world.
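For the chest example, that reuse could look something like this (a sketch; chestVao, chestVertexCount, the uModel uniform and the glm usage are all assumptions, not a prescribed setup):

#include <glm/glm.hpp>
#include <glm/gtc/type_ptr.hpp>
#include <vector>

// Draw one chest mesh at several places by reusing the same VAO/VBO.
void drawChests(GLuint program, GLuint chestVao, GLsizei chestVertexCount,
                const std::vector<glm::mat4>& chestTransforms)
{
    glUseProgram(program);
    glBindVertexArray(chestVao);
    GLint modelLoc = glGetUniformLocation(program, "uModel");
    for (const glm::mat4& model : chestTransforms)    // one matrix per placement
    {
        glUniformMatrix4fv(modelLoc, 1, GL_FALSE, glm::value_ptr(model));
        glDrawArrays(GL_TRIANGLES, 0, chestVertexCount);
    }
}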

10 hours ago, Yxjmir said:

I've seen comments about storing all UI vertex data in one big VBO. This is unnecessary since you can just store 4 vertices that form a 1 by 1 rectangle, and then translate and scale it to the correct position and dimensions.

Streaming data into a VBO from the CPU is cheap - each map/unmap/draw has a cost, but writing the actual data is very cheap (100k verts a frame shouldn't break a sweat, but 100k draw calls of one quad each is impossibly slow). 
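In practice that streaming is just something like this per frame (a sketch only - 'Vertex', 'uiVerts', 'streamVbo' and 'uiVao' are placeholder names, and orphaning the buffer before mapping is just one common streaming pattern):

// Orphan the old storage, map the new storage, copy this frame's vertices in, draw.
glBindBuffer(GL_ARRAY_BUFFER, streamVbo);
glBufferData(GL_ARRAY_BUFFER, uiVerts.size() * sizeof(Vertex), nullptr, GL_STREAM_DRAW);
void* dst = glMapBufferRange(GL_ARRAY_BUFFER, 0, uiVerts.size() * sizeof(Vertex),
                             GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
memcpy(dst, uiVerts.data(), uiVerts.size() * sizeof(Vertex));
glUnmapBuffer(GL_ARRAY_BUFFER);

glBindVertexArray(uiVao);                         // attribute layout set up elsewhere
glDrawArrays(GL_TRIANGLES, 0, (GLsizei)uiVerts.size());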

If you translate and scale it using a matrix, then the translation/scale data will be bigger than the extra vertices that you saved. If you tightly pack the translation/scale data in a uniform it will be smaller, but setting a uniform value is extremely expensive compared to writing a few extra vertices to a mapped buffer. Not to mention that sending that data via uniforms would mean that you're performing one draw call per quad instead of one draw call for the entire UI. 

The modern way to avoid those extra vertices is to fill a VBO with the per-quad data (position, size, uv area) - one entry per-quad instead of 4 vertices. Then bind the VBO to your shader as a TBO instead of using the VAO to bind it. In the vertex shader you can then read from the buffer manually using offset 'gl_VertexID / 4' (aka id>>2) to get the per-quad data, and then use 'gl_VertexID % 4' (aka id&3) to determine which corner is being processed / whether to offset positively or negatively by half the size. 
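A sketch of that vertex shader, written as the source string you'd hand to glShaderSource - the packing here is simplified to position+size in one RGBA32F texel (uv area left out for brevity), and it assumes an index buffer repeating the pattern 0,1,2, 2,1,3 (+4 per quad), which is what makes the /4 and &3 tricks line up:

const char* quadVertexShader = R"(
    #version 330 core
    uniform samplerBuffer uQuadData;    // TBO, one texel per quad: xy = position, zw = size (in pixels)
    uniform vec2 uViewportSize;         // used to convert pixels to clip space

    void main()
    {
        int quad   = gl_VertexID >> 2;                       // gl_VertexID / 4
        int corner = gl_VertexID & 3;                        // gl_VertexID % 4
        vec4 data  = texelFetch(uQuadData, quad);

        vec2 cornerOffset = vec2(corner & 1, corner >> 1);   // (0,0),(1,0),(0,1),(1,1)
        vec2 pixelPos     = data.xy + data.zw * cornerOffset;

        // pixels -> clip space, with y flipped so (0,0) is the top-left of the screen
        vec2 clip   = pixelPos / uViewportSize * 2.0 - 1.0;
        gl_Position = vec4(clip.x, -clip.y, 0.0, 1.0);
    }
)";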

3 hours ago, Hodgman said:

Streaming data into a VBO from the CPU is cheap - each map/unmap/draw has a cost, but writing the actual data is very cheap (100k verts a frame shouldn't break a sweat, but 100k draw calls of one quad each is impossibly slow). 

If you translate and scale it using a matrix, then the translation/scale data will be bigger than the extra vertices that you saved. If you tightly pack the translation/scale data in a uniform it will be smaller, but setting a uniform value is extremely expensive compared to writing a few extra vertices to a mapped buffer. Not to mention that sending that data via uniforms would mean that you're performing one draw call per quad instead of one draw call for the entire UI. 

The modern way to avoid those extra vertices is to fill a VBO with the per-quad data (position, size, uv area) - one entry per-quad instead of 4 vertices. Then bind the VBO to your shader as a TBO instead of using the VAO to bind it. In the vertex shader you can then read from the buffer manually using offset 'gl_VertexID / 4' (aka id>>2) to get the per-quad data, and then use 'gl_VertexID % 4' (aka id&3) to determine which corner is being processed / whether to offset positively or negatively by half the size. 

So, let's say we had a scene with several boxes, chests, and maybe other static objects - was I right that it would be better to store all of that (environment included) in one VBO? Also, is there a max size that a VBO can be, or is it just the maximum number of vertices your card can support?

51 minutes ago, Yxjmir said:

So, let's say we had a scene with several boxes, chests, and maybe other static objects - was I right that it would be better to store all of that (environment included) in one VBO? Also, is there a max size that a VBO can be, or is it just the maximum number of vertices your card can support?

Probably -- putting it in the same VBO means you can (potentially) draw it all in a single draw-call, which means there's less CPU overhead (fewer calls to GL functions) and the GPU will be happy (lots of parallel work to do in one big batch). To achieve this you've also got to merge all your materials into a single UBO/etc, bind all the textures at once, etc... Texture arrays can help with this, and you can put material data into a TBO instead of a UBO, or just use arrays inside a UBO and index into them.
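On the shader side, that batching can look something like this (a sketch, again as a shader source string - the 256-entry material array, the vMaterialId varying and the tint+layer packing are all made up for illustration):

const char* batchedFragmentShader = R"(
    #version 330 core
    uniform sampler2DArray uTextures;     // every material's texture is one layer of an array texture
    layout(std140) uniform Materials {    // every material's constants packed into one UBO
        vec4 tintAndLayer[256];           // rgb = tint colour, w = texture array layer
    };
    flat in int vMaterialId;              // per-draw material index passed down from the vertex shader
    in vec2 vUv;
    out vec4 fragColor;

    void main()
    {
        vec4 m    = tintAndLayer[vMaterialId];
        fragColor = texture(uTextures, vec3(vUv, m.w)) * vec4(m.rgb, 1.0);
    }
)";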

There is a max size and a way to query it in GL (I forget how exactly :o ). On ancient cards the limit might be a few megabytes, and on modern cards the limit will be a few gigabytes.

If you do need multiple draw-calls, you can still use one VBO though -- draw functions take an offset parameter which specifies where to start reading from the VBO. You can put one model at the start of the VBO and another in the middle. Then you can draw them both without having to change VBO/VAO bindings in between.
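For example (a sketch - sharedVao, the counts and the offsets are placeholders):

// Two models packed into the same VBO, drawn back to back without rebinding anything.
// countA/countB are their vertex counts; model B's vertices start right after model A's.
glBindVertexArray(sharedVao);                      // layout set up once for the shared VBO
glDrawArrays(GL_TRIANGLES, 0,      countA);        // model A starts at vertex 0
glDrawArrays(GL_TRIANGLES, countA, countB);        // model B starts where A ended

// Indexed version: the last argument is a byte offset into the bound index buffer.
glDrawElements(GL_TRIANGLES, indexCountB, GL_UNSIGNED_INT,
               (const void*)(firstIndexB * sizeof(GLuint)));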


