Vincent_M

OpenGL 3.0+ And VAOs


Vincent_M    969

I'm finally learning OpenGL above 2.1, which would require some extra driver knowledge on Linux, and a Hail Mary from Apple with regard to OpenGL 4.0 and above. What I'm wondering is: what are the important feature differences between OpenGL 3.x and OpenGL 4.x? My guess is that OpenGL 4.2 (I think) provides geometry shaders, which allow for hardware batching. In other words, if I had a model of a character, I could render dozens of instances of that character in 1 draw call per mesh in that model. Another is compute shaders in OpenGL 4.3, which are a nice replacement for OpenCL, more of an equivalent to DirectX 11, and possibly VERY useful for processing audio samples for interesting DSP effects that'd typically be handled by the motherboard's audio hardware. There are also 3D textures, and better techniques for rendering volumetric clouds, from what I've heard. Are there any other interesting features to look out for while I learn about OpenGL?

 

Now, how do vertex array objects (VAOs) work, exactly? From what I've read so far, they preserve vertex state, and by vertex state, I think it means the state of which vertex arrays are enabled. For example, say I have a model that is composed of 5 meshes, and the vertex format for all meshes is the same: position, texture coordinate and normal. So, when setting up my vertex array, I'd generate a VAO, bind it, enable the first 3 vertex attribute arrays, then unbind. Now, when I wanted to draw the model, I'd just bind that VAO again, bind my VBOs containing the vertex data, and call glDrawElements(). I no longer need to call glEnableVertexAttribArray() or glDisableVertexAttribArray() whenever I draw something because the VAO I've just bound preserves which vertex arrays to enable/disable, effectively batching, or rather, caching those calls into a single gl* call.

 

Then, there are VBOs... VBOs are completely separate from VAOs. A VBO must be generated per vertex attribute, whether they're separate arrays, or interleaved via structures, blobs, etc. Then, I may have an IBO (index buffer object) if my vertices are indexed, but again, it has nothing to do with VAOs. VAOs only cache which vertex attribute arrays are enabled. Is that correct?

 

NOTE: If this is correct, would it make sense to not store VAOs on a per-model basis, but on a per-graphics-context basis instead? If I have 5 different models that all happen to have the same number of vertex attribute arrays enabled, then I'd create 1 VAO that enables the first 3 vertex attribute arrays, bind it once, render all instances of all 5 models, then bind another VAO that uses a different number of arrays.

 

EDIT: I think I just realized something. So, I'd generate a new VAO, then bind it to configure it. At this point, I'd enable all the attribute arrays needed, and then generate, bind and fill my VBOs/IBOs. Then, I'd also set up glVertexAttribPointer() per attribute to specify the starting address for each attribute in the VBO, or VBOs if I'm going the array-per-attribute route. Finally, I'd unbind for safety. Then, when I want to draw something, it's a matter of setting the correct shader, setting the uniforms (probably with UBOs, but I haven't read that far yet), binding the VAO, and then drawing with glDrawArrays() or glDrawElements(). So, VAOs would greatly reduce the number of gl* calls by caching these commands in a VAO, which serves similarly to a mini command buffer that could be modified or called on the fly. If this is correct, does binding a VAO introduce any type of scope for binding VBOs? For example, if I bound a VBO while a VAO is bound, then once I bind VAO zero, would it revert the currently-bound VBO to whatever VBO was bound before I entered VAO scope? Does all of this sound about right?

Edited by Vincent_M

Kaptein    2224

You don't have to, and in many cases don't want to, use UBOs. I think the support is still very shaky, but I actually don't know the specifics. I have limited environments to test on.

 

Here is my VAO implementation:

https://github.com/fwsGonzo/library/blob/master/include/library/opengl/vao.hpp

https://github.com/fwsGonzo/library/blob/master/library/opengl/vao.cpp

 

Just like you said:

generate VAO

bind VAO

 

note that you don't want to enable attribs here, because you have no VBO bound

an enabled attrib is bound to the VBO you have bound, which means you can in theory have several VBOs with vertex data

 

generate VBO & IBO

bind VBO

upload data

enable attribs (use offsetof(struct, x))

 

(potentially bind IBO & upload data)

 

done. no need to unbind anything.

if you are using a wrapper for VAOs FBOs Textures and Shaders, these wrappers should manage this for you
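
The "enable attribs (use offsetof(struct, x))" step can be sketched roughly like this. It is only a sketch under assumptions: the interleaved Vertex struct, kLayout and layoutFor are hypothetical names and are not taken from the linked vao.hpp/vao.cpp. The GL calls are left as comments so the layout part compiles without a GL context.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Hypothetical interleaved vertex: position, texture coordinate, normal.
struct Vertex {
    float position[3];
    float texcoord[2];
    float normal[3];
};

// One entry per vertex attribute: the numbers glVertexAttribPointer needs.
struct AttribLayout {
    unsigned    index;       // attribute location used by the shader
    int         components;  // number of floats in the attribute
    std::size_t offset;      // byte offset inside Vertex
};

constexpr AttribLayout kLayout[] = {
    {0, 3, offsetof(Vertex, position)},
    {1, 2, offsetof(Vertex, texcoord)},
    {2, 3, offsetof(Vertex, normal)},
};

// Distance in bytes from one vertex to the next in the interleaved buffer.
constexpr std::size_t kStride = sizeof(Vertex);

// Returns the layout entry for an attribute index (which must exist).
inline const AttribLayout& layoutFor(unsigned index) {
    assert(index < sizeof(kLayout) / sizeof(kLayout[0]));
    return kLayout[index];
}

// With a context current, setup would follow the order of the steps above:
//   glGenVertexArrays(1, &vao);  glBindVertexArray(vao);
//   glGenBuffers(1, &vbo);       glBindBuffer(GL_ARRAY_BUFFER, vbo);
//   glBufferData(GL_ARRAY_BUFFER, count * sizeof(Vertex), data, GL_STATIC_DRAW);
//   for (const AttribLayout& a : kLayout) {
//       glEnableVertexAttribArray(a.index);
//       glVertexAttribPointer(a.index, a.components, GL_FLOAT, GL_FALSE,
//                             (GLsizei)kStride, (const void*)a.offset);
//   }
//   glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);  // this binding is stored in the VAO
```

As far as I know it is glVertexAttribPointer, not glEnableVertexAttribArray, that records which VBO an attribute reads from, which is why the VBO has to be bound before the pointer calls.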

 

Note that my implementation isn't 100% perfect. I even spotted grey areas just skimming through it right now, e.g. indexes() doesn't do a bind() to guarantee that the IBO bind works correctly. But it will hopefully give you an idea of how it all works.

 

When you upload data to a VBO you have a choice between GL_STATIC_DRAW and GL_STREAM_DRAW, the former for when your mesh is static and the latter for when you are re-uploading the data frequently. There are other flags, but afaik the drivers don't care.

 

So, with all that said, here are some tips:

1. You never really disable an attrib array, as you would just instead use a shader that doesn't utilize the specific attribute.

2. You should avoid unbinding anything, unless you absolutely have to.

3. Don't fall into the immediate mode trap for screenspace shaders, as suddenly glEnable(old_shit) matters, like GL_TEXTURE_2D.

I avoided this trap myself by having a very useful createScreenspace() function in my VAO implementation. :) Laziness > all.

 

Yes, when you unbind a VAO, you are suddenly back in old/VBO territory with gl*Pointer stuff, I guess. If you are in compatibility mode, like most people are.

Edited by Kaptein

Xycaleth    2391

What I'm wondering is: what are the important features differences between OpenGL 3.x and OpenGL 4.x?

 

The main differences between the latest OpenGL 3 and 4 versions off the top of my head are...

  • Tessellation shaders
  • Compute shaders
  • Support for 64-bit floats (doubles) in shaders
  • Separable shader objects - you essentially mix and match shaders in different parts of the graphics pipeline, similar to D3D.
  • Direct state access - no longer have to bind-to-edit
  • Shader storage buffer objects (shader-readable/-writeable memory buffers)
  • Indirect rendering
  • Immutable buffers and textures

Direct state access is part of OpenGL 4.5, which came out just a few weeks ago, so unless you have a relatively new Nvidia card it won't be available to you. For rendering multiple instances of the same model, you would use regular instancing (e.g. glDrawArraysInstanced, glDrawElementsInstanced, etc). With an array of model matrices as uniforms, and the gl_InstanceID variable in your vertex shader, you can then index into the array of matrices to position each instance differently :)
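
A rough sketch of the instancing idea, under stated assumptions: the shader string shows the uniform-array approach, with an array size of 32 that is an arbitrary cap chosen for the example, and emulateInstancedDraw is a hypothetical CPU-side stand-in (not a GL function) that shows what one instanced draw produces. A translation stands in for the full model matrix to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Vertex shader source as it would be passed to glShaderSource. The array
// size of 32 is an assumed cap on instances per draw, not a GL constant.
const char* const kInstancedVertexShader = R"(#version 330 core
layout(location = 0) in vec3 position;
uniform mat4 viewProjection;
uniform mat4 models[32];   // one model matrix per instance
void main() {
    gl_Position = viewProjection * models[gl_InstanceID] * vec4(position, 1.0);
}
)";

struct Vec3 { float x, y, z; };

// CPU reference for a single instanced draw: every vertex of the mesh is
// processed once per instance, and the instance number selects the
// per-instance data (here a translation instead of a matrix).
inline std::vector<Vec3> emulateInstancedDraw(const std::vector<Vec3>& mesh,
                                              const std::vector<Vec3>& offsets,
                                              std::size_t instanceCount) {
    assert(instanceCount <= offsets.size());
    std::vector<Vec3> out;
    out.reserve(mesh.size() * instanceCount);
    for (std::size_t instance = 0; instance < instanceCount; ++instance) {
        const Vec3& o = offsets[instance];   // plays the role of models[gl_InstanceID]
        for (const Vec3& v : mesh) {
            out.push_back({v.x + o.x, v.y + o.y, v.z + o.z});
        }
    }
    return out;
}

// The real draw is a single call once the uniforms are uploaded:
//   glDrawElementsInstanced(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT,
//                           nullptr, instanceCount);
```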

 

 

Now, how do vertex array objects (VAOs) work, exactly? [...] Then, there are VBOs..

Think of VAOs as containers for vertex attributes (and I guess for convenience, an index buffer). Each vertex attribute then describes where to fetch its data from, how much data to read each time, how many bytes to skip between each element, and so forth. And you can have multiple vertex attributes, like your position, texture coordinates, or even arbitrary data that is needed per-vertex (or per-instance*). So the VAO contains all of that information. Every time you bind the VAO, all this information is used in the subsequent draw calls until you bind a different VAO. I have found some drivers are a bit buggy in that they don't keep the index buffer, so you might need to rebind your index buffer every time you bind your VAO as well...

 

* Vertex attributes can be per-instance by using glVertexAttribDivisor, which tells GL to advance the attribute read-pointer every N instances.
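
To make the divisor rule concrete, here is a tiny sketch of which array element an attribute reads for a given vertex and instance. fetchedElement is a hypothetical helper written for this post, not a GL function, and it ignores base-instance offsets.

```cpp
// Which element of an attribute's array is fetched for a given vertex and
// instance. divisor == 0 means "advance per vertex" (the default);
// divisor == N means "advance once every N instances".
// Base-instance offsets are ignored in this sketch.
inline unsigned fetchedElement(unsigned divisor, unsigned vertexIndex,
                               unsigned instanceIndex) {
    return divisor == 0 ? vertexIndex : instanceIndex / divisor;
}

// Typical setup for a per-instance attribute at location 3:
//   glVertexAttribDivisor(3, 1);   // one element per instance
```

So with a divisor of 1 every vertex of instance 3 reads element 3, and with a divisor of 2 instances 2 and 3 share element 1.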

 

Where do VBOs come into it, you ask? Each vertex attribute has a "data source" which is your VBO, so you can use a single vertex buffer for all your attributes, or use a different vertex buffer for each attribute, or a mixture.
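
Here is a toy model of which pieces of state live where. It is NOT real OpenGL: ToyContext and its members are hypothetical names that only mimic the bookkeeping as I understand it. The point is that each attribute remembers the VBO that was bound when its pointer was specified, the index buffer binding belongs to the VAO, and the GL_ARRAY_BUFFER binding itself is plain context state.

```cpp
#include <array>
#include <cassert>

// Per-attribute state stored inside a VAO.
struct AttribState {
    bool     enabled = false;
    unsigned buffer  = 0;   // VBO captured when the pointer was specified
    int      size    = 4;   // components per element
    int      stride  = 0;
    long     offset  = 0;
};

struct VaoState {
    std::array<AttribState, 16> attribs{};
    unsigned elementArrayBuffer = 0;   // the IBO binding lives in the VAO
};

struct ToyContext {
    std::array<VaoState, 8> vaos{};    // vaos[0] plays the role of the default VAO
    unsigned boundVao    = 0;
    unsigned arrayBuffer = 0;          // GL_ARRAY_BUFFER: context state, not VAO state

    void bindVertexArray(unsigned vao) { assert(vao < vaos.size()); boundVao = vao; }
    void bindArrayBuffer(unsigned vbo) { arrayBuffer = vbo; }
    void bindElementArrayBuffer(unsigned ibo) { vaos[boundVao].elementArrayBuffer = ibo; }
    void enableVertexAttribArray(unsigned i) { vaos[boundVao].attribs.at(i).enabled = true; }

    // Like glVertexAttribPointer: snapshots the current GL_ARRAY_BUFFER binding.
    void vertexAttribPointer(unsigned i, int size, int stride, long offset) {
        AttribState& a = vaos[boundVao].attribs.at(i);
        a.buffer = arrayBuffer;
        a.size   = size;
        a.stride = stride;
        a.offset = offset;
    }
};
```

In this model, binding VBO 42 while VAO 1 is bound and then binding VAO 0 leaves arrayBuffer at 42. In other words a VAO does not open a "scope" for the GL_ARRAY_BUFFER binding, while the index buffer binding does switch along with the VAO.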

 

Perhaps beyond the scope of what you need or intend to do (but I'll add it anyway because I think it's something to consider), is a different way of thinking about VAOs which I came across a few months ago [1]. If instead of creating a VAO per object, you create a VAO per vertex format, you can reduce the number of glBindVertexArray calls (which in the driver would reduce the number of buffer changes). In order to do this, you would need to create a very large vertex buffer (a few tens of megabytes) and store all your models that share the same vertex format in it. Each model (or model sub-mesh) would then also need a base vertex, which is the "offset" in the VBO to start rendering from. So instead of Bind VAO, Draw, Bind VAO, Draw, Bind VAO, Draw, you now end up with Bind VAO, Draw, Draw, Draw, which not only cuts your GL calls in half pretty much, but also the number of potential buffer switches.
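
A minimal sketch of the bookkeeping for one shared buffer per vertex format. SharedGeometry, MeshRange and indexByteOffset are hypothetical names made up for this example; the draw itself would use glDrawElementsBaseVertex, shown in the trailing comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vertex { float position[3]; float texcoord[2]; float normal[3]; };

// Where one mesh lives inside the shared buffers.
struct MeshRange {
    std::int32_t  baseVertex;   // added to every index by the GL at draw time
    std::uint32_t firstIndex;   // first index inside the shared index buffer
    std::uint32_t indexCount;
};

// CPU-side staging for one vertex format, uploaded once into one VBO/IBO pair.
struct SharedGeometry {
    std::vector<Vertex>        vertices;
    std::vector<std::uint32_t> indices;

    MeshRange append(const std::vector<Vertex>& v,
                     const std::vector<std::uint32_t>& i) {
        for (std::uint32_t idx : i) assert(idx < v.size());   // indices are mesh-local
        MeshRange r{static_cast<std::int32_t>(vertices.size()),
                    static_cast<std::uint32_t>(indices.size()),
                    static_cast<std::uint32_t>(i.size())};
        vertices.insert(vertices.end(), v.begin(), v.end());
        indices.insert(indices.end(), i.begin(), i.end());    // stored unmodified
        return r;
    }
};

// Byte offset to pass as the `indices` argument of glDrawElementsBaseVertex.
inline std::size_t indexByteOffset(const MeshRange& r) {
    return r.firstIndex * sizeof(std::uint32_t);
}

// Draw loop with one VAO bound for the whole format:
//   glBindVertexArray(vao);
//   for each MeshRange r:
//       glDrawElementsBaseVertex(GL_TRIANGLES, r.indexCount, GL_UNSIGNED_INT,
//                                (const void*)indexByteOffset(r), r.baseVertex);
```

Keeping the indices mesh-local and letting the base vertex do the shifting means a mesh can be moved inside the big buffer without rewriting its index data.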

 

Eventually, you see the same can be applied to UBOs as well. Create a large UBO, and describe each 'chunk' with an offset and size. You can take it even further, and allocate a single large buffer, and use different ranges of it as your VBO, IBO and UBO! At this point, you're basically managing your own GPU buffer memory :D
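
The "offset and size per chunk" idea could look roughly like this. LinearBufferAllocator and BufferRange are hypothetical names for this sketch. The alignment value is an assumption that a real program would query with glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, ...), since offsets passed to glBindBufferRange for uniform buffers have to be multiples of it.

```cpp
#include <cassert>
#include <cstddef>

struct BufferRange { std::size_t offset; std::size_t size; };

// Rounds value up to the next multiple of alignment.
inline std::size_t alignUp(std::size_t value, std::size_t alignment) {
    assert(alignment != 0);
    return ((value + alignment - 1) / alignment) * alignment;
}

// Hands out non-overlapping ranges of one large buffer, front to back.
// There is no freeing in this sketch; a real allocator would need that too.
class LinearBufferAllocator {
public:
    LinearBufferAllocator(std::size_t capacity, std::size_t alignment)
        : capacity_(capacity), alignment_(alignment) {}

    // Returns false if the request does not fit; on success fills `out`.
    bool allocate(std::size_t size, BufferRange& out) {
        const std::size_t start = alignUp(head_, alignment_);
        if (start > capacity_ || size > capacity_ - start) return false;
        out = {start, size};
        head_ = start + size;
        return true;
    }

private:
    std::size_t capacity_;
    std::size_t alignment_;
    std::size_t head_ = 0;
};

// Binding one chunk as a uniform block:
//   glBindBufferRange(GL_UNIFORM_BUFFER, bindingPoint, buffer, r.offset, r.size);
```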

 

[1] http://www.ogre3d.org/forums/viewtopic.php?p=506783&sid=f629b3848582844ecb131a120ba21659#p506783 The poster, gsellers, is Graham Sellers from AMD.

Edited by Xycaleth

Vincent_M    969

Alright, thanks guys. I think I'm getting the hang of it. I've been busy the last 2 weeks with work and the gym, so I've rarely had the time to reply back, let alone test it out. I was able to try out VAOs and VBOs yesterday, and things are starting to click.

 

 


So, with all that said, here are some tips:
1. You never really disable an attrib array, as you would just instead use a shader that doesn't utilize the specific attribute.
2. You should avoid unbinding anything, unless you absolutely have to.
3. Don't fall into the immediate mode trap for screenspace shaders, as suddenly glEnable(old_shit) matters, like GL_TEXTURE_2D.
I avoided this trap myself by having a very useful createScreenspace() function in my VAO implementation. Laziness > all.

Thanks for clarifying about the unbinding part --that makes sense. By "screenspace shader", are you talking about post-processing? Also, does the OpenGL 4.x core spec eventually get rid of glEnable()/glDisable() entirely?

 

@Xycaleth, you bring up a good point on storing everything on a per-format basis. This could reduce the amount of gl* calls, which is always a good thing. These objects might have to be divided up into draw calls due to other factors such as drawing with/without depth, with/without blending, with/without lighting, etc. Btw, 

 

 

 


Perhaps beyond the scope of what you need or intend to do (but I'll add it anyway because I think it's something to consider), is a different way of thinking about VAOs which I came across a few months ago [1]. If instead of creating a VAO per-object, you create a VAO per-vertex format, you can reduce the number of glBindVertexArray calls (which in the driver would reduce the number of buffer changes).

 

This also brings up another question I was wondering: do VAOs provide more efficiency, or are they there for convenience for programmers? It sounds like VAOs are more of a shortcut for programmers to draw stuff to the screen without having to worry about enabling the correct attribute arrays, setting pointers, binding buffers, etc. Instead, VAOs do that for us, obviously, but under the hood, are VAOs really the equivalent of us doing that ourselves, meaning they increase programmer productivity instead of GPU performance? Or, is it caching the commands in a batched way similar to where GL 4.5's DSA methodology will be taking us?

 

At this point, I'm all theory though! I've been reading quite a bit online, books and making posts. I really need to make time to sit down, and write code lol.

 

EDIT: I noticed the Graham Sellers link you posted after writing this, and I'm starting to think that VAOs are in fact what my theory was:

 


Traditional APIs which generally have a function call per state change encourage bad behavior as seen by the GPU. Wrapping blobs of state into state objects or pushing the work of building them onto other threads only addresses the CPU side of the problem. The GPU still eats the same work. In some cases, it will eat more - the big, monolithic state object approach is likely to push a lot of redundancy into the pipe because a large number of states will be the same between objects.

 

I should have mentioned this before, but my theory is that if VAOs are merely there for productivity, then there could be more GPU overhead because now you have the VAO buffer that's eating up precious video memory, and yet another buffer swap to deal with, but it should cut down on CPU-side overhead as fewer gl* calls are being made. Is this correct?

 

EDIT 2: Back in my OpenGL ES 2.0 days, I didn't really mess with GL buffers much. Now that I've read the Graham Sellers post, I'm starting to realize that they can be looked at as just another memory map. My uber shader methodology, as nasty as it was, sounds like it's still the fastest alternative. In fact, it sounds like some of OpenGL 4's features don't really make OpenGL 4 much faster in terms of performance except maybe batching... Do OpenGL 4.3's batching features help with that?

Edited by Vincent_M

Xycaleth    2391

Also, does the OpenGL 4.x core spec eventually get rid of glEnable()/glDisable() entirely?

Starting with the introduction of the core spec, some glEnable/glDisable enums are no longer relevant. The reason for this was the move to a programmable pipeline. Take texturing for example. In a fixed function pipeline, you can bind a texture, specify texture coordinates, specify vertex colours, but it's up to you to tell the API whether you want to use texturing by calling glEnable(GL_TEXTURE_2D). Compare this with the programmable pipeline: if you don't want to use texturing, then your shaders will not use any texture sampling functions. If you do want to use texturing, then the shaders will use the sampling functions.

 

This also brings up another question I was wondering: do VAOs provide more efficiency, or are they there for convenience for programmers?

VAOs are purely a software feature (as far as I've seen), that is, the GPU doesn't have any knowledge of them. They're supposed to cut down on time spent validating the vertex attributes, switching buffers, but YMMV. Here's a good write up on when benefits can be seen or not seen: http://www.openglsuperbible.com/2013/12/09/vertex-array-performance/

 


In fact, it sounds like some of OpenGL 4's features don't really make OpenGL 4 much faster in terms of performance except maybe batching... Does OpenGL 4.3's batching features help with that?

If by batching, you mean instancing, then this is available since 3.1. I'm not sure what else you could mean :P

Kaptein    2224

A minor comment I would add:

There are many new things in OpenGL which help you reduce bugs too. The fewer states you have to worry about, the better.

Even so, many of these things were already solved by creating your own wrapper classes that deal with all of this, and that continues to be true now.

 

The new features in 4.x allow more batching, so you have to investigate whether or not you can rewrite parts of your pipeline to utilize these new features, or whether you should just keep using the old proven way. There are some new ways of batching, though, which I think are easier to leverage in the short term than, say, going for the full AZDO approach.

 

Look at the AZDO presentation (google) to see which order you should render things in, then figure out which features make sense for you and go from there.

Short of using any synchronizing functions (such as glGet*) that stall the entire pipeline, you're going to be fine. AZDO requires GL 4.4 btw. I think.

 

Advice about minimizing state changes and batching as much as possible is always true, but it's really only to help programmers make good architectural decisions.

Vincent_M    969

 
Vincent_M, on 31 Aug 2014 - 3:21 PM, said:
This also brings up another question I was wondering: do VAOs provide more efficiency, or are they there for convenience for programmers?
VAOs are purely a software feature (as far as I've seen), that is, the GPU doesn't have any knowledge of them. They're supposed to cut down on time spent validating the vertex attributes, switching buffers, but YMMV. Here's a good write up on when benefits can be seen or not seen: http://www.openglsuperbible.com/2013/12/09/vertex-array-performance/

I did see that post, and it looks like there are efficiency benefits for VAOs, but if it's just software, then I find it kind of unnecessary outside of it being forced upon you in OpenGL 4.x. My own state manager was a wrapper for whenever I switched FBOs, shader programs, VBOs, textures, glEnable/Disable, and enabling/disabling vertex arrays. The way the vertex array portion worked was that whenever I swapped my shader, and my GraphicsContext class recognized it as swapping to a different shader than the one currently in use, it'd enable/disable only the vertex arrays that differ from the last bound shader, because GraphicsContext also has its own client-side set of bools to keep track of which attribute arrays are currently active.

 

For example, let's just say my currently-bound shader only requires 1 vertex attribute array enabled, so only array 0 would be activated. Then, let's say later on in the frame I need to activate my lit-and-textured shader that takes 3 attribute arrays. It'd activate arrays 1 and 2 only since 0 was already activated. Then, when the next frame is drawn, and I need to go back to the single attribute array shader, it'll swap, and deactivate attribute arrays 1 and 2. This is simple for the user drawing something because all they have to do is call GraphicsContext::UseProgram(Shader *shader), and pass in the shader object they require. Now, I'm not sure how efficient the software implementation is, but if my objects were grouped up by shader, then by state, etc. you're really not calling glEnableVertexAttribArray()/glDisableVertexAttribArray() too much! Now, glVertexAttribPointer() still gets called per legit shader swap, but there are ways of further optimizing that using the massive VBO mentioned above, and also referenced in Graham Sellers' post.
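
The enable/disable diffing described above could be sketched like this. It is not the actual GraphicsContext code: AttribArrayTracker is a hypothetical name, it uses a bitmask instead of a set of bools, and the GL calls are injected as callbacks so the logic runs without a context.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Tracks which vertex attribute arrays are enabled and issues only the calls
// needed to reach the set the newly selected shader wants (bit i = array i).
class AttribArrayTracker {
public:
    using Call = std::function<void(unsigned index)>;

    AttribArrayTracker(Call enable, Call disable)
        : enable_(std::move(enable)), disable_(std::move(disable)) {
        assert(enable_ && disable_);   // both callbacks must be callable
    }

    void apply(std::uint32_t wanted) {
        const std::uint32_t changed = wanted ^ current_;
        for (unsigned i = 0; i < 32; ++i) {
            const std::uint32_t bit = std::uint32_t{1} << i;
            if (!(changed & bit)) continue;        // already in the right state
            if (wanted & bit) enable_(i); else disable_(i);
        }
        current_ = wanted;
    }

private:
    Call enable_;
    Call disable_;
    std::uint32_t current_ = 0;   // assumes every array starts disabled
};

// In a real renderer the callbacks would wrap the GL entry points:
//   AttribArrayTracker t([](unsigned i) { glEnableVertexAttribArray(i); },
//                        [](unsigned i) { glDisableVertexAttribArray(i); });
```

Going from mask 0x1 to 0x7 issues two enable calls (arrays 1 and 2), and going back issues two disable calls, matching the example in the post.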

 


Look at the AZDO presentation (google) to see which order you should render things in, then figure out which features make sense for you and go from there.
Short of using any synchronizing functions (such as glGet*) that stalls the entire pipeline, you're going to be fine. AZDO requires GL 4.4 btw. I think.

Ironically, I haven't needed to use any glGet* functions outside of glGetString(GL_VERSION) at startup to print the implementation string for logging purposes. The guys over at Steam mentioned in their video about porting their engine from DirectX to OpenGL that their Source Engine uses glGet* for nearly every state query they need, as they believe that all state systems deviate, at least slightly. I can see how this is true in some cases of the OpenGL state, but when it comes to things such as glEnable/Disable, writing a wrapper for setting/getting has always worked for me. Of course, my engine only assumes single-context rendering...

 

But yeah, GraphicsContext::SetGLState(unsigned int state, bool enable) -> pass in ANYTHING, and internally, it'll check whether that state's value is already in an STL vector. If enabling, and the state isn't in the vector yet, it calls glEnable and adds it to the vector of states. If disabling, and the state is in the vector, it removes it from the vector and calls glDisable. The method even returns a bool saying whether it actually changed the state or not. Same with GraphicsContext::UseProgram(Shader *shader), GraphicsContext::SetActiveTexture(int target, Texture *texture), I have one for FBOs, etc.
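
For anyone reading along, a minimal sketch of that kind of filter. It is not the actual GraphicsContext code: EnableStateCache is a hypothetical name, it swaps the STL vector for a std::unordered_set, and the GL calls are injected as callbacks so it can run without a context. It also assumes every capability starts out disabled, which as far as I know is not true for a few of them (GL_DITHER, for example), so a real version would need to seed the set.

```cpp
#include <cassert>
#include <functional>
#include <unordered_set>
#include <utility>

// Filters redundant glEnable/glDisable calls by remembering what is enabled.
class EnableStateCache {
public:
    using Call = std::function<void(unsigned state)>;

    EnableStateCache(Call enable, Call disable)
        : enable_(std::move(enable)), disable_(std::move(disable)) {
        assert(enable_ && disable_);   // both callbacks must be callable
    }

    // Returns true if a GL call was issued, false if the state was already set.
    bool set(unsigned state, bool enable) {
        if (enable) {
            if (!enabled_.insert(state).second) return false;   // already enabled
            enable_(state);
        } else {
            if (enabled_.erase(state) == 0) return false;       // already disabled
            disable_(state);
        }
        return true;
    }

private:
    Call enable_;
    Call disable_;
    std::unordered_set<unsigned> enabled_;
};

// In a real renderer:
//   EnableStateCache cache([](unsigned s) { glEnable(s); },
//                          [](unsigned s) { glDisable(s); });
//   cache.set(GL_DEPTH_TEST, true);
```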

 

This cut down quite a few gl* calls in general on mobile devices using OpenGL ES 2.0, and I assume it'll only do more good on desktop environments with instancing.

Edited by Vincent_M
