OpenGL Overhead with gl* Calls

Vincent_M    969

This question is about OpenGL in general, but I think it's more specific to the mobile side of things, where efficiency is especially important. Over the years, I've developed my OpenGL ES games with the mindset that making any sort of gl* call will almost always query the GPU, as mobile implementations of OpenGL are completely hardware-accelerated, like modern implementations of OpenGL. That being said, any thread making a gl* call will be halted while the CPU queries the GPU, waiting for a response. Is this correct so far?


I understand that cutting down on glEnable/glDisable calls by sorting renderable elements with similar states, and also wrapping those two in my own state manager, is important. I also do the same when binding buffers and textures. You could call SetTexture(GL_TEXTURE_2D, 0, &tex->name) to bind to the first texture unit in 2D. If that particular texture name has already been bound for that target at that texture unit, then it won't do it again. This comes in handy when rendering multiple instances of the same model, because it calls glBindTexture(), glBindBuffer(), etc. once for the first model, and all subsequent calls are skipped because they use the same texture/buffer parameters that are common to the loaded model they share. Same for checking shaders. It's pretty common for multiple models to use the same shader in a scene. Imagine rendering dozens of individual models to the screen, but only having to call glUseProgram() twice each frame instead of once per instance rendered. I mean, since I'm still using OpenGL 2.1 (OpenGL ES 2.0 for mobile), glDrawElements() is called once per mesh per instance of the model drawn. For example, drawing 12 instances of a model with 5 meshes would be 60 draw calls. This could be heavy on mobile until I learn about instancing in higher versions of OpenGL and support OpenGL ES 3.0 on mobile.
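For anyone reading along, here is a minimal sketch of the kind of texture-binding cache described above. It is not the poster's actual code: the class name, the 8-unit limit, the by-value texture name, and the injected bind callback are all assumptions made so the example compiles without GL headers. In a real renderer the callback would presumably call glActiveTexture(GL_TEXTURE0 + unit) followed by glBindTexture(target, name).

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>

// Stand-ins for GLenum/GLuint so the sketch builds without GL headers.
using Enum = std::uint32_t;
using Name = std::uint32_t;

constexpr Enum kTexture2D = 0x0DE1;       // numeric value of GL_TEXTURE_2D
constexpr Enum kTextureCubeMap = 0x8513;  // numeric value of GL_TEXTURE_CUBE_MAP

// Hypothetical cache: remembers the last name bound per (unit, target) and
// forwards to the real bind function only when the binding changes.
class TextureBindCache {
public:
    static constexpr unsigned kMaxUnits = 8;  // assumed upper bound on units
    using BindFn = std::function<void(unsigned unit, Enum target, Name name)>;

    explicit TextureBindCache(BindFn bind) : bind_(std::move(bind)) {}

    // Returns true if a real bind was issued, false if it was skipped.
    bool SetTexture(Enum target, unsigned unit, Name name) {
        assert(unit < kMaxUnits);
        assert(target == kTexture2D || target == kTextureCubeMap);
        std::optional<Name>& bound =
            (target == kTexture2D) ? slots_[unit].tex2d : slots_[unit].cube;
        if (bound == name) return false;  // already bound: skip the GL call
        bind_(unit, target, name);
        bound = name;
        return true;
    }

    // Forget everything, e.g. after the context is lost or recreated.
    void Invalidate() { slots_.fill(Slot{}); }

private:
    struct Slot {
        std::optional<Name> tex2d;  // empty means "unknown, must bind"
        std::optional<Name> cube;
    };
    BindFn bind_;
    std::array<Slot, kMaxUnits> slots_{};
};
```

The cache starts in an "unknown" state, so the first SetTexture() for each slot always reaches the driver. The same pattern would extend to glBindBuffer() and glUseProgram() with one cached value each.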


My question is: is managing OpenGL state internally in my engine worthwhile? Is it a huge performance hit to call glBindTexture() constantly (especially on mobile), or do OpenGL implementations usually check this already? Should I just focus on keeping draw calls down, or is the way I'm managing my states pretty important too?
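To make the difference between draw calls and state changes concrete, here is a rough sketch that sorts a frame's draw list by shader and texture, then counts how many switches a cached renderer would issue. The DrawItem layout and function names are hypothetical, not taken from any engine mentioned in this thread.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical per-draw record: one entry per mesh per model instance.
struct DrawItem {
    std::uint32_t program;  // shader program name
    std::uint32_t texture;  // texture name
    std::uint32_t mesh;     // identifier of the mesh's buffers
};

struct SwitchCount {
    int program = 0;  // glUseProgram()-style switches
    int texture = 0;  // glBindTexture()-style switches
};

// Counts the switches needed when walking the list in order, assuming
// redundant binds are skipped by an application-side cache.
SwitchCount CountSwitches(const std::vector<DrawItem>& items) {
    SwitchCount n;
    bool first = true;
    DrawItem prev{};
    for (const DrawItem& d : items) {
        if (first || d.program != prev.program) ++n.program;
        if (first || d.texture != prev.texture) ++n.texture;
        prev = d;
        first = false;
    }
    return n;
}

// Orders draws so items sharing a program, then a texture, are adjacent.
void SortByState(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) {
                  return std::tie(a.program, a.texture, a.mesh) <
                         std::tie(b.program, b.texture, b.mesh);
              });
}
```

Sorting leaves the number of draw calls unchanged; it only reduces how often the state between them changes. This ordering is only safe for opaque geometry, since blended objects usually need to be drawn back to front.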


From what I've read about OpenGL 4.5, it's going through a significant rewrite to be closer to vendor-specific implementations such as AMD's Mantle, NVIDIA's CUDA and even iOS's soon-to-be Metal API (ok, so that one's OS-specific, working on providing efficient OpenGL drivers under the hood), so we could set a texture at a specified target at a specific active texture unit in one function call instead of two.

Hodgman    51341

That being said, any thread making a gl* call will be halted while the CPU queries the GPU, waiting for a response. Is this correct so far?
No. Most gl calls will just do CPU work in a driver and not communicate with the GPU at all.

The GPU usually lags behind the CPU by about a whole frame (or more), so GPU->CPU data readback is terrible for performance (can instantly halve your framerate). glGet* functions are the scary ones that can cause this kind of thing.


Most gl functions are just setting a small amount of data inside the driver, doing some error checking on the arguments, and setting a dirty flag.

The glDraw* functions then check all of the dirty flags, and generate any required actual native GPU commands (bind this texture, bind this shader, draw these triangles...), telling the GPU how to draw things. This is why draw-calls are expensive on the CPU-side; the driver has to do a lot of work inside the glDraw* functions to figure out what commands need to be written into the command buffer.
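As a rough illustration of the dirty-flag pattern described above, here is a toy model. It is not how any real driver is implemented; the class, the command strings, and the two tracked states are invented for the example.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a driver: state setters only record values and mark
// them dirty; the draw function turns dirty state into commands.
class ToyDriver {
public:
    // Cheap: store the value and set a flag, nothing is emitted yet.
    void UseProgram(std::uint32_t name) {
        program_ = name;
        dirty_ |= kProgramDirty;
    }
    void BindTexture(std::uint32_t name) {
        texture_ = name;
        dirty_ |= kTextureDirty;
    }

    // Expensive: resolve every dirty flag into a command, then draw.
    void DrawElements(int indexCount) {
        if (dirty_ & kProgramDirty)
            commands_.push_back("set_program " + std::to_string(program_));
        if (dirty_ & kTextureDirty)
            commands_.push_back("set_texture " + std::to_string(texture_));
        dirty_ = 0;
        commands_.push_back("draw " + std::to_string(indexCount));
    }

    // The commands written so far, in order.
    const std::vector<std::string>& Commands() const { return commands_; }

private:
    enum : std::uint32_t { kProgramDirty = 1u << 0, kTextureDirty = 1u << 1 };
    std::uint32_t program_ = 0;
    std::uint32_t texture_ = 0;
    std::uint32_t dirty_ = 0;
    std::vector<std::string> commands_;
};
```

In this toy version, setting a texture to the value it already has still marks it dirty, so the next draw emits a command for it. Whether a real driver filters that case out is implementation-specific.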

These commands aren't sent to the GPU synchronously -- instead they're written into a "command buffer". The GPU asynchronously reads commands from this buffer and executes them, but like I said above, the GPU will usually have about a whole frame's worth of commands buffered up at once, so there's a big delay between the CPU writing commands and the GPU executing them.
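The lag between writing and executing commands can be pictured with a small single-threaded simulation. This is only a sketch under simplifying assumptions: the queue type and its methods are made up, and a real GPU runs concurrently rather than being stepped by the caller.

```cpp
#include <cstddef>
#include <deque>

// Toy command queue: the "CPU" appends commands and returns immediately,
// the "GPU" consumes them later at its own pace.
class ToyQueue {
public:
    // CPU side: record a command tagged with its frame number.
    void Submit(int frame) { pending_.push_back(frame); }

    // GPU side: execute at most `budget` commands; returns how many ran.
    std::size_t Execute(std::size_t budget) {
        std::size_t done = 0;
        while (done < budget && !pending_.empty()) {
            pending_.pop_front();
            ++done;
        }
        return done;
    }

    // Synchronous readback: the CPU cannot continue until every command
    // submitted so far has executed. Returns how many it had to wait for.
    std::size_t ReadBack() { return Execute(pending_.size()); }

    // Commands written by the CPU but not yet executed.
    std::size_t Pending() const { return pending_.size(); }

private:
    std::deque<int> pending_;
};
```

Submit() costs the same no matter how far behind the consumer is, while ReadBack() costs more the larger the backlog. That is the sense in which a readback can stall a frame.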

Vincent_M    969

Ok, so that being said, is it ok to make common, repetitive calls to glEnable(), glDisable(), glUseProgram(), glBindTexture(), etc. with the same parameter values, or should I continue to provide extra logic to reduce the number of gl* calls being made? I never use glGet* calls unless it's glGetUniformLocation(), and that's just once when my shader is successfully compiled and loaded.


Apple's docs have stated in the past that it's important to provide our own state machines for GL states, but now I'm starting to think it's meant only to be an alternative to constantly querying the GPU for what states are enabled. It's been years since I've read that, anyway... I learned earlier this year that GL_TEXTURE_2D no longer needs to be enabled in OpenGL ES 2.0, which I always assumed was necessary as I came from using OpenGL ES 1.1.

C0lumbo    4411

I think it's considered good practice to remove unnecessary gl calls by shadowing the state on the application side. Certainly Apple's tools (OpenGL ES Analyzer, for instance) explicitly warn you about each and every redundant state change you make, so while we can't know exactly what their driver is doing, it'd be reasonable to assume that each redundant state change you make is causing the driver to actually add extra stuff into the command buffer.

tonemgub    2008

Apple's docs have stated that it's important to provide our own state machines for GL states

I'm no expert when it comes to OpenGL, but I think they recommend this mostly because of context resets (from


    If the reset notification behavior is NO_RESET_NOTIFICATION_EXT,
    then the implementation will never deliver notification of reset
    events, and GetGraphicsResetStatusEXT will always return
       [fn1: In this case it is recommended that implementations should
        not allow loss of context state no matter what events occur.
        However, this is only a recommendation, and cannot be relied
        upon by applications.]

Stainless    1875

I have had issues in the past where enable and disable calls had a significant effect on performance.


You have to remember when working in the mobile world that not all devices are created equal.


Even devices with the same exact chipset will probably have a different software stack, and hence different performance.


A classic case is the nightmare of compiling shaders on mobile devices. I have had a case with two devices with the same GPU (Mali) and very similar hardware: one compiled the shader into 317 instructions, while the other failed to compile the shader at all because it went over the 512-instruction limit.


Doing things in the best possible way from day one can really help you down the line. Honestly, it may be boring and a pain in the posterior, but it is worth all the effort when the game "just runs" on every device you test it on.


There is nothing worse than sitting there trying to figure out why the game crashes on a device that you don't own :)
