Sigvatr

OpenGL Performance and Optimization


Hi there,

Like many other poor, unfortunate souls, I built my OpenGL game engine on the skin and bones of well-known tutorials and articles on the web, which left it heavily dependent on slow techniques such as glBegin/glEnd loops and other immediate-mode junk. Its performance is lackluster at the moment, and I have the feeling it can be improved significantly.

I was wondering what techniques I can use to speed up my engine. Right now, the GL code is scattered in and out of the engine's internal logic and systems, so I'm wondering if GL is getting confused or lagging because it is depending on my engine to handle a bunch of stuff for every frame while it is trying to run in and of itself.

I don't know too much about it, but is there a way to encapsulate all of my OpenGL code in a place where the engine internals and logic aren't slowing it down or messing it up? Someone mentioned that a lot of my OpenGL code is running in software mode, which is not good. I think the general idea should be to get as much of it as possible into hardware mode.

I think I most probably should handle things in this fashion:

1) Perform non-OpenGL engine and game work (or move it to a separate thread entirely).
2) Update an intermediary "buffer" that translates non-OpenGL data (i.e. model and sprite positions, animations, deformations, etc.) into data that OpenGL can consume when it is time to render (using vertex arrays, vertex buffers, etc.).
3) Render all of the OpenGL stuff without having to manipulate any data (as all of the manipulation has been performed prior to the render call).

The gist I am getting from people is that I need to stop using glBegin/glEnd loops for literally everything and make as much use of vertex arrays and vertex buffers as possible. I've also been advised that display lists are deprecated, so I think I will try to avoid those.

However, as much as I think I should avoid software (non-OpenGL) processing in the middle of the OpenGL rendering process, there are a few things that I think will be unavoidable, such as setting uniforms in shaders and things like that.

Anyway, basically I'm just looking for pointers on how to optimize my rendering. Am I correct about the things I have stated above? I think that to increase the performance of my engine I need to do the following:

1) Keep the non-OpenGL and OpenGL processing as separate from one another as possible (and probably in different threads at some point).
2) Translate all necessary data into vertex arrays and vertex buffers before the OpenGL rendering process.
3) Perform all the OpenGL rendering at once.

Feedback very much appreciated. I'd also like to know what aspects of OpenGL are handled in software and what are handled in hardware.

Cheers,
- Sig


[quote]
I was wondering what techniques I can use to speed up my engine. Right now, the GL code is scattered in and out of the engine's internal logic and systems, so I'm wondering if GL is getting confused or lagging because it is depending on my engine to handle a bunch of stuff for every frame while it is trying to run in and of itself.
[/quote]

I don't know where you came up with this, but it sounds like BS to me. OpenGL doesn't get 'confused'.


[quote]
The gist I am getting from people is that I need to stop using glBegin/glEnd loops for literally everything and make as much use of vertex arrays and vertex buffers as possible. I've also been advised that display lists are deprecated, so I think I will try to avoid those.
[/quote]

This, on the other hand, is a good idea. Immediate mode is terribly slow compared to using vertex buffers. One thing that can slow down OpenGL is making lots and lots of OpenGL calls, which is what immediate mode is all about. If you can render a mesh with 5-10 calls, that's a big improvement over per-vertex OpenGL commands.
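To make the difference concrete, here is a minimal sketch of the CPU side of that switch: instead of emitting each vertex through its own GL call, the geometry is packed into one contiguous array that can be handed over in a couple of calls. The `Sprite` and `Vertex` structs and the function name are hypothetical, not taken from the original poster's engine, and the GL calls appear only in a comment because they need a live context.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One vertex as it would sit in a vertex buffer: position + texture coordinate.
struct Vertex {
    float x, y;
    float u, v;
};

// Hypothetical sprite description owned by the game logic.
struct Sprite {
    float x, y;           // bottom-left corner
    float width, height;
};

// Appends two triangles (6 vertices) per sprite to one contiguous array.
std::vector<Vertex> buildSpriteVertices(const std::vector<Sprite>& sprites) {
    std::vector<Vertex> out;
    out.reserve(sprites.size() * 6);
    for (const Sprite& s : sprites) {
        assert(s.width >= 0.0f && s.height >= 0.0f);
        const float x0 = s.x, y0 = s.y;
        const float x1 = s.x + s.width, y1 = s.y + s.height;
        // First triangle
        out.push_back({x0, y0, 0.0f, 0.0f});
        out.push_back({x1, y0, 1.0f, 0.0f});
        out.push_back({x1, y1, 1.0f, 1.0f});
        // Second triangle
        out.push_back({x0, y0, 0.0f, 0.0f});
        out.push_back({x1, y1, 1.0f, 1.0f});
        out.push_back({x0, y1, 0.0f, 1.0f});
    }
    return out;
}

// With a GL context and a bound buffer, the whole array could then be submitted
// in a handful of calls rather than one call per vertex attribute, roughly:
//   glBufferData(GL_ARRAY_BUFFER, verts.size() * sizeof(Vertex), verts.data(), GL_STATIC_DRAW);
//   glDrawArrays(GL_TRIANGLES, 0, static_cast<GLsizei>(verts.size()));
```

For geometry that never changes, the array only needs to be built and uploaded once; the per-frame cost is then the draw call itself.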


[quote]
However, as much as I think I should avoid software (non-OpenGL) processing in the middle of the OpenGL rendering process, there are a few things that I think will be unavoidable, such as setting uniforms in shaders and things like that.
[/quote]
This is wrong. It doesn't matter what you do in between OpenGL calls. An OpenGL call just puts a little data in a pushbuffer for the GPU to fetch on its own. This happens completely separately from what's going on in your client-side program, and OpenGL doesn't care if you do other things in between calls. In fact, I could imagine this being slightly faster than batching up all of the calls at the end of the frame, as you're giving the GPU more time between when you start handing it work and when you wait for it to finish so you can swap the frame. Still, you may prefer to put all the OpenGL code together just for the sake of organization and your sanity :)

Your engine probably won't get that much faster if you only modularize everything (separating the rendering from the game logic), but it is highly encouraged because of all the benefits you get when working with it.
Adding new features will become a lot easier, and so on.

From a performance perspective you want to talk to the GPU as little as possible. The GPU shouldn't be spending half its time processing commands or data from your engine; it should just render things :-)
So storing your geometry data in buffer objects is of course the first step, because you will only send the data once. The second thing is to order your rendering so that as few OpenGL state changes as possible are made (changing shaders, textures, ...).
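One possible way to order the rendering is to give every draw a sort key built from its state, as in the sketch below. The `DrawItem` struct, the bit layout of the key, and the assumption that a shader switch is the most expensive change are all illustrative choices, not something measured in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw record; the ids stand in for GL object names.
struct DrawItem {
    std::uint32_t shader;
    std::uint32_t texture;
    std::uint32_t mesh;
};

// Packs the state into one integer. The shader sits in the highest bits so that
// shader switches (assumed here to be the most expensive change) happen least often.
std::uint64_t sortKey(const DrawItem& d) {
    // Each id must fit into its bit field: 16 bits shader, 24 bits texture, 24 bits mesh.
    assert(d.shader < (1u << 16) && d.texture < (1u << 24) && d.mesh < (1u << 24));
    return (std::uint64_t(d.shader) << 48) | (std::uint64_t(d.texture) << 24) | d.mesh;
}

void sortByState(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) { return sortKey(a) < sortKey(b); });
}

// Number of times the shader would have to be bound when drawing in this order.
std::size_t countShaderBinds(const std::vector<DrawItem>& items) {
    std::size_t binds = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].shader != items[i - 1].shader) ++binds;
    }
    return binds;
}
```

A list that alternates between two shaders needs one bind per item before sorting and only two binds afterwards. Which state belongs in the high bits depends on what is actually expensive on the target hardware, so it is worth profiling before settling on a layout.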

First you should bring all your OpenGL code to one central point. It is much easier to have all your rendering code together, and you'll need it to be together for some optimizations you can do later.
Start using vertex buffers, but don't build them every frame; most of your vertex buffers only have to be filled at load time (just like textures).

If this isn't a big enough improvement for you, you can think about culling: there is no need to make render calls for objects you won't even see.
Finally, you can do sorting/batching; since you now have all your rendering in a central place, you can do this relatively easily.
As for having a rendering thread, I'm not sure, but if you insist you can look into command buffers.
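The culling step mentioned above can be sketched as a bounding-sphere test against the six planes of the view frustum. This is only one common approach, and the `Plane`/`Sphere` types and function names are hypothetical. The sketch assumes plane normals are unit length and point into the frustum.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// A plane n.p + d = 0 whose unit normal points into the frustum.
struct Plane { float nx, ny, nz, d; };

// Bounding sphere of an object in the same space as the planes.
struct Sphere { float x, y, z, radius; };

// Returns false only when the sphere lies completely outside one of the planes.
bool isVisible(const Sphere& s, const std::array<Plane, 6>& frustum) {
    assert(s.radius >= 0.0f);
    for (const Plane& p : frustum) {
        const float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius) return false;  // entirely behind this plane
    }
    return true;
}

// Collects the indices of the objects that still need a render call.
std::vector<std::size_t> cullVisible(const std::vector<Sphere>& bounds,
                                     const std::array<Plane, 6>& frustum) {
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < bounds.size(); ++i) {
        if (isVisible(bounds[i], frustum)) visible.push_back(i);
    }
    return visible;
}
```

The test is conservative: a sphere near a frustum corner can pass even though it is not on screen, which costs an unnecessary draw but never drops a visible object.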



[quote]
However, as much as I think I should avoid software (non-OpenGL) processing in the middle of the OpenGL rendering process, there are a few things that I think will be unavoidable, such as setting uniforms in shaders and things like that.

This is wrong. It doesn't matter what you do in between OpenGL calls. An OpenGL call just puts a little data in a pushbuffer for the GPU to fetch on its own. This happens completely separately from what's going on in your client-side program, and OpenGL doesn't care if you do other things in between calls. In fact, I could imagine this being slightly faster than batching up all of the calls at the end of the frame, as you're giving the GPU more time between when you start handing it work and when you wait for it to finish so you can swap the frame. Still, you may prefer to put all the OpenGL code together just for the sake of organization and your sanity :)
[/quote]


This isn't really true, in fact the OP is more on the ball here.
The first thing is that GPU drivers will buffer up a couple of frames worth of calls before they are dispatched. So you might have to issue quite a few draw calls before the card starts processing what you ask it to. (I believe both AMD and NV buffer at least about 2 frames worth of data before getting under way).

In fact, the better way to go about it is to buffer up all your GL/D3D calls until you have a list of what you want to process and then blast through it. The reason is that each call transitions from your code to the driver and back; by batching everything up and doing it in one go you get better cache usage on the CPU, which can be a win.

An increasingly common method of dealing with this is to figure out what you want to draw and put it into a sorted list (well, some form of contiguous memory buffer anyway), then when you have all this up front you can do the least amount of work possible to render things as you can batch up state block setting and instanced draw calls to make good use of data and instruction caches.

(This setup also lets you assemble the draw list over multiple threads before using a final thread to kick the draw off which can speed things up vs trying to sort and draw everything on a single thread).

All that said the amount you want to worry about all that depends on just how hard you want/need to push the system. Assembling a draw list across multiple threads for example is an extreme end optimisation as it certainly won't be easy to do and do right/quickly.
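As a rough illustration of the sorted draw list described above, the sketch below collapses runs of identical keys in an already sorted list into batches, each of which could be issued as one instanced draw. The `Batch` struct and the idea that a single 64-bit key identifies "same state, same mesh" are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A run of draws that share state and mesh and can be issued together.
struct Batch {
    std::uint64_t key;   // hypothetical combined state + mesh identifier
    std::size_t first;   // index of the first entry in the sorted list
    std::size_t count;   // number of instances covered by this batch
};

// Collapses consecutive equal keys into batches. The input must already be sorted.
std::vector<Batch> buildBatches(const std::vector<std::uint64_t>& sortedKeys) {
    assert(std::is_sorted(sortedKeys.begin(), sortedKeys.end()));
    std::vector<Batch> batches;
    for (std::size_t i = 0; i < sortedKeys.size(); ++i) {
        if (!batches.empty() && batches.back().key == sortedKeys[i]) {
            ++batches.back().count;  // same state and mesh: extend the current batch
        } else {
            batches.push_back({sortedKeys[i], i, 1});
        }
    }
    return batches;
}

// With a GL 3.1+ context, each batch could then map to a single call such as
//   glDrawArraysInstanced(GL_TRIANGLES, 0, vertexCount, static_cast<GLsizei>(batch.count));
// after binding the state the key stands for once.
```

Because the keys live in one contiguous vector, the final pass that issues the draws walks memory linearly, which is the cache-friendliness argument made above.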

Structuring your code better is definitely highly recommended, but it's not going to get you anything if you're bottlenecked elsewhere. The first thing you need to do is get away from immediate mode; even just a simple switch to vertex arrays will be of benefit to you here. You also need to examine your code carefully for any potential software fallbacks and for cases which could force your CPU and GPU to sync. On modern hardware you can quite easily fall back to software in some places just by using formats that are not natively supported by your hardware. A classic example is glTexSubImage2D; if you're using that anywhere in your code, and if you're using a GL_RGB format, you're probably falling back to software for texture updates. Anything that needs to read back from the GPU will cause a sync, so watch out for glReadPixels, occlusion queries, etc.

OpenGL.org has a good Common Mistakes page that you can use as a reference for sanity-checking your code. Its explicit purpose is to address bad code that people have picked up from tutorials: "Quite a few websites show the same mistakes and the mistakes presented in their tutorials are copied and pasted by those who want to learn OpenGL". Some of the items discussed on it will probably ring a bell with you.

Occlusion queries don't have to stall: you can poll whether the result is ready (GL_QUERY_RESULT_AVAILABLE) before reading it.
GL_RGB...? Can someone verify this? Because I've heard the exact opposite.

I don't know much about this, but I have also read a few times that using BGR instead of RGB is recommended for performance reasons.


[quote]
I don't know much about this, but I have also read a few times that using BGR instead of RGB is recommended for performance reasons.
[/quote]


From what I understand, BGR is used by some image formats such as TGA files, although I am sure there is more to it than just that.

See: http://www.opengl.or...and_pixel_reads

[quote]
And if you are interested, most GPUs like chunks of 4 bytes. In other words, RGBA or BGRA is preferred. RGB and BGR are considered bizarre since most GPUs, most CPUs and any other kind of chip don't handle 24 bits. This means the driver converts your RGB or BGR to what the GPU prefers, which typically is BGRA.
[/quote]
Also read the section following it on "Image Precision" - http://www.opengl.org/wiki/Common_Mistakes#Image_precision

I can produce a small test app that confirms this 100% - GL_BGRA is up to 6 times faster than GL_RGB, even on NVIDIA hardware. Your hardware stores textures internally in BGRA order (more or less no matter what you specify for internalformat), so sending data in any other format will cause a slowdown by requiring it to be converted. Formats like GL_RGB are nothing more than cruddy old crap left over from the days of SGI workstations and 3DFX cards.
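If your image loader only hands you tightly packed 24-bit RGB, one option is to do the conversion yourself once at load time, so the driver never has to do it during an upload. The function below is a minimal sketch of that idea; the name is hypothetical and it assumes 8 bits per channel with no row padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands tightly packed 8-bit RGB pixels to 8-bit BGRA with opaque alpha.
std::vector<std::uint8_t> rgbToBgra(const std::vector<std::uint8_t>& rgb) {
    assert(rgb.size() % 3 == 0);  // must contain whole pixels
    const std::size_t pixels = rgb.size() / 3;
    std::vector<std::uint8_t> bgra(pixels * 4);
    for (std::size_t i = 0; i < pixels; ++i) {
        bgra[i * 4 + 0] = rgb[i * 3 + 2];  // blue
        bgra[i * 4 + 1] = rgb[i * 3 + 1];  // green
        bgra[i * 4 + 2] = rgb[i * 3 + 0];  // red
        bgra[i * 4 + 3] = 255;             // alpha
    }
    return bgra;
}

// The result would then be passed to glTexImage2D or glTexSubImage2D
// with GL_BGRA as the pixel format argument.
```

Whether this actually pays off depends on the driver and hardware, so it is worth timing texture uploads before and after on the machines you care about.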
