OpenGL: What is batching?



I have heard the term batching several times before, and was wondering what it is (and how it can be applied in OpenGL -- though I don't want this post to be OpenGL specific). From the little I have found (at last, a topic where Google let me down!), I think it has something to do with organizing the rendering of your geometry so that you make as few state changes as possible. As an example, I thought a simple 2D tile engine would make sense. I personally would render it with a triangle list (maybe a quad list?). Each quad would have its own texture (hence the tile engine). Does it make sense to sort the quads by their texture, and use GL_QUADS for each group? Please help a confused nubblet. Thanks.
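
Something like this is what I'm picturing (just a sketch in classic fixed-function OpenGL; the Tile struct and its fields are made up for illustration):

#include <GL/gl.h>
#include <algorithm>
#include <vector>

// Hypothetical tile: one textured quad of the 2D map.
struct Tile {
    GLuint texture;          // which tile texture this quad uses
    float  x, y, size;       // position and size in world units
    float  u0, v0, u1, v1;   // texture coordinates
};

// Sort by texture, then bind each texture once and submit all of its
// quads between a single glBegin(GL_QUADS)/glEnd() pair.
void DrawTiles(std::vector<Tile>& tiles)
{
    std::sort(tiles.begin(), tiles.end(),
              [](const Tile& a, const Tile& b) { return a.texture < b.texture; });

    GLuint bound = 0;
    bool   open  = false;
    for (const Tile& t : tiles) {
        if (!open || t.texture != bound) {
            if (open) glEnd();                   // can't bind inside glBegin/glEnd
            glBindTexture(GL_TEXTURE_2D, t.texture);
            bound = t.texture;
            glBegin(GL_QUADS);
            open = true;
        }
        glTexCoord2f(t.u0, t.v0); glVertex2f(t.x,          t.y);
        glTexCoord2f(t.u1, t.v0); glVertex2f(t.x + t.size, t.y);
        glTexCoord2f(t.u1, t.v1); glVertex2f(t.x + t.size, t.y + t.size);
        glTexCoord2f(t.u0, t.v1); glVertex2f(t.x,          t.y + t.size);
    }
    if (open) glEnd();
}

Is that roughly the right idea, or is batching something more than this?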

I am not sure that what you describe is necessarily batching. However, sorting geometry to minimize the number of OpenGL state changes is definitely a good thing to do for any moderately complex scene, I think.

I was under the impression that batching meant sending a bunch of polygons to the hardware (or at least to the OpenGL layer) at the same time, in order to maximize throughput to the vertex pipelines.

Please correct me as I want to know this as well.

Rob

Good batching means to take advantage of as much coherence in your scene as possible, mainly in order to reduce CPU overhead.

You can look at your render states as a bunch of sort keys. The first question is which key to sort your scene by.

Object #1:
Render target: A
Blending mode: K
Z enable: true

Object #2:
Render target: A
Blending mode: K
Z enable: false

Object #3:
Render target: B
Blending mode: K
Z enable: true

Object #4:
Render target: B
Blending mode: K
Z enable: true

etc.

Objects 3 and 4 have the same render states, so they can be drawn one after another without changing any state in between, or in some cases they could even be drawn in a single draw call (for example, if they are both in world space).
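
A minimal sketch of the idea (the names and the key packing are made up; the point is that the most expensive state to change goes in the most significant bits):

#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical renderable: whatever is needed to issue its draw call,
// plus the states it requires.
struct Renderable {
    uint8_t renderTarget;   // e.g. A = 0, B = 1
    uint8_t blendMode;      // e.g. K = 0
    bool    zEnable;
    // ... vertex buffer, texture, shader, etc.
};

// Pack the states into one integer so that sorting by it groups objects
// sharing the most expensive state (the render target) first.
uint32_t SortKey(const Renderable& r)
{
    return (uint32_t(r.renderTarget) << 16) |
           (uint32_t(r.blendMode)    <<  8) |
           (r.zEnable ? 1u : 0u);
}

void SortForBatching(std::vector<Renderable>& scene)
{
    std::sort(scene.begin(), scene.end(),
              [](const Renderable& a, const Renderable& b) {
                  return SortKey(a) < SortKey(b);
              });
    // Walk the sorted list and only touch a state when the key changes;
    // objects 3 and 4 above end up adjacent and share everything.
}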

It is easy to worry too much about this, but I have found that thinking about it a bit when you start your engine can help in the long run.

Many games still draw small pieces of geometry in separate calls. It is preferable for things like clumps of grass or rocks, or maybe even buildings in an RTS, to put them all in a big dynamic vertex buffer and draw them all at once.

For my particle system, I group all particles that were created together (like from an explosion) as one object, and I cull or draw all of them in one call. That is an example of good batching; poor batching would be to draw each particle one at a time.
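
As a rough sketch of the "one group, one call" idea (OpenGL vertex arrays here, with a made-up Particle struct; a real system would expand each particle into a camera-facing quad, but the point is the single glDrawArrays per group):

#include <GL/gl.h>
#include <vector>

struct Particle {
    float x, y, z;
    // ... velocity, age, color, etc.
};

// Copy every particle of one group (e.g. one explosion) into a single
// CPU-side array each frame, then issue one draw call for the whole
// group instead of one call per particle.
void DrawParticleGroup(const std::vector<Particle>& particles)
{
    std::vector<float> positions;
    positions.reserve(particles.size() * 3);
    for (const Particle& p : particles) {
        positions.push_back(p.x);
        positions.push_back(p.y);
        positions.push_back(p.z);
    }

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, positions.data());
    glDrawArrays(GL_POINTS, 0, (GLsizei)particles.size());
    glDisableClientState(GL_VERTEX_ARRAY);
}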

This is more of an issue in DirectX, but it can still help in OpenGL as well.

Quote:
I cull or draw all of them in one call.


What? You mean like instancing? I thought OpenGL didn't do that...

In DirectX (you said you did not want this thread to be OpenGL specific) a batch is typically a single DrawPrimitive or DrawIndexedPrimitive (Draw* etc.) call. The more of these calls you make, the more likely you are to be CPU bound. So batching in this context is not just reducing state changes, but rendering as much of your geometry as possible in a single call. A single call implies that the states must be the same for all the geometry in that call. There are a number of techniques and tricks for increasing batch size -- many of them the same as the techniques for decreasing state changes, except that even one cheap state change will split your batch.

Instancing, as an example technique, seems to be directly targeted at increasing batch size. Texture packing is another technique that applies to batching, and perhaps to your 2D problem: you combine your textures into one and then adjust your texcoords so that each piece of geometry uses only a portion of the combined texture. This should allow you to draw a number of objects, each with a different source texture, in one call.
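
A minimal sketch of the texcoord adjustment, assuming a hypothetical atlas that packs equal-sized tiles in a square grid:

// Given a tile's index in an N x N atlas and its original 0..1 texcoords,
// remap the coords into that tile's sub-rectangle of the packed texture.
void RemapToAtlas(int tileIndex, int tilesPerRow,
                  float uIn, float vIn, float* uOut, float* vOut)
{
    float cell = 1.0f / tilesPerRow;       // size of one tile in texture space
    int   col  = tileIndex % tilesPerRow;
    int   row  = tileIndex / tilesPerRow;
    *uOut = (col + uIn) * cell;
    *vOut = (row + vIn) * cell;
}

With every quad's coords remapped like this they can all share one texture bind; leaving a small border around each tile (or avoiding filtering/mipmapping across tile edges) helps prevent bleeding between neighbouring tiles.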

The last paper I read from ATI said that a 1GHz box would be CPU bound at 25,000 batches a second (DirectX). I don't know how all this applies to OpenGL, but I have heard that it is not nearly as important there.

Typically you only really need to worry about expensive state changes - which boils down to texture binds and shader binds. Other state tends to be much cheaper to change, so instead worry about grouping individual polys into chunks of geometry to render at that point.

If you've got some really expensive shaders, a rough front-to-back z sort can be good too.
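
Something like this comparator captures that ordering (just a sketch; depth would be each object's distance from the camera, computed every frame):

struct DrawItem {
    unsigned shader;    // shader/program id -- the most expensive bind
    unsigned texture;   // texture id
    float    depth;     // distance from the camera
};

// Group by the expensive binds first, then draw near objects before far
// ones so the z-buffer can reject expensively shaded pixels early.
bool DrawOrder(const DrawItem& a, const DrawItem& b)
{
    if (a.shader  != b.shader)  return a.shader  < b.shader;
    if (a.texture != b.texture) return a.texture < b.texture;
    return a.depth < b.depth;
}

Then sort the frame's draw list with std::sort(items.begin(), items.end(), DrawOrder) before submitting it.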

Quote:
The last paper I read from ATI said that a 1GHz box would be CPU bound at 25,000 batches a second (DirectX). I don't know how all this applies to OpenGL, but I have heard that it is not nearly as important there.

Here I'm getting ~10 million batches/sec with OpenGL (Athlon 64 2.0GHz, GeForce FX 5900).

I'll give an example where grouping by render state was more helpful than just reducing draw calls.

In my engine, I created per-caster shadow maps, so each shadow caster had its own 64x64 shadow map. I would render to it, then draw the floor using this texture, then do the next character, etc.

It was way too slow - like 80 fps with only one light and a couple of characters.

I changed it so that all characters allocated a 64x64 chunk on a 256x256 shadow map instead, drew all characters into this map, changing some shader constants and the viewport in between, then drew the receiver geometry with each sub-texture in turn.

This brought me up to ~200 fps. So I still made the same number of draw calls, but I was able to avoid switching render targets, which was a huge win.
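
In OpenGL terms the viewport trick looks roughly like this (a sketch; BindShadowTarget and RenderCasterDepth are placeholders for however the engine binds its 256x256 shadow render target and draws one caster's depth):

#include <GL/gl.h>

void BindShadowTarget();            // hypothetical: bind the 256x256 shadow target
void RenderCasterDepth(int caster); // hypothetical: draw caster's depth geometry

// One 256x256 shadow texture holds a 4x4 grid of 64x64 per-caster tiles.
// The render target is bound once; only the viewport moves per caster.
void RenderShadowTiles(int casterCount)
{
    BindShadowTarget();
    for (int i = 0; i < casterCount; ++i) {
        int col = i % 4;
        int row = i / 4;
        glViewport(col * 64, row * 64, 64, 64);
        RenderCasterDepth(i);
    }
    // Receivers then sample each caster's 64x64 sub-rectangle of the map.
}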
