getoutofmycar

OpenGL: Understanding data flow in a multi-threaded render pipeline


I'm having some difficulty understanding how data would flow or get inserted into a multi-threaded OpenGL renderer where there is a thread pool, a render thread, and an update thread (possibly the main thread). My understanding is that the thread pool will continually execute jobs and assemble the results, and when done, send them off to be rendered, where I can further sort them and achieve some cheap form of statelessness. I don't want anything overly complicated or too fine-grained (fibers, job stealing, etc.). My end goal is simply to have my renderer isolated in its own thread, concerned only with drawing and swapping buffers.

My questions are:

1. At what point in this pipeline are resources created?

Say I have a

class CCommandList
{
   void SetVertexBuffer(...);
   void SetIndexBuffer(...);
   void SetVertexShader(...);
   void SetPixelShader(...);
};

borrowed from an existing post here. I would need to generate a VAO at some point and call glGenBuffers etc., especially if I start with an empty scene. If my context lives on another thread, how do I call these commands if the command list is only supposed to be a collection of state and the command to execute? I don't think the render thread should do this and somehow add a task to the queue, or am I wrong?

Or could I do some variation where I do the loading in a thread with a shared context and, from there, generate a command that holds the handles to the resources needed?

 

2. How do I know when all my jobs are done?

I'm working with C++. Is this as simple as knowing how many objects there are in the scene, incrementing a counter for every task that gets added, and signaling the renderer that the command list is ready once the counter matches that count? I was thinking a condition_variable or something similar would suffice to alert the render thread that work is ready.

 

3. Does all work come from a single queue that the thread pool constantly cycles over?

With the notion of jobs, we are basically sending the same work repeatedly, right? Do all jobs need to be added to a single persistent queue to be submitted over and over again?

 

4. Are resources destroyed with commands?

Likewise with initializing, and assuming #3 is correct, wouldn't removing an item from the scene mean removing it from the job queue? Would I need to send a one-time command to the renderer to clean up?

Edited by getoutofmycar


First off, welcome to GameDev.

A lot of your questions have very open-ended answers, and they all depend on how your engine works and how you designed it to work.

A good read about the flow of how a modern 3D engine works is found in these blog posts about the Autodesk Stingray Engine (now discontinued I believe): http://bitsquid.blogspot.com/2017/02/stingray-renderer-walkthrough.html

It's a great blog in general; reading over the older posts wouldn't hurt either.

Another good read, though it's mostly about Direct3D 11/12 and Vulkan (the concepts are sound and would work for OpenGL, if I recall correctly), is: 

As this sounds like your first go at a multithreaded engine, you're quite likely to make design mistakes and to refactor or restart several times, which is fine. It's part of learning, and the best way to learn is from one's own mistakes.


Thanks for the welcome, Mike! I agree they are open-ended, as you deduced. I have a very vague idea of what to do but am new to the threading bits, so I wouldn't mind blindly following anyone's working approach until it sits with me and I can experiment. I will give those articles a read; I've also read the excellent high-level ones over on the molecular blog, but the inner workings are lost on me.


Thanks. I've actually watched these and more, Naughty Dog's talk and others as well, but they are all still high-level overviews that eventually wind down to stuff about atomics, actors, and how to scale; none really delve into the specifics I'm asking about, though they will no doubt be useful once I can get this running properly. Something like bgfx seems like what I would want, but it is a bit too dense to pick apart with all the backend stuff and whatnot.

Maybe my skull is extra thick, but none of my questions are answered in these. As I said before, I do have a general idea of what to do; I'm more looking for actual implementation details, mostly around how to handle resources. Even pseudocode would be fine. I'm still reading the articles you linked above and will finish them tonight; hopefully I can glean something from them to translate into code.



The easiest way to get started is with a single "main thread" and an unstructured job system. The API for that might look as simple as:

void PushJob( std::function<void()> );

From your main thread, whenever you have a data-parallel workload, you can then farm it off to some worker threads:

//original
vector<Object*> visible;
for( int i=0; i!=objects.size(); ++i )
  if( IsVisible(camera, objects[i]) )
    visible.push_back(objects[i]);

//jobified for a split into 4 jobs:
#define NUM_THREADS 4
int numObjects = (int)objects.size();
vector<Object*> visiblePerThread[NUM_THREADS];
Atomic32 jobsComplete[NUM_THREADS] = {0};
for( int t=0; t!=NUM_THREADS; ++t )
{
  vector<Object*>& visible = visiblePerThread[t];
  Atomic32& jobComplete = jobsComplete[t];
  int minimumWorkPerJob = 64;//don't bother splitting up workloads smaller than some amount
  //round up so the last job also covers any remainder
  int workPerThread = max( minimumWorkPerJob, (numObjects+NUM_THREADS-1)/NUM_THREADS );
  //calculate a range of the data set for each thread to consume
  int start = workPerThread * t;
  int end = workPerThread * (t+1);
  start = min(start,numObjects);
  end = min(end,numObjects);
  if( start == end )//if nothing for this thread to do, just mark as complete instead of launching
    jobComplete = 1;
  else//push this functor into the job queue
    PushJob([&objects, &camera, &visible, &jobComplete, start, end]()
    {
      for( int i=start; i!=end; ++i )
        if( IsVisible(camera, objects[i]) )
          visible.push_back(objects[i]);
      jobComplete = 1;
    });
}
//at some point before "visible" is to be used:
//use one thread to join all the results into a single list
for( int t=1; t!=NUM_THREADS; ++t )
{
  //block until the job is complete
  BusyWaitUntil( [&](){ return jobsComplete[t] == 1; } );
  //append result set [t] onto result set [0]
  visiblePerThread[0].insert(visiblePerThread[0].end(), visiblePerThread[t].begin(), visiblePerThread[t].end());
}
vector<Object*>& visible = visiblePerThread[0];

This simple job API is easy to use from anywhere, but adds some extra strain to its users. e.g. above, the user of the Job API needs to (re)invent their own way of figuring out that a job has finished each time. In this example, the user makes an atomic integer for each job, which gets set to 1 when the job is finished. The main thread can then busy wait (very bad!) until these integers change from 0 to 1.

In a fancier job system, PushJob would return some kind of handle, which the main thread could pass into a "WaitUntilJobIsComplete" type function.

These are the basics of how game engines spread their workloads across any number of threads these days. Once you're comfortable with these basic job systems, note that the very fancy ones use pre-declared job structures and pre-scheduled graphs, rather than the on-demand, ad-hoc "push anything, anytime" structure above.

The other paradigm is having multiple "main" threads -- e.g. a game update thread and a rendering thread. This is essentially message passing with one very big message -- the game state required by the renderer.

Going off this bare-bones/basic job system, to answer your questions --
1. Probably on the main thread, maybe on a job if it's parallelisable work.
2. Above I used simple atomic flags. If you're using a nice threading API, then semaphores might be a safer choice.
3. Yes.
4. Objects are not persistent in the job queue -- the main thread pushes transient algorithms into the job queue, so there's nothing to remove. The logic is the same as a single-threaded game.

With a fancier job system, the answers could/would be different :)

17 hours ago, getoutofmycar said:

I would need to generate a VAO at some point and call glGenBuffers etc especially if I start with an empty scene.

GL is the worst API for this. In D3D11/D3D12/Vulkan, resource creation is free-threaded (creation of textures/buffers/shaders/states can be done from any thread), and in D3D12/Vulkan you can also use many threads to record actual state-setting/drawing commands (D3D11 can too, but you generally get no performance boost from it). It's probably worthwhile to do all of your GL calls on a single "main rendering thread", rather than trying to make your own multi-threaded command-queue wrapper over GL.

Nonetheless, the entire non-GL/non-D3D part of your renderer can still be multi-threaded. That involves preparing drawable objects, traversing the scene, culling/visibility, sorting objects, and so on. In a D3D12/Vulkan version, you can also multi-thread the actual submission of drawable objects to the API.


Thank you, this is some great affirmation of my shaky logic. Your talk was actually what inspired me to try my hand at this, being the most approachable for a threading newbie. I think one of my misconceptions was about how I would be blocking the rendering thread when doing any allocation. Going to finish up my simple spec and tackle this over the weekend.


