
Ways of Buffering Data to be Rendered?


I am knee-deep in a ~new~ project right now. I have a threaded architecture and I am on the fence about how to implement buffering. My engine uses OpenGL, which I know already does its own buffering, but that is not what I am trying to accomplish here.

 

My goal is to allow my Update and Draw threads to run in parallel with as little lock time as possible.

What I came up with as a step toward that is BufferedObjects. These are POD objects which hold all the data required for rendering an individual object {Members: asset, position, rotation, alpha, scale, etc.}.
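Roughly, one of these looks like this (a minimal sketch; the field names and types are just placeholders for whatever my engine actually uses):

    // Plain-old-data snapshot of everything Draw needs for one object.
    struct BufferedObject
    {
        int   asset;            // handle/index of the texture or sprite to draw
        float x, y;             // position
        float rotation;         // radians
        float alpha;            // 0..1 opacity
        float scaleX, scaleY;   // per-axis scale
    };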

 

I also have an interface called a GameModule which is used by the engine architecture to run the game's systems. {Methods: Init, Deinit, Process, Update, Buffer, Output}.
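As an interface it is essentially this (a simplified sketch of what I actually have):

    // Engine-facing interface; the engine drives each game system through this.
    class GameModule
    {
    public:
        virtual ~GameModule() = default;
        virtual void Init()    = 0;
        virtual void Deinit()  = 0;
        virtual void Process() = 0;  // handle input/events
        virtual void Update()  = 0;  // advance simulation state
        virtual void Buffer()  = 0;  // snapshot state into BufferedObjects
        virtual void Output()  = 0;  // issue the GL draw calls
    };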

 

I originally planned to implement a routine in a Buffer method which would create BufferedObjects out of GameObjects, allowing the Draw routine to avoid locking the data. Then I realized that since the Buffer calls were housed with the Draw calls, I wouldn't be saving anything, so I moved the Buffer call to the Update thread. My two main threads are now: Update { Process, Update, Buffer } and Draw { Output }.
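In pseudo-C++ the two loops currently look something like this (a sketch; the modules container and running flag are just assumed names, and the synchronization is omitted because that is exactly the part I'm unsure about):

    #include <atomic>
    #include <vector>

    std::vector<GameModule*> modules;     // the game's systems
    std::atomic<bool>        running{true};

    void UpdateThread()
    {
        while (running)
        {
            for (GameModule* m : modules) m->Process();   // input/events
            for (GameModule* m : modules) m->Update();    // simulation
            for (GameModule* m : modules) m->Buffer();    // fill BufferedObjects
        }
    }

    void DrawThread()
    {
        while (running)
        {
            for (GameModule* m : modules) m->Output();    // consume BufferedObjects, issue GL calls
        }
    }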

 

This project is done in my own spare time; time costs are unimportant. That matters because I am nearly at my dilemma.

 

Okay so, I could create BufferedObjects inside my Buffer call. This prevents the need for locking during the Draw calls, but as I thought about it I realized it makes the Update thread do most of the work. I wondered whether I could balance things better, so I considered adding a Prebuffer and moving the Buffer call back to where it was originally: the Prebuffer would replicate the essential data, and the Buffer would convert it into BufferedObjects, which simplify the Draw routine a great deal. Or I could just move the Buffer back, replicate the essential data there, and accept a less trivial Draw routine.

 

My dilemma is that I don't know whether I should go Route A or Route B. Then of course there is always a glimmer of Routes C through Z. All I know is locking an entire thread while another executes is a bad solution, and it is the only solution to this problem I have ever implemented thus far.

 

edit:
I should note that the best solution is the one that scales the best.

To give a very rough idea of the volume of data:

  • 2d game
  • <200 texture slices visible in a frame

In reality I may implement a less-than-best solution; mainly I just want some extra heads to help identify the landmarks.

Edited by coope


>>  All I know is locking an entire thread while another executes is a bad solution

 

When it's time to render, lock the required update data, copy it, unlock. Then render as usual, using your copy of the current game state.

 

I don't think anything else is going to be faster. You need all of that game state data to render, and not copying it all at once just adds more locks.

 

A DOD (data-oriented design) layout of the data to be copied will minimize lock times.

 

You have two threads and you want them both to scream. One needs info from the other from time to time, so you lock as infrequently as possible, get the info as quickly as possible, then immediately unlock. That's about the best you can do.
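In code it's basically just this (a minimal sketch, assuming the shared state is a std::vector of the OP's BufferedObjects; names are illustrative):

    #include <mutex>
    #include <vector>

    std::mutex                  stateMutex;
    std::vector<BufferedObject> sharedState;   // filled by the update thread

    void RenderFrame()
    {
        std::vector<BufferedObject> snapshot;
        {
            std::lock_guard<std::mutex> lock(stateMutex);
            snapshot = sharedState;            // copy while locked
        }                                      // unlock immediately
        // ... render from 'snapshot' with no further locking ...
    }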

 

The other option is some sort of intermediate buffer, where update places a copy of its data when it changes and render grabs the current data when it renders. Then update must lock and unlock the buffer, as must render, but it does mean update won't stall while waiting for render to copy its data. Of course you could still get stalls if update is ready to write to the buffer while render is still reading. Update will most likely be orders of magnitude faster than render, so update stalling while render reads the buffer is much more likely.
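Something like this, roughly (a sketch; again reusing the OP's BufferedObject, with illustrative names):

    #include <mutex>
    #include <vector>

    struct Mailbox
    {
        std::mutex                  mtx;
        std::vector<BufferedObject> data;   // latest snapshot from update
    };

    // update thread, after it finishes a tick:
    void Publish(Mailbox& box, const std::vector<BufferedObject>& fresh)
    {
        std::lock_guard<std::mutex> lock(box.mtx);
        box.data = fresh;                   // update stalls only if render holds the lock
    }

    // render thread, at the start of a frame:
    std::vector<BufferedObject> Grab(Mailbox& box)
    {
        std::lock_guard<std::mutex> lock(box.mtx);
        return box.data;                    // render stalls only if update holds the lock
    }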

 

In the end, you go with whatever has the fewest locks and the fastest data transfer, for the lowest overall execution time.

 

You'll probably want to do as much on the update side as possible, since render runs slower. On the other hand, render is the one that needs the info, so odds are that isn't possible without just slowing things down overall.

Edited by Norman Barrows


The other option is some sort of intermediate buffer, where update places a copy of its data when it changes and render grabs the current data when it renders. [...] Of course you could still get stalls if update is ready to write to the buffer while render is still reading.

You can always double- or triple-buffer your data to get around all of these issues, at the expense of additional memory.
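For game-state data (as opposed to the GPU swap chain), one simple way to do it is the classic three-slot scheme, where the locks only ever guard an index swap. A rough sketch, reusing the BufferedObject from the first post (names are just illustrative):

    #include <mutex>
    #include <utility>
    #include <vector>

    // Triple buffer: update always owns a private 'write' slot, render always
    // owns a private 'read' slot, and 'ready' is the hand-off between them.
    struct TripleBuffer
    {
        std::vector<BufferedObject> slots[3];
        int  writeIdx = 0, readyIdx = 1, readIdx = 2;
        bool fresh = false;                 // 'ready' holds newer data than 'read'
        std::mutex mtx;
    };

    // update thread: fill slots[writeIdx] outside the lock, then publish it
    void Publish(TripleBuffer& tb)
    {
        std::lock_guard<std::mutex> lock(tb.mtx);
        std::swap(tb.writeIdx, tb.readyIdx);
        tb.fresh = true;
    }

    // render thread: grab the newest published slot (or keep the previous one)
    std::vector<BufferedObject>& Acquire(TripleBuffer& tb)
    {
        std::lock_guard<std::mutex> lock(tb.mtx);
        if (tb.fresh) { std::swap(tb.readIdx, tb.readyIdx); tb.fresh = false; }
        return tb.slots[tb.readIdx];
    }

Neither thread ever blocks for longer than the index swap, since the actual reading and writing happen in slots the other thread never touches.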


 


>Norman

I am skeptical that my draw routine will take even one order of magnitude longer to execute. Even though the API calls need to set color values for thousands of individual pixels each pass, rendering at least has the benefit of multiple hardware components working in tandem. The update needs to process input state, update positions, update sprite states, perform collision checks, queue audio playback, and possibly other things I can't think of yet. I know rendering is slower line for line, but there are vastly more lines in the update.

 

I will probably benchmark each thread at some point, find the lows and highs, and calculate averages. That will have to wait until all is said and done, though, so the results will be biased toward a potentially optimized implementation.
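When I do, it will probably just be something simple wrapped around each thread's loop body (a std::chrono sketch; the stats struct is hypothetical):

    #include <algorithm>
    #include <chrono>
    #include <cstdio>

    struct FrameStats
    {
        double minMs = 1e9, maxMs = 0.0, totalMs = 0.0;
        long   frames = 0;

        void Record(double ms)
        {
            minMs = std::min(minMs, ms);
            maxMs = std::max(maxMs, ms);
            totalMs += ms;
            ++frames;
        }
        void Print(const char* name) const
        {
            std::printf("%s: min %.3f ms, max %.3f ms, avg %.3f ms\n",
                        name, minMs, maxMs, totalMs / frames);
        }
    };

    // around one iteration of a thread's loop:
    // auto t0 = std::chrono::steady_clock::now();
    // ... Process/Update/Buffer (or Output) ...
    // auto t1 = std::chrono::steady_clock::now();
    // stats.Record(std::chrono::duration<double, std::milli>(t1 - t0).count());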

 

 

>Swift

The more buffers I add, the further behind the rendering will be. Any idea what the limit is? I've played games with triple buffering which felt unresponsive; is that just par for the course, or a consequence of a poor implementation?

 

 

--

Should I just copy my update data wholesale? I'm rather taken by the simplicity of my BufferedObjects: they carry no extraneous information, and I could perhaps even make a constructor which takes a GameObject as a parameter, or a method on GameObject which returns a BufferedObject. What do you think?
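For example, something like this (a sketch; this GameObject's members are just stand-ins for my real ones):

    // Hypothetical GameObject: the fields Draw cares about, plus everything it doesn't.
    struct GameObject
    {
        int   textureId;
        float x, y, rotation, alpha, scaleX, scaleY;
        // ...physics, AI, audio state, etc. that Draw never needs...

        BufferedObject MakeBuffered() const
        {
            BufferedObject b;
            b.asset    = textureId;
            b.x        = x;        b.y      = y;
            b.rotation = rotation; b.alpha  = alpha;
            b.scaleX   = scaleX;   b.scaleY = scaleY;
            return b;
        }
    };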


I've been trying to do some reading on triple buffering, looking for anything about implementation or design, but I'm turning up nothing terribly useful.

 

The nicest thing I've found so far showed a quickie implementation using macros which I nearly barfed at.

 

So I am curious if anybody knows of any *modern* resources about this stuff: a blog article, a technical spec, a flowchart, a small open-source game. I'd take a book if it can be read online, unless somebody wants to help a guy out and is willing to ship it here ^_^

 

I might just need to sit down and make a decision, even if I have to backtrack it later.


More buffers need more time, which delays the player. The PS2 hit about the limit that players are willing to tolerate: effectively four buffers before the result reaches the player is a LONG time, and programmers needed to do quite a lot to keep games feeling smooth.

 

The nicest thing I've found so far showed a quickie implementation using macros which I nearly barfed at. .. So I am curious if anybody knows of any *modern* resources about this stuff

 

The process is not that hard to do and doesn't require any macros or anything.

 

In both D3D12 and Vulkan it is configured through your swap chain settings, which establish a queue of buffers. The D3D12 side is documented on MSDN; the Vulkan side is buried in the spec. Search for vkCreateSwapchainKHR as a starting point, but it works basically the same way.
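On the Vulkan side, for instance, the buffer count is just a field on the swapchain create info (an abbreviated sketch; 'device' and 'surface' are assumed to exist already, and most of the other required fields are omitted):

    #include <vulkan/vulkan.h>

    VkSwapchainCreateInfoKHR info{};
    info.sType         = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR;
    info.surface       = surface;
    info.minImageCount = 3;                        // 2 = double buffering, 3 = triple
    info.presentMode   = VK_PRESENT_MODE_FIFO_KHR; // vsync'd FIFO of images
    // ...imageFormat, imageExtent, imageUsage, etc. as the spec requires...

    VkSwapchainKHR swapchain;
    vkCreateSwapchainKHR(device, &info, nullptr, &swapchain);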

 

 

Generally with triple buffering you have the currently displayed buffer, the fully-drawn-but-pending buffer, and the one you are filling up. It provides a slight time cushion beyond what double buffering gives you, but typically at the cost of an additional ~16 ms of lag, putting the game 45-60 ms behind the player's input. Personally I don't think it is worth the benefit: you gain a small amount of processing smoothing at the expense of display lag, but your game might be different.
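To put rough numbers on it: at a 60 Hz refresh each buffer in the queue is worth about 1000/60 ≈ 16.7 ms. Double buffering already puts the displayed image roughly two refreshes (~33 ms) behind the frame that produced it; a third buffer adds one more refresh, which is where the 45-60 ms figure comes from once input sampling and simulation time are included.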

 

For competitive network games and twitch games, where network lag already adds its own delay, the additional buffer lag is often unacceptable to competitive players. They're the same players who will turn off vsync, preferring torn images with their split-second information advantage over visual quality.
