Jump to content

  • Log In with Google      Sign In   
  • Create Account

Architecture for multithreading


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
11 replies to this topic

#1 Trapper Zoid   Crossbones+   -  Reputation: 1370

Like
0Likes
Like

Posted 09 December 2009 - 01:13 PM

I'm working on a few quick game projects, and I'd like to use this opportunity to experiment with some new architecture approaches. With my projects being generally quite small, I've previously never had a need to look into multithreading, but with multiple cores being ubiquitous these days it feels like a large gap in my knowledge. I'd like to steer my development direction towards writing multiple-thread, multiple-core friendly code and I've read up on the basic nuts and bolts (starting threads, mutexes etc.); I'm just a bit lost as to the steps I should go in to plan the high-level architecture to be multithread friendly. My current single-threaded approach to a game system would be to create a kernel acting as the main game loop, where the main tasks that need to run every loop get registered and called every cycle. I'd have tasks for world update, rendering, input polling, updating the timers and so on, registered to the kernel and somewhat independent from each other, although they do tend to share some information through what are effectively global data structure (such as things like the latest time update delta). They also tend to share some code elements such as the logging system. From what I gather an easy way to multithread this would be to split the tasks into some large threads, for example:
  • A main thread for gameplay elements
  • A render thread
  • Maybe a resource management thread for files, textures, memory etc.
  • And technically maybe threads for some smaller subsystems like audio, which the libraries I use most likely handle invisibly to me
This way there are only two large threads, the gameplay one and the render one, which can communicate through a shared data block of renderable elements, which the gameplay thread writes to and the renderer reads, which I can keep a close eye on to avoid thread issues. This seems pretty straightforward to do, but it doesn't sound very scalable to more advanced threading systems and it doesn't solve issues like a shared logging system I'm guessing my stumbling point is understanding the best ways threads in a game system can safely communicate with each other, and what that means for how to structure the tasks in a game system into an architecture that makes sense with those limitations. I'd be grateful if someone could explain at a high-level how they'd personally go about structuring a multithread friendly system and the interfaces they'd use between threads. Maybe with a bit of discussion that the nature of the problem will "click" with me. Thanks.

Sponsor:

#2 Dragon_Strike   Members   -  Reputation: 264

Like
0Likes
Like

Posted 10 December 2009 - 05:35 AM

first of all i dont think your should be thinking in threads but instead in "tasks"...

in my project i use Intel Thread Building Blocks... my system works basicly like this...

i have different subsystems such as Rendering, Simulation, Physics, Animation. Every subsystem have a CreateMainTask method which is called at their own frequency by a SubSystemScheduler... for example the Renderer::CreateMainTask is called every 1/60 second while Physics::CreateMainTask i called every 1/30 second...

every subsystem have their own copy of the state they need... for example rendering has matrices for every object while simulation has positions... every subsystem can listen for changes in specific states... for example if the position of some object is changed in simulation then a "UpdateMessage" is sent to the rendering subsystem through some form of queue and when the rendering subsystem runs its task the first thing it does is to check every update in the queue and updates its state...

then in every subsystem all the work is divided into smaller tasks as far as possible... which is great because the work is most efficiently divided between threads....

| Step1 | Step2
---------------------------------------------
Thread1 | Renderer1 | Simulation1 | Renderer2
--------------------------------------------
Simulation is divided!
-------------------------------------------
Thread2 | Physics1 | Simulation1 | Physics2
---------------------------------------------

This is basicly what intel do in their "Smoke Demo" i encourage you to take a look at it... they provide source code... i dont have a link but it should be easy to find through google

#3 all_names_taken   Members   -  Reputation: 104

Like
0Likes
Like

Posted 10 December 2009 - 07:33 AM

My experience is that WinAPI, XLib and many other such basic services require that you call them from the same thread. This makes it really difficult to separate at least input and graphics. Physics may be offloaded to a separate kernel though, and here Dragon_Strike's suggestion on thinking in tasks is very useful. A lot of libraries do the multithreading stuff for you already so you don't really need to bother about pulling out any specific tasks into separate threads unless you write something on your own. For example I'm considering (if performance demands it) to load textures from disk to memory by a separate worker thread, then use the main render thread to upload them to the GPU (because OpenGL enforces it), but that stuff can be added at the end, providing the required synchronization primitives. So in short, you don't need to think about multithreading your game from start, you can add that in later if you architecture follows good practice in general. Just my two cents, I'm sure there will be a bunch of people who disagree.

#4 phantom   Moderators   -  Reputation: 7558

Like
0Likes
Like

Posted 10 December 2009 - 08:49 AM

Quote:
Original post by Dragon_Strike
first of all i dont think your should be thinking in threads but instead in "tasks"...

in my project i use Intel Thread Building Blocks...


I strongly second this; I used Intel's TBB when I was developing a CPU based particle system and its really easy to up get up and running.

And 'task based' is certainly the way forward; writing directly for 'threads' has the downside that as soon as extra cores/threads are added you stop using as much of the CPU as you can.

While you'll probably need some details on threads/cores still to balence things this method if much more scalable.

Taking your examples Trapper Zoid, aside from the main and render thread you are going to be mostly idle on the others (loading should be async so aside from processing won't be doing much, sound doesn't take much either) so even on a quad core around today you'll be using maybe 50% of your time most of the time, with the rest mosting being idle.

#5 Trapper Zoid   Crossbones+   -  Reputation: 1370

Like
0Likes
Like

Posted 10 December 2009 - 12:18 PM

Thanks for the replies. Multicore support is looking a bit more involved than I thought; I might have to make it the entire goal of my next small project.

Quote:
Original post by Dragon_Strike
first of all i dont think your should be thinking in threads but instead in "tasks"...

When I'm thinking in threads, I'm thinking along the lines of "here's a bunch of processes that are running in parallel to each other". Does thinking in tasks mean something along the lines of "here's a small task I need solved, go take this data and go off and work on it, then come back with results"? That's the sort of problems I'd like to use multiple threads to solve when I get to a project that requires a lot of CPU computation for either physics or AI.

Intel Thread Building Blocks looks useful; I'll have a look at that, thanks. I'm still open to looking at new programming languages if they're useful for the problem. I'm fond of plain C myself, but in its raw form it's not especially designed for parallel applications.

Quote:
Original post by phantom
Taking your examples Trapper Zoid, aside from the main and render thread you are going to be mostly idle on the others (loading should be async so aside from processing won't be doing much, sound doesn't take much either) so even on a quad core around today you'll be using maybe 50% of your time most of the time, with the rest mosting being idle.

I currently don't have a specific project in mind that needs multiple cores, but eventually I'd like to work on projects that require a lot of AI computation and/or physics. I'm sure I can find ways to split those tasks up into chunks that can be worked on in parallel that could use as many cores as can be spared. But since I've never used more than one thread myself, I'm in the dark about how I would actually do that and was wondering as to whether splitting up the main components of my current, not especially demanding project would be good practice. Thinking about it probably won't be, as the tricky part is the slicing up of the tasks in the project domain, not the running of a handful of main subsystems.

#6 phantom   Moderators   -  Reputation: 7558

Like
0Likes
Like

Posted 10 December 2009 - 12:37 PM

I wouldn't have said it would be bad practise to split your current engine/game up into a few main threads; if nothing else it'll introduce you to the pitfalls of data sharing between threads and synronisation issues.

However, as I'm sure you realise, just gluing threads into a program isn't going to work as well as designing from the ground up with threads/tasks in mind.

With regards to languages, TBB is very much C++ and MS has something coming which is along the same lines (and basically a swap out replacement as Intel and MS came to an agreement to make their two methods very much compatible).

.Net is also gaining various parallel constructs in the up and coming release which means that C#, F# etc will be able to head in the same direction.

End of the day however you can do far worse then learning about the problems by taking a known working design and splitting it up a bit; experiance is king after all.

(As a side note, I'm a fan of the design where you have everything update its logic at once, then run a 2nd 'sync' step. The first update use everyone elses last frame to update internal state, the sync step then pushes your internal state into the external state for the next update. I'm still figuring out how to map the graphics updates into this, probably involving a 'graphics only' version of the data or something, but it does side step many data sharing/lock issues at the expense of an extra 'pass' over the data)

#7 Hodgman   Moderators   -  Reputation: 31798

Like
0Likes
Like

Posted 10 December 2009 - 12:37 PM

I use a task system, but didn't use TBB :/

I have a constant list of tasks to compute each frame (as opposed to submitting a new list of tasks each frame). Tasks are things like "update entities", "poll inputs", "render scene" -- basically the things that usually go in your single-threaded game loop.

Tasks can specify a list of dependent tasks. A serial schedule is created based on these dependencies. Tasks must be appear in the schedule after their dependencies, and preferably as far after as possible (to avoid waiting on a dependency to finish).

I then take this schedule, and give it to every thread in my thread pool (on thread per core). Each thread executes every task, in the order specified by the schedule. Before executing a task, it's dependencies are checked to see if they're complete, and if not, it busy-waits.

The task interface has two functions:
virtual uint Size() const;//how many discrete bits of work are there to do?
virtual void Execute( uint begin, uint end );//execute a range of the task

Although every thread executes every task, each thread actually executes a unique, non-overlapping range of each task.
If a task is single-threaded, then it can just return 1 from the Size method, and then only one thread will be able to execute it.
The union of the ranges used by each thread must span the entire range (0 to Size()-1 inclusive).

For two tasks to communicate with each other, then the "consumer" task simply needs to specify the "producer" task as a dependency. The dependency system ensures they won't be executing at the same time. In the best case scenario (if the schedule is arranged nicely with enough intermediate tasks between dependant tasks), waiting on dependencies won't often occur (which is why I use a simple busy-wait).

The only mutexes you need are to protect the "isTaskCompleted" flags (could be implemented as a counter -- when the counter reaches numThreads, the task is actually complete). For better performance, instead of a mutex, you can use an atomic increment.

This system allows for task-level concurrency (where multiple threads are working on separate tasks) and data-level concurrency (where multiple threads are working on different parts of the same task), and allows you to easily share data between tasks without locks or race conditions.

Quote:
Original post by phantom
(As a side note, I'm a fan of the design where you have everything update its logic at once, then run a 2nd 'sync' step. The first update use everyone elses last frame to update internal state, the sync step then pushes your internal state into the external state for the next update. I'm still figuring out how to map the graphics updates into this, probably involving a 'graphics only' version of the data or something, but it does side step many data sharing/lock issues at the expense of an extra 'pass' over the data)
Yep, I like this paradigm too.
My entity-system is built on top of the above-described task system. It's got two main tasks (These task's Size() func returns the number of entities to be processed, so each entity can be updated in parallel to each other). Entities double buffer their internal state -- public "getter" functions return data from the buffer, not the actual 'live' variables.
* Update task - Entities do their thing, and modify their internal/private/non-buffered state. During this task they can only read the public/buffered state of other entities.
* Commit task - During this task, each entity copies it's private state into it's public buffer. This task is dependant on the Update task.

To avoid complex interactions taking multiple frames to play out, I actually execute multiple "task frames" per "graphics/gameplay frame". For example, the Update/Commit tasks might have to run 10 times before I'm actually ready to run the Render or Networking tasks.

[Edited by - Hodgman on December 10, 2009 6:37:30 PM]

#8 elFarto   Members   -  Reputation: 206

Like
0Likes
Like

Posted 10 December 2009 - 11:26 PM

Quote:
Original post by phantom
As a side note, I'm a fan of the design where you have everything update its logic at once, then run a 2nd 'sync' step.


I'm a fan of global state being transactional. Each entity's update process uses the latest global state and is not effected by state changing while it's running (ie. starting a transaction 'freezes' your view of the world). While this doesn't solve all the problem, and introduces live-locks, it seems to work quite well and it neatly side steps synchronisation issues (deadlocks).

My idea works better in an MMO environment where multiple threads updating a single entity are unlikely.

As for multi-threading, I prefer to have a thread pool and just throw everything at that. Need a texture loaded, create an object to do that and send it to the pool. Need a frame rendered. This is made trivial in Java thanks to the Executor and Runnable classes. You'll probably need some sort of priority system on the pool to make sure audio and rendering don't get delayed too much.

The thread pool and transactional entities work very well together as you don't have to worry about the effect one task may have on another.

Regards
elFarto

#9 KoichiSenada   Members   -  Reputation: 126

Like
0Likes
Like

Posted 11 December 2009 - 12:51 AM

I also recommend to check out Intel Smoke Demo to examine their multithreading idea.
Even Input and Graphics are separated there!
http://software.intel.com/en-us/articles/smoke-game-technology-demo/

#10 mrjones   Members   -  Reputation: 612

Like
0Likes
Like

Posted 11 December 2009 - 01:50 AM

I have also used mostly two step systems when updating world. About threading, take the recommendation and think of tasks instead. My own engine might not be very clean on these things, but here is how I do threading.

Technically I do have multiple threads running, but these are inside classes, doing work locally, I don't think of them globally at all. For example, I have network thread that waits for packets. The only places where I have to lock are when packet arrives (on the thread side) and when the packets queue (or message queue if you prefer) is accessed. For the rest of the engine it is seamless. Every method accessed by the thread is a private member of the threaded network class and no other classes have to worry about the fact that packets are actually waited for asynchronously.

The rendering/sound/physics systems are the same. Anything accessing the public members these systems doesn't even know that rendering is actually threaded. It might also not be, but it won't matter with regard to the rest of the engine.

#11 Hodgman   Moderators   -  Reputation: 31798

Like
0Likes
Like

Posted 11 December 2009 - 06:27 PM

Quote:
Original post by elFarto
Quote:
Original post by phantom
As a side note, I'm a fan of the design where you have everything update its logic at once, then run a 2nd 'sync' step.
I'm a fan of global state being transactional. Each entity's update process uses the latest global state and is not effected by state changing while it's running (ie. starting a transaction 'freezes' your view of the world). While this doesn't solve all the problem, and introduces live-locks, it seems to work quite well and it neatly side steps synchronisation issues (deadlocks).
Does this approach support deterministic simulations? Or would the random order of threaded updates cause the order of actions to be random as well?
With the 2-step approach all entities are working from the same 'snapshot' every update, so you can still enforce determinism.

It's not an essential feature, but is required by some games.

#12 elFarto   Members   -  Reputation: 206

Like
0Likes
Like

Posted 11 December 2009 - 09:48 PM

Quote:
Original post by Hodgman
Does this approach support deterministic simulations? Or would the random order of threaded updates cause the order of actions to be random as well?

No, it doesn't support deterministic simulations, since as you pointed out you can't control the order in which the update occur.

Regards
elFarto




Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS