Simplest place to implement multithreading in my game engine?


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

22 replies to this topic

#1 EnigmaticProgrammer   Banned   -  Reputation: 141

Posted 16 May 2012 - 07:18 PM

I've not used multithreading before but I think it's time I start. Where should I first try implementing it? I just need to start with something simple to wrap my head around the idea.

EDIT: Using C++

Edited by EnigmaticProgrammer, 17 May 2012 - 01:44 AM.


#2 endless11111   Members   -  Reputation: 135

Posted 16 May 2012 - 07:24 PM

You should not implement it in your engine until you have a firm grasp of it; otherwise you may run into numerous bugs that you won't know how to fix because you haven't seen them before. Although I am not experienced in multithreading, I have read numerous times that the usual split is one thread each for logic, rendering, file input/output, events, and audio. To start, write a simple test program with just two threads, one for logic and one for rendering, to get a feel for what multithreading can do. Then work on a harder project such as a basic game, and only then try to implement it in your engine. You could instead split logic, rendering, input, and file loading into different threads in your engine right away, but that's not a good idea until you have the hang of it.

#3 Trienco   Members   -  Reputation: 1337

Posted 16 May 2012 - 10:03 PM

I would stay away from splitting work that touches the same data into separate threads. If you want separate threads for physics, rendering and AI you will just end up in synchronization hell and find yourself spamming mutexes all over the place (possibly to the point where it's running slower than without multithreading). Example: one thread renders the player and reads the position. It has read only the x coordinate when the physics thread updates the position. Now the renderer ends up with inconsistent x,y coordinates and draws the player in the wrong place.
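
A minimal sketch of one way to avoid the torn read described above, assuming a hypothetical `Position` struct and `SharedPosition` wrapper (neither comes from any engine in this thread). The writer replaces both coordinates under a lock and the reader copies both under the same lock, so a reader never sees x from one update and y from another.

```cpp
#include <mutex>

// Hypothetical type for illustration only.
struct Position {
    float x;
    float y;
};

class SharedPosition {
public:
    // Called by the physics thread: both coordinates change under one lock.
    void set(Position p) {
        std::lock_guard<std::mutex> lock(mutex_);
        pos_ = p;
    }

    // Called by the render thread: returns a copy taken under the same lock,
    // so x and y always belong to the same update.
    Position get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return pos_;
    }

private:
    mutable std::mutex mutex_;
    Position pos_{0.0f, 0.0f};
};
```

This is exactly the kind of per-access locking the post warns can pile up; it fixes this one case but does not scale to a whole engine's worth of shared state.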

The easiest way to add multithreading is to

a) not do it on the lowest possible level. Use a library like Intel TBB that builds high level concepts (parallel execution, pipelines, etc.) on a worker thread pool
b) use it only when the data processed in different threads is not touched by more than one thread (AI updates, tests or calculations done on a ton of objects, etc.)
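
As a rough illustration of point b), the sketch below updates two disjoint halves of a container at the same time. It uses only the standard library (`std::async`) as a stand-in for a task library such as TBB, and the `Agent` type and the update rule are made up for the example.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical per-object state; stands in for an AI agent or similar.
struct Agent {
    float energy;
};

// Updates agents[begin, end). No other range is touched, so no locking is needed.
void update_range(std::vector<Agent>& agents, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        agents[i].energy *= 0.5f;
    }
}

// Splits the vector into two halves and updates them concurrently.
void update_all(std::vector<Agent>& agents) {
    const std::size_t mid = agents.size() / 2;
    // std::launch::async forces the first half onto another thread.
    std::future<void> first = std::async(std::launch::async, [&agents, mid] {
        update_range(agents, 0, mid);
    });
    update_range(agents, mid, agents.size());  // second half on the calling thread
    first.get();  // wait for the other half and rethrow any exception
}
```

The only synchronization point is the final `get()`; the vector must not be resized while the update is running.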

Then you can move on to background loading. When you load compressed data, or information that requires a lot of processing after loading, you can split that work across several threads. You can even use the mentioned pipeline and have each step (loading, unpacking, preprocessing) handled by one or more threads. TBB will even be nice and balance things out for you, so you can be sure the most intensive step gets the most threads. For example, loading should probably not be done by more than one thread (though who knows how modern hard drives work around that), unpacking might end up with only 1-2 threads (it might be faster than loading, so more than one would be a waste), while the rest are busy processing the loaded data.
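
A simplified sketch of background loading, assuming placeholder functions `load_raw`, `unpack` and `preprocess` that stand in for real file I/O, decompression and asset preparation. It is not a TBB pipeline: each asset simply runs its whole chain on one background thread via `std::async`, and the game loop polls for completion.

```cpp
#include <chrono>
#include <future>
#include <string>

// Hypothetical stand-ins for the three steps named in the post.
std::string load_raw(const std::string& name) { return "raw:" + name; }
std::string unpack(const std::string& raw) { return "unpacked(" + raw + ")"; }
std::string preprocess(const std::string& data) { return "ready(" + data + ")"; }

// Runs the whole chain for one asset on a background thread.
std::future<std::string> load_asset_async(std::string name) {
    return std::async(std::launch::async, [name] {
        return preprocess(unpack(load_raw(name)));
    });
}

// Polls without blocking, the way a game loop might once per frame.
bool is_ready(const std::future<std::string>& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

A caller would keep the returned future, check `is_ready` each frame, and call `get()` once it reports true. A real pipeline would let several assets occupy different stages at once, which is the part a library like TBB handles.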

The best part: at no point will you have to create or synchronize threads on your own. You shouldn't worry about that until you have a good feel for how parallel execution can shake things up.
f@dz http://festini.device-zero.de

#4 ApochPiQ   Moderators   -  Reputation: 8419

4 Likes

Posted 16 May 2012 - 11:03 PM

If you're combining the ideas of "simple" and "multithreading" you're already fighting a losing battle (in most of the popular contemporary languages at least).

The first rule of multithreading is that nothing is ever simple.


Pop quiz! Two threads run this code:

x = 1;
y = 2;
z = x + y;
x = 4;

What's the value of z?

If you answered "3" you are in for a world of nasty surprises. If you confidently answered "maybe 3 and maybe 6" you're in for even worse surprises.


Multithreaded concurrency is a horrible, terrible, evil thing and extremely difficult to get right. It's also not something you just retrofit into existing code bases. You need to design your code from day one to support concurrency or you will spend endless hours fighting deadlocks, livelocks, race conditions, starvation conditions, priority inversions, and a host of other nondeterministic and indeterminate bugs.

If you want to learn to do concurrent threads, here's what I suggest:

  • Pick a problem that is well known to be easy to parallelize
  • Use an existing library or framework for setting up threading and solve that problem
  • Learn about all the fun new bugs you get to fight along the way
  • Pick a problem that's a bit less parallel and start learning about synchronization and data flow
  • Learn about yet more fun bugs in the synchronization category
  • Write something medium-sized that requires threading, such as a concurrent engine design
  • For the record this should be at about the 2-3 year mark
  • Start investigating how your library of choice works, and see what you can implement yourself
  • Learn about all the nasty subtle bugs that your library has been hiding from you for all this time


I hope it doesn't sound too depressing and scary, but it should sound pretty depressing and scary. Multithreading is a tricky beast. Plan to commit a few years to mastering it if at all possible.

After all that time, you will probably come back to the same conclusion that many programmers do after struggling with threading for a while: it's very, very rarely worth it, and even when it is worth it, doing it right is hard. Unless you happen to be working on something that's really easy to parallelize, things get nightmarishly complex very fast, and things will seem utterly mysterious and baffling for a long time until you develop an intuition for how threads work.


Don't get me wrong, it's a great tool to have in your arsenal, but it should be approached with the proper reverence and caution. This isn't the kind of thing you can pick up from a couple of web tutorials and a textbook.

#5 Krohm   GDNet+   -  Reputation: 1813

Posted 17 May 2012 - 01:07 AM

I've not used multithreading before but I think it's time I start.

The game I'm working on tops at 125fps on target machines using 80% of a core. On lower spec machines, it usually runs about 60 fps with ease. The game is simple. It will be a while before I'll need multithreading.
There's no such thing as "I think": there's the need for performance. Or not. Until you need the performance, just practice it outside of the main code base.

#6 EnigmaticProgrammer   Banned   -  Reputation: 141

Posted 17 May 2012 - 01:54 AM

Let me explain why I want to implement it. I have hopes of working for a local game studio and one of their prerequisites for the position I want is to have experience with multithreading. I don't need to implement it in every aspect of the engine but I do need to implement it to some degree in order to get the job.

#7 Hodgman   Moderators   -  Reputation: 14288

Posted 17 May 2012 - 02:11 AM

I'd recommend starting by reading the entire Effective Concurrency series of articles.

#8 Ripiz   Members   -  Reputation: 521

Posted 17 May 2012 - 02:59 AM

Pop quiz! Two threads run this code:

x = 1;
y = 2;
z = x + y;
x = 4;

What's the value of z?

If you answered "3" you are in for a world of nasty surprises. If you confidently answered "maybe 3 and maybe 6" you're in for even worse surprises.


I would guess 3 or 6, but it seems I cannot reproduce that in an artificial test:
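#include <iostream>
#include <thread>
#include <unordered_map>
using namespace std;

// Deliberately unsynchronized: x, y and z are shared between the threads.
volatile int x = 0;
volatile int y = 0;
volatile int z = 0;

void mythread() {
   for(;;) {
      x = 1;
      y = 2;
      z = x + y;
      x = 4;
   }
}

int main() {
   // Remembers which values of z have already been printed.
   unordered_map<int, bool> printedSoFar;

   // Two threads run the same loop; they are never joined because both loop forever.
   thread threads[] = {
      thread(mythread),
      thread(mythread)
   };

   for(;;) {
      int currentZ = z;
      auto &printed = printedSoFar[currentZ];
      if(!printed) {
         cout << currentZ << endl;
         printed = true;
      }
   }
}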

The only thing it ever prints is 3. Using volatile because release build destroys loop in mythread(). Debug mode without volatile prints only 3 as well.

#9 Hodgman   Moderators   -  Reputation: 14288

Posted 17 May 2012 - 03:53 AM

The only thing it ever prints is 3.

When I tested it, it printed 3 99.99% of the time, but not 100%, which just means that the race condition is very unlikely in my test circumstances... which doesn't mean anything; there's still a race condition present.
If you want to see your code print another value, you can force the potential bug to occur by manually freezing all the threads in your debugger and then advancing each thread one line at a time in the order that causes the race condition, such as:
Thread #1: x = 4;
Thread #2: z = x + y;
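
The ordering above can also be replayed by hand in a single thread, which makes the outcome deterministic. This is only an illustration of one possible schedule, with a made-up `Shared` struct and function name; it is not a way to detect races in real code.

```cpp
// The three quiz variables; plain ints because only one thread runs here.
struct Shared {
    int x = 0;
    int y = 0;
    int z = 0;
};

// Replays one specific interleaving of the two threads' statements.
// T1 and T2 label which thread each statement belongs to.
int replay_interleaving() {
    Shared s;
    s.x = 1; s.y = 2; s.z = s.x + s.y;  // T1: first three statements, z == 3
    s.x = 1; s.y = 2;                   // T2: first two statements
    s.x = 4;                            // T1: x = 4
    s.z = s.x + s.y;                    // T2: z = x + y now reads T1's x
    s.x = 4;                            // T2: x = 4
    return s.z;
}
```

`replay_interleaving()` returns 6, because thread 2's addition picks up the 4 that thread 1 wrote in between.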

Also, I think part of ApochPiQ's point is that just by looking at that example code, you can't know what it's going to print. You've got to go ask other questions, such as whether the compiler will optimize out the seemingly redundant "read x from RAM" instruction when performing z = x + y (inhibited by volatile in your example), or where the memory fences are (which are a concept missing from C++98, making most threaded code completely "implementation defined behaviour"), or whether aligned word-sized writes to RAM are atomic and whether x/y/z are aligned (more details that are unspecified by C++).

This just goes to show that it's very easy to think that your multi-threaded code is "working", when in fact it's got a fatal flaw in it with a tiny chance of appearing at any time. Your code could work fine for a year, and then when you're about to ship your game, suddenly 1 in 5 QA tests begin failing, because some new change has influenced your thread timings and coincidentally caused an existing potential race-condition to start occurring.

This has happened to me before -- I wrote a lock-free queue and used it for 6 months without it ever failing the unit tests once. Suddenly, after 6 months of use, it failed the unit test due to a flaw in my logic (which only occurred under a very rare race condition). I've also been on a year-long project, where during final beta QA testing, the engine's asset loader (which was used successfully on the previous game) was reported to crash. Again, this long-time working piece of code had some tiny flaws in its inter-thread communication, which hadn't shown up for years...

Shared memory code like the above should always be protected by some sort of synchronisation primitive, which allows you to logically reason about the use of shared memory.
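
A minimal sketch of that advice applied to the quiz code, assuming a `std::mutex` is an acceptable primitive here. The class and function names are invented for the example. With all four statements inside one critical section, and readers taking the same lock, the only values a reader can observe are 0 (before the first iteration) and 3.

```cpp
#include <mutex>
#include <thread>

// The quiz variables plus the mutex that guards all three of them.
class GuardedQuiz {
public:
    // The four quiz statements, run as one critical section per iteration.
    void worker(int iterations) {
        for (int i = 0; i < iterations; ++i) {
            std::lock_guard<std::mutex> lock(mutex_);
            x_ = 1;
            y_ = 2;
            z_ = x_ + y_;
            x_ = 4;
        }
    }

    // Readers take the same lock, so they never see a half-finished iteration.
    int read_z() {
        std::lock_guard<std::mutex> lock(mutex_);
        return z_;
    }

private:
    std::mutex mutex_;
    int x_ = 0;
    int y_ = 0;
    int z_ = 0;
};

// Runs two workers and returns true if every observed z was 0 or 3.
bool run_guarded_quiz(int iterations) {
    GuardedQuiz quiz;
    std::thread a([&quiz, iterations] { quiz.worker(iterations); });
    std::thread b([&quiz, iterations] { quiz.worker(iterations); });
    bool only_expected = true;
    for (int i = 0; i < iterations; ++i) {
        const int seen = quiz.read_z();
        if (seen != 0 && seen != 3) only_expected = false;
    }
    a.join();
    b.join();
    return only_expected;
}
```

The lock serializes the two workers completely, so this version gains nothing from the second thread; it only shows how the primitive makes the result something you can reason about.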

Edited by Hodgman, 17 May 2012 - 04:03 AM.


#10 ApochPiQ   Moderators   -  Reputation: 8419

Posted 17 May 2012 - 05:13 AM

I would guess 3 or 6, but it seems I cannot reproduce that in an artificial test:

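#include <iostream>
#include <thread>
#include <unordered_map>
using namespace std;

// Deliberately unsynchronized: x, y and z are shared between the threads.
volatile int x = 0;
volatile int y = 0;
volatile int z = 0;

void mythread() {
   for(;;) {
      x = 1;
      y = 2;
      z = x + y;
      x = 4;
   }
}

int main() {
   // Remembers which values of z have already been printed.
   unordered_map<int, bool> printedSoFar;

   // Two threads run the same loop; they are never joined because both loop forever.
   thread threads[] = {
      thread(mythread),
      thread(mythread)
   };

   for(;;) {
      int currentZ = z;
      auto &printed = printedSoFar[currentZ];
      if(!printed) {
         cout << currentZ << endl;
         printed = true;
      }
   }
}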

The only thing it ever prints is 3. Using volatile because release build destroys loop in mythread(). Debug mode without volatile prints only 3 as well.


You've made three decisions which affect your testing:

  • You use volatile, which changes the semantics substantially
  • You use auto, which indicates at least a partially conformant C++11 compiler, which means it probably has the C++11 memory model which will affect how this repros
  • You're on a particular architecture, probably x86 or x64, with particular ordering semantics

Do that same code as part of a more complex test with longer code before and after it. Stagger the timings of the threads a bit so they have a higher chance of racing. Remove the volatile so the compiler doesn't read/write-fence the variables. Run on a CPU besides the one in your desktop/laptop. Turn on (or off) hyperthreading. Try compiling this with a C++98/03 compiler. Even worse, try it in a language that's not C++; I've seen similar races in JavaScript code of all things.

There are any number of things you can do to affect the repro conditions of this race. And as Hodgman very aptly noted, the fact that you don't see it is precisely what makes this dangerous: you can test it 100 times and get a false sense of security from the lack of visible bugs. But there is a bug in that situation, whether you can reliably get it to happen or not.

The fact that there are so many things that can affect the behavior of even this trivial snippet of code should indicate just how deep the multithreading rabbit hole goes.

#11 mrbastard   Members   -  Reputation: 1567

Posted 17 May 2012 - 05:30 AM

If you want to get straight into learning with the C++11 threading model / libs, this book is pretty good - I bought it on a recommendation from Scott Meyers' blog, and haven't been disappointed.

Edited by mrbastard, 17 May 2012 - 05:30 AM.



#12 Antheus   Members   -  Reputation: 2369

Posted 17 May 2012 - 07:39 AM

What's the value of z?


It's undefined.

Anything can be the result, although it will most likely be a valid integer.

Even if the software, compiler, and OS are absolutely perfect, there's still the hardware (the link + comments).


So unless multi-threaded/concurrent code guarantees atomicity, results are completely undefined. Unfortunately, they will make sense most of the time, making it harder to debug.

#13 ApochPiQ   Moderators   -  Reputation: 8419

Posted 17 May 2012 - 10:37 AM

Correct. You win a cookie :-)

#14 Ripiz   Members   -  Reputation: 521

Posted 17 May 2012 - 11:35 AM

How can it be undefined? In a real problem it can, but in your example y is always 2 and x is either 1 or 4. It's not like individual bits can get messed up, resulting in 2, 3 or 5.

#15 ApochPiQ   Moderators   -  Reputation: 8419

2 Likes

Posted 17 May 2012 - 11:42 AM

Unfortunately, that's entirely possible. Suppose the variables are not native-word aligned; or suppose they are larger than the native word size, e.g. 32-bit integers on a 16-bit CPU. Throw in some endianness issues to make it even more fun.

You can very easily end up in a state where you've written part of the bytes of z (or x or y!) but not all of them.
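
To make the partial-write case concrete, here is a single-threaded simulation of a 32-bit store performed as two 16-bit halves, with a read slipped in between them. The `SplitInt32` type and `torn_read` function are invented for the illustration; on real hardware the split happens inside the CPU or compiler output, not in source code like this.

```cpp
#include <cstdint>

// A 32-bit value stored as two 16-bit halves, as a 16-bit CPU would hold it.
struct SplitInt32 {
    std::uint16_t low = 0;
    std::uint16_t high = 0;

    std::uint32_t read() const {
        return (static_cast<std::uint32_t>(high) << 16) | low;
    }
};

// Simulates a reader that runs between the two half-writes of a store.
// Returns what the reader sees while the value changes from old_value to new_value.
std::uint32_t torn_read(std::uint32_t old_value, std::uint32_t new_value) {
    SplitInt32 v;
    v.low = static_cast<std::uint16_t>(old_value & 0xFFFFu);
    v.high = static_cast<std::uint16_t>(old_value >> 16);

    v.low = static_cast<std::uint16_t>(new_value & 0xFFFFu);  // first half written
    const std::uint32_t seen = v.read();                      // reader runs here
    v.high = static_cast<std::uint16_t>(new_value >> 16);     // second half written
    return seen;
}
```

For example, `torn_read(0x0000FFFF, 0x00010000)` returns 0: the reader sees the new low half combined with the old high half, a value that was never stored by anyone.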


Assuming that things behave intuitively is an easy but deadly mistake to make when writing concurrent code.

#16 phantom   Moderators   -  Reputation: 4113

Posted 17 May 2012 - 12:01 PM


I've not used multithreading before but I think it's time I start.

The game I'm working on tops at 125fps on target machines using 80% of a core. On lower spec machines, it usually runs about 60 fps with ease. The game is simple. It will be a while before I'll need multithreading.
There's no such thing as "I think": there's the need for performance. Or not. Until you need the performance, just practice it outside of the main code base.


While I'm not arguing against your point, I will say that when it comes to threading/concurrency in general it is VERY hard to bolt it on as an afterthought; if you design with it in mind up front then you'll be ok, but trying to add it in later is going to be such a minefield that you might as well (in general) start again.

#17 dublindan   Members   -  Reputation: 432

Posted 18 May 2012 - 07:24 PM

Trying to shoehorn parallelism into a program is never the right approach. To really make use of it, the program must be designed with it in mind. Furthermore, shoehorning parallelism into a program does not really mean you have experience in multicore programming. Injecting people with random but plausible medication does not mean I have medical experience. Also, if they are asking for professional multithreaded programming experience (which is what companies often mean when they ask for experience), this won't really help you either.

Having said that, if you really want to learn multithreaded programming, I would recommend reading both of the books I mention below. Multithreaded programming is hard and you won't get good at it reading internet tutorials. You probably won't get good at it by reading any one book. I've found both of these books to be very good, but even after working through them (and reading alone is never enough), you must practice what they teach.
After reading the books and having practiced the concepts taught for yourself, my recommendation is to never use this in production code at all - instead, find a task-based library written by experts and use that. Do not roll your own threading code and do not use threads for performance[1]! Instead use a task-based library. I personally recommend Intel's Threading Building Blocks[2], though other equally good libraries exist.

Anyway, here are my two book recommendations (to be read in this order):
[1] Threads are a decent choice for concurrency but a bad choice for parallelism. That is, if what you need is asynchronous execution (eg, loading resources from disk without blocking while your game is running) - concurrency - then threads are a good choice. If, however, you want to use multiple cores to increase performance by doing multiple things at once - parallelism - then you should not use threads directly, but use tasks instead. The idea is that you have exactly one thread per core and work is scheduled between these.

[2] A good way of learning to use Intel's Threading Building Blocks is through the O'Reilly book (though the tutorial on the official site is equally good). The first chapter is available here.
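
To show what "one thread per core with work scheduled between them" in footnote [1] can look like, here is a deliberately small task pool built on the standard library. It is a teaching sketch with an invented `TaskPool` class, not a substitute for TBB: there is no work stealing, no task results, and no exception handling.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal task pool, for illustration only.
class TaskPool {
public:
    explicit TaskPool(std::size_t workers = std::thread::hardware_concurrency()) {
        if (workers == 0) workers = 1;  // hardware_concurrency() may return 0
        for (std::size_t i = 0; i < workers; ++i) {
            threads_.emplace_back([this] { run(); });
        }
    }

    // Lets the workers drain the queue, then joins every worker.
    ~TaskPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : threads_) t.join();
    }

    TaskPool(const TaskPool&) = delete;
    TaskPool& operator=(const TaskPool&) = delete;

    // Queues a task; any idle worker may pick it up.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        wake_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // only reached when stopping
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can proceed
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

Tasks submitted to the pool still need their own discipline about shared data; the pool only decides which thread runs them. Destroying the pool waits for all queued tasks to finish.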

Edited by dublindan, 19 May 2012 - 01:24 PM.

Conker.io - predictive player behavioral game analytics (facebook | twitter)


#18 edd   Members   -  Reputation: 2049

Posted 19 May 2012 - 11:53 AM

The only thing it ever prints is 3. Using volatile because release build destroys loop in mythread(). Debug mode without volatile prints only 3 as well.


The program has undefined behaviour due to a data race (as the new C++ standard defines it). You must use atomics to avoid this. Even then, the resulting value is unpredictable (though it must be 0, 3, or 6 as seen by an atomic load).
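
A sketch of the atomic variant, assuming `std::atomic<int>` with the default sequentially consistent ordering. The function names are made up. The data race is gone, but which of the permitted values a reader sees at any moment still depends on timing.

```cpp
#include <atomic>
#include <set>
#include <thread>

std::atomic<int> x{0}, y{0}, z{0};

// The quiz statements with every access made atomic.
void atomic_worker(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        x.store(1);
        y.store(2);
        z.store(x.load() + y.load());  // x may already be 4 from the other thread
        x.store(4);
    }
}

// Runs two workers and collects every distinct value of z that was observed.
std::set<int> observe_z(int iterations) {
    std::thread a(atomic_worker, iterations);
    std::thread b(atomic_worker, iterations);
    std::set<int> seen;
    for (int i = 0; i < iterations; ++i) {
        seen.insert(z.load());
    }
    a.join();
    b.join();
    seen.insert(z.load());  // final value after both workers finished
    return seen;
}
```

Every value in the returned set is 0, 3, or 6, but which of them show up in a given run is not predictable.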

#19 JohnnyCode   Members   -  Reputation: 55

-2 Likes

Posted 24 May 2012 - 07:11 PM

Careful!
Implement multithreading only if it benefits you.
Threads take CPU time. There are 45 processes on Windows that take 0 CPU time.
Your thread takes 50 or 100.
If your thread shares memory with the thread that runs it, you must lock the threads and serialize execution between them.
Powerful but edgy

#20 Bacterius   Crossbones+   -  Reputation: 3861

Posted 24 May 2012 - 07:53 PM

Careful!
Implement multithreading only if it benefits you.
Threads take CPU time. There are 45 processes on Windows that take 0 CPU time.
Your thread takes 50 or 100.
If your thread shares memory with the thread that runs it, you must lock the threads and serialize execution between them.
Powerful but edgy

Dude no offense but I have read your few recent posts and you don't seem to know what you are talking about (as well as having a relatively spammy behaviour). Threads don't run each other, they run in parallel and (optionally) share data (which may require synchronization, but also may not - if the data is read-only no synchronization is needed for instance). As for the time spent context-switching threads (is this what you meant by threads taking cpu time?), it is largely negligible unless you have hundreds of threads running in parallel in your own process (which is a bad idea anyhow).
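
The read-only case can be sketched as follows, with invented function names. Two threads read the same table with no lock at all; each writes only to its own result variable, and the table is not modified while they run.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Sums table[begin, end). The table is only read, never written.
long long sum_range(const std::vector<int>& table, std::size_t begin, std::size_t end) {
    long long total = 0;
    for (std::size_t i = begin; i < end; ++i) total += table[i];
    return total;
}

// Two threads read the same table without a lock; each writes only its own result.
long long parallel_sum(const std::vector<int>& table) {
    const std::size_t mid = table.size() / 2;
    long long left = 0;
    long long right = 0;
    std::thread a([&table, &left, mid] { left = sum_range(table, 0, mid); });
    std::thread b([&table, &right, mid] { right = sum_range(table, mid, table.size()); });
    a.join();  // join() makes each thread's result visible to the caller
    b.join();
    return left + right;
}
```

The guarantee holds only as long as nobody writes to the table while the threads are running; one writer anywhere brings back the need for synchronization.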

"The best comment is a deleted comment."
website · blog




