Multithreading: what to thread and what not to?

Started by
10 comments, last by Hodgman 13 years, 3 months ago

A lot of traditional material on "multi-threading" focuses on the practice of having shared memory between threads, synchronised via mutexes/semaphores/etc...

However, this idea of "what can I put onto a thread?" is very outdated. The question is now, "how do I write game code that runs on a modern (multicore, or even NUMA) CPU?". To make a game (engine) that seamlessly scales to multi-core CPUs, you need to be writing things at a completely different level than threads.

This is important: To write multi-threaded code, you shouldn't be dealing with threads.

Threads are used at the lowest level of your multi-core framework, but the end user of that framework (i.e. the game programmer) shouldn't even have "thread" in their vocabulary. You (the low-level framework author) will use threads to implement a higher-level model, such as flow-based programming, or functional programming, or the actor model, or a message-passing interface, or anything where the concept of threads/mutexes isn't required.

Things like DMA transfer alignment, or number of hardware threads, or cache atomicity should all be transparent at the game programming level. The game programmer should just be able to write functions, which your system will execute safely in a multicore environment automagically.
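As a sketch of what that might look like (the API name here is hypothetical, not from any particular engine): the game programmer writes a plain function over their data, and the framework decides how to batch it and spread it across hardware threads. This toy version just splits the range over `std::async` tasks; a real framework would use a persistent job scheduler.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Hypothetical game-level API: the game programmer submits a plain
// function over a container and never sees a thread, mutex, or core count.
template <typename T, typename Fn>
void parallel_for_each(std::vector<T>& items, Fn fn) {
    // Framework side: one batch per hardware thread.
    unsigned n = std::max(1u, std::thread::hardware_concurrency());
    std::size_t chunk = (items.size() + n - 1) / n;
    std::vector<std::future<void>> futures;
    for (std::size_t start = 0; start < items.size(); start += chunk) {
        std::size_t end = std::min(start + chunk, items.size());
        futures.push_back(std::async(std::launch::async,
            [&items, fn, start, end] {
                for (std::size_t i = start; i < end; ++i)
                    fn(items[i]);
            }));
    }
    for (auto& f : futures) f.get(); // all batches complete before returning
}
```

The key point is that the batching policy (chunk size, number of workers, affinity) lives entirely on the framework side, so it can be retuned per platform without touching game code.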


You've got the right idea, but it depends on what your role is. If you're the lead technical guy on a project, you should still be highly knowledgeable about everything down to the lowest level. You still need to deal with threads to some degree, and you need to be aware of the memory model of your target architecture and programming environment.

If you try to develop high-level abstractions that hide threads, you won't be able to take advantage of things like thread affinity to ensure that your thread schedulers are NUMA-aware. Many modern multi-core single-CPU systems are somewhat non-uniform in their memory architecture if you consider the separate L1 cache for each core. If you try to build abstractions that hide the memory model, you may end up with solutions that do not take advantage of cache coherency. Memory management is intrinsically tied to your threading model when it comes to performance and scalability. In languages with manual memory management, unfortunately, the only way to keep things NUMA-aware is to also do manual thread scheduling.
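To make the affinity point concrete, here's a minimal Linux-specific sketch (using the real `pthread_setaffinity_np` call on `std::thread`'s native handle): pinning a worker to one core keeps its working set hot in that core's cache, which is exactly the kind of control a fully thread-hiding abstraction gives up.

```cpp
#include <pthread.h>
#include <sched.h>
#include <thread>

// Pin a std::thread to a specific core (Linux only).
// Returns 0 on success, an errno value on failure.
int pin_to_core(std::thread& t, int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(t.native_handle(), sizeof(set), &set);
}
```

A NUMA-aware scheduler would use this (plus node-local allocation) to keep each worker's tasks and data on the same node, rather than letting the OS migrate threads freely.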

And you can't use a single type of thread pool / task scheduler for all problems in a general manner and expect optimal performance in each case. There's far more than one way to implement a scheduler, and each has different tradeoffs and benefits. Herein lies the problem with general-purpose frameworks and language abstractions. Frameworks like Intel Threading Building Blocks and Microsoft's Parallel Patterns Library are good starting points, but people need to understand that those libraries just give you a hammer, a hand saw, and a screwdriver; when you need a cordless reciprocating saw with a blade capable of cutting through metal pipe, you're out of luck. It's quite possible to out-do these libraries.

I know what you were really getting at is that people shouldn't be sticking to the traditional paradigm of trying to shoehorn in a few extra threads and throwing blocking synchronization primitives around shared data, because that just does not scale at all. You've got to jump in at the deep end and think about concurrency on a whole new level. And you're right that it should be as easy as possible for those working at the highest level writing game logic and game components and whatnot. But you shouldn't deceive yourself that it's possible to hide everything and generalize concurrency for all use cases. You may no longer be shoehorning in a few extra threads, sure; instead you're just trying to force everything into a small handful of concurrency abstractions, and that's not always much better.

it depends on what your role is. If you're the lead technical guy on a project ... You still need to deal with threads to some degree and you need to be aware of the memory model of your target architecture and programming environment.
This is the guy writing the multicore system though, not the guys using it to make "multithreaded" game code.
If you try to develop high-level abstractions that hide threads, you won't be able to take advantage of things like thread affinity to ensure that your thread schedulers are NUMA-aware. ... If you try to build abstractions that hide the memory model, you may end up with solutions that do not take advantage of cache coherency.

What I mean is that at the game level, you can use an abstraction like the data graph on page 26, here, and then there's a platform-specific system that can execute that abstraction via a Windows thread pool, or via SPURS, or a home-made job scheduler, or whatever. Each of those platform-specific systems will decide how to package up the inputs/outputs depending on DMA/cache behaviours/requirements (meaning you do take advantage of all those things).
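A toy sketch of that separation (the `JobDesc` type and backend here are invented for illustration, not taken from the linked slides): game code declares each job's reads and writes, and a backend uses those declarations to order execution. This toy backend runs jobs serially, deferring any job whose input is still going to be written; a real backend would instead dispatch independent jobs to worker threads or SPUs, packaging buffers for DMA as needed.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// A job declares what it reads and writes; the kernel itself is opaque.
struct JobDesc {
    std::vector<const void*> reads;   // input buffers
    std::vector<void*>       writes;  // output buffers
    std::function<void()>    run;     // the work
};

// Toy backend: run each job only once every job that writes one of its
// inputs has finished. (Serial here; a real scheduler would parallelise
// the independent jobs.)
void execute_graph(std::vector<JobDesc> jobs) {
    std::vector<bool> done(jobs.size(), false);
    std::size_t remaining = jobs.size();
    while (remaining > 0) {
        for (std::size_t i = 0; i < jobs.size(); ++i) {
            if (done[i]) continue;
            bool ready = true;
            for (const void* in : jobs[i].reads)
                for (std::size_t j = 0; j < jobs.size(); ++j) {
                    if (j == i || done[j]) continue;
                    for (void* out : jobs[j].writes)
                        if (out == in) ready = false;
                }
            if (ready) { jobs[i].run(); done[i] = true; --remaining; }
        }
    }
}
```

Note that game code never mentions threads: it only states data dependencies, which is precisely the information any backend (thread pool, SPURS, home-made scheduler) needs to make its own packaging and scheduling decisions.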

This topic is closed to new replies.
