Multicore Game Engine

3 comments, last by Antheus 12 years ago
So I'm currently working on a multicore engine with ideas taken from this intel article:
Designing the Framework of a Parallel Game Engine

I'm still trying to figure out if I should separate my engine into some main asynchronous threads or to keep everything in 1 thread and use only the tasks.

My idea so far is to have Game/AI/Physics in one main thread. Graphics would be another main thread that updates asynchronously from the game. Sound output would be its own thread as well, and each network connection would get its own thread.

I have a pretty good mechanism in the works for keeping them all in synch, but I'm still wondering if this is going to give me the benefits I'm thinking I will get.

Multiple Asynchronous threads:

Pros:
Game loop updates at 60 times per second and is completely asynchronous from graphics or audio...

Cons:
The engine is a bit more complicated and a little harder to keep in synch but not really...

One main thread:

Pros:
A bit simpler. It may or may not perform better.

Cons:
The Game thread will need to have its timestep fixed if things like graphics slow it down to less than 60 FPS.
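
For reference, the fixed timestep mentioned above is often done with a time accumulator. This is only a minimal sketch under assumptions: `FixedStepper`, `advance` and the `max_steps` cap are hypothetical names for illustration, not anything from the Intel article or the engine described here.

```cpp
#include <cassert>

// Hypothetical helper: carries leftover frame time between frames and reports
// how many fixed-size simulation steps are due right now.
struct FixedStepper {
    double step;               // seconds per simulation step, e.g. 1.0 / 60.0
    double accumulator = 0.0;  // time that has passed but is not yet simulated
    int max_steps = 5;         // assumed cap so one slow frame cannot snowball

    explicit FixedStepper(double step_seconds) : step(step_seconds) {
        assert(step_seconds > 0.0);
    }

    // Call once per rendered frame with the measured frame time in seconds.
    int advance(double frame_seconds) {
        accumulator += frame_seconds;
        int steps = 0;
        while (accumulator >= step && steps < max_steps) {
            accumulator -= step;
            ++steps;
        }
        // Still a full step behind after hitting the cap: drop the backlog.
        if (accumulator >= step) accumulator = 0.0;
        return steps;
    }
};
```

The game loop would call `advance(frame_time)` once per frame and run that many game updates, so the simulation keeps its 60 Hz step even when rendering drops below 60 FPS.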
I've yet to see a common situation where you can do better than a primary thread and a task pool*. It's extremely hard to get synchronization right across multiple threads especially when they don't have very explicit sync points.

In short, unless you really know you need to mess with fine-grained threading, I'd say do the simple thing and just write a master thread that issues parallelizable work to your task thread pool. It'll be a lot less of a headache and you can leverage libraries like Intel's TBB et al. to do the heavy lifting.


*NB for obsessive completeness: there are situations where you can beat out a task pool using fine-grained threading. It's very difficult, though, and not at all common.
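
A rough sketch of the "primary thread issues parallelizable work" shape, under stated assumptions: `std::async` stands in for a real task pool here (it is not one, and it is not TBB), and `update_entity` and `update_all` are hypothetical names made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical per-entity update; stands in for any independent piece of work.
inline void update_entity(float& position, float dt) { position += dt; }

// Called from the primary thread: fan the work out in chunks, then wait for
// all of it before the frame continues.
void update_all(std::vector<float>& positions, float dt, std::size_t task_count) {
    assert(task_count > 0);
    // Round up so every element lands in some chunk.
    const std::size_t chunk = (positions.size() + task_count - 1) / task_count;
    std::vector<std::future<void>> pending;
    for (std::size_t begin = 0; begin < positions.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, positions.size());
        // Each task touches a disjoint index range, so no locking is needed.
        pending.push_back(std::async(std::launch::async,
                                     [&positions, dt, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                update_entity(positions[i], dt);
            }
        }));
    }
    for (auto& task : pending) task.get();  // the one explicit sync point
}
```

The only synchronization is the join at the end, which is what keeps this approach easy to reason about. A real pool would reuse its worker threads rather than launching new ones per call.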

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

I actually do have a way to get things staying in synch.

There are explicit synch points in each main thread. They pull in updates at each update from other threads using the system that keeps things in synch. I'm currently working on coding it, but I have it all thought out.

At this point, the simpler thing for me is to keep the split main threads, since I've already come this far. I might go back to one main thread later if it performs enough better to be worth it, which is what I'm not really sure about...
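
The actual sync system isn't shown in this thread, so the following is only a guess at the general shape of "pull in updates at an explicit sync point". `Mailbox`, `post` and `collect` are hypothetical names, and a mutex-guarded queue is an assumption, not the poster's design.

```cpp
#include <cassert>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical mailbox: other threads post changes whenever they like, and
// the owning thread picks them up only at its own sync point.
template <typename T>
class Mailbox {
public:
    // Safe to call from any thread.
    void post(T change) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(change));
    }

    // Sync point: take everything posted so far in one short critical
    // section, leaving the mailbox empty for the next round.
    std::vector<T> collect() {
        std::vector<T> out;
        std::lock_guard<std::mutex> lock(mutex_);
        out.swap(pending_);
        return out;
    }

private:
    std::mutex mutex_;
    std::vector<T> pending_;
};
```

Each main thread would own one mailbox and call `collect()` once at the top of its update, so the lock is held only for a swap and never while the changes are being applied.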
There are two reasons to use multiple threads:
A) You've got processor intensive routines that you can run in parallel.
B) You have to call a slow, blocking function, but want it to be asynchronous.

In any case, if you're running more threads than your CPU has 'hardware threads' then it's called over-subscription, and is a waste of performance. The opposite, where you're running fewer threads than your CPU can handle, is called under-subscription, and is a waste of potential. Ideally, your game will make use of the same number of threads that is supported by the CPU (or fewer, to allow for the OS or other apps).
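
As a small sketch of sizing the thread count to the machine: `std::thread::hardware_concurrency()` is standard C++, but `pick_worker_count`, `default_worker_count` and the choice to reserve one hardware thread are assumptions made for the example.

```cpp
#include <cassert>
#include <thread>

// Hypothetical policy: one worker per hardware thread, minus a few left for
// the OS and other apps, but never fewer than one.
unsigned pick_worker_count(unsigned hardware_threads, unsigned reserved) {
    if (hardware_threads == 0) return 1;  // count unknown: stay conservative
    return hardware_threads > reserved ? hardware_threads - reserved : 1;
}

unsigned default_worker_count() {
    // hardware_concurrency() is only a hint and is allowed to return 0.
    return pick_worker_count(std::thread::hardware_concurrency(), 1);
}
```

With this, the same build uses one worker on a dual-core and five on a hex-core, instead of a fixed four threads everywhere.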

My idea so far is to have the Game/AI/Physics be one main thread. The graphics be another main thread that updates asynchronously from the game. The sound output [and networking] would be its own thread as well.
So you've got a fixed number of 4 threads (assuming one network connection), which means you're targeting quad-core CPUs only. On dual-core, you'll be over-subscribed, and on hex-core you'll be under-subscribed.

Each of your threads above has a very different workload. The Game/AI/Physics thread is going to have a very large workload, the graphics one a mild workload, and the audio/network threads are going to be almost entirely idle. Ideally, each of your threads would have almost identical workloads.

AI/Physics/Graphics are a case of (A) -- you've got a lot of work to do, which can largely be done in parallel. Ideally, you'd split the workload of each of these across all of your threads to balance it over the CPU, and use its full potential.

Networking is a case of (B) above -- instead of wrapping a blocking API in threads to make it asynchronous, just use an asynchronous networking API to begin with -- so you don't need separate threads dedicated solely to networking.
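
One common way to avoid a dedicated network thread is to poll non-blocking sockets once per frame. To be clear about the swap: this is readiness polling with POSIX `poll` and `recv`, not a completion-based asynchronous API, and it is POSIX-only. `drain_ready` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read whatever bytes are already waiting on `fd` and
// return immediately. It never blocks, so the main loop can call it each frame.
std::string drain_ready(int fd) {
    assert(fd >= 0);
    std::string out;
    pollfd p{fd, POLLIN, 0};
    // A timeout of 0 makes poll() report readiness without waiting.
    while (poll(&p, 1, 0) > 0 && (p.revents & POLLIN)) {
        char buf[256];
        const ssize_t n = recv(fd, buf, sizeof buf, MSG_DONTWAIT);
        if (n <= 0) break;  // peer closed, or nothing left to read
        out.append(buf, static_cast<std::size_t>(n));
    }
    return out;
}
```

The game thread would call `drain_ready` on each connection during its update and parse whatever came back, so no thread ever sits blocked on a socket.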
There are explicit synch points in each main thread. They pull in updates at each update from other threads using the system that keeps things in synch. I'm currently working on coding it, but I have it all thought out.

Downsides of this approach:
- aliasing (similar to the moiré effect). Since work units are discrete, a long-running task or an uneven number of work units will cause sporadic delays, sometimes just enough to require one more frame. FPS will therefore jitter. The only solution is to ensure tasks are always small enough, which is difficult to do on diverse hardware (PC). This may be one cause of so-called micro-stutter, though that is more frequently caused by interrupts (USB or poor drivers).
- guaranteed sub-linear scaling. Because it syncs explicitly, a system like that will be consistently sub-optimal. For the reason above it may be vastly so, to the point where the benefit of threading is lost (there are only 4 threads at best, and a factor-of-4 loss happens quickly).

The above issues can be mitigated, but the solutions may quickly become too complex.
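
To put rough numbers on the aliasing point, here is a toy model under simplifying assumptions: all tasks take equal time, none can be split, and scheduling overhead is ignored. `frame_ms` is a hypothetical helper, not a measurement of any real engine.

```cpp
#include <cassert>

// Wall-clock time for `tasks` equal-length, indivisible tasks spread over
// `workers` threads: they run in ceil(tasks / workers) rounds.
int frame_ms(int tasks, int workers, int task_ms) {
    assert(tasks >= 0 && workers > 0 && task_ms >= 0);
    const int rounds = (tasks + workers - 1) / workers;  // integer ceiling
    return rounds * task_ms;
}
```

In this model, 16 tasks of 4 ms on 4 workers take 16 ms and fit a 60 FPS frame, while 17 tasks take 20 ms and miss it: one extra work unit costs a whole extra round, during which three of the four workers sit idle.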

