multi-thread

Started by
10 comments, last by frob 10 years, 7 months ago
The Nexus 7 currently has 4 cores. Almost every processor sold now seems to have >1 core.
To take advantage of this, I think you need more than one thread. In a game, what could the threads do? Is there a way to take a single-threaded game and make it multi-threaded so it takes advantage of multiple cores?
I remember Intel had a hyper-threading contest, but I couldn't enter my game because it only used one thread, I think.

You could do almost anything in separate threads, for example:

  • entity updating
  • AI calculation (e.g. pathfinding)
  • physics
  • collision detection
  • anything that's not talking to the graphics API

Jan F. Scheurer - CEO @ Xe-Development


Code runs in a linear manner, and for the most part this is fine for what you'll need. Take for example this:


//Gotta do this first
DoThis();

//Gotta do that second
DoThat();

You can't do that until you've done this. Execution moves in one linear motion from top to bottom. For most programs you'll only ever need to do one thing at a time and this will work fine, but for game programming this can be a problem. Look at this for example:


//Gotta do rendering
DoRendering();

//Gotta do some physics calculation
DoPhysics();

//Can't forget about the AI
DoAI();

All of those functions usually take a long time to run, and doing them in series like this can cause major lag. By putting each of them in a different thread they can run concurrently, without one having to wait for another to finish. Of course, you'll need to design your program to keep their asynchronous behavior in check.

This can be used for many different reasons, but just because you can doesn't mean you should. Threading can get quite confusing and lead to issues if you have no idea what you're doing. Even for some big games, multi-threading may not be necessary.

I was somewhat hoping the old "let's put each module in a separate thread" approach had died out a while ago, when developers realized that having all these threads constantly compete for the same resources is a messy nightmare of synchronization and race condition bugs, unless you make sure that all the modules working in parallel only read the shared data. Alternatives usually involve either a lot of copying or a lot of locking. Don't be surprised if the overhead ends up making the whole thing pointless.

A first step is to simply use concepts like parallel_for to split up one huge chunk of work to be processed in parallel (note: make sure each element can be processed independently of others). After that, look into task based parallelism. Intel's TBB library and especially its documentation might be a good place to start.
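As a rough sketch of the parallel_for idea, here is a version that uses only standard C++ threads instead of TBB. The helper name parallel_for_chunks and the integrate example are made up for illustration; TBB's real parallel_for has a different interface and a proper work-stealing scheduler, so treat this as a picture of the concept rather than a replacement.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: calls body(i) for every i in [0, count), splitting the
// range into contiguous chunks, one chunk per worker thread.
// body(i) must not touch data that belongs to another index.
template <typename Body>
void parallel_for_chunks(std::size_t count, unsigned workers, Body body) {
    assert(workers > 0);
    const std::size_t chunk = (count + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> threads;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(count, w * chunk);
        const std::size_t end = std::min(count, begin + chunk);
        if (begin == end) break;  // nothing left to hand out
        threads.emplace_back([begin, end, &body] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (std::thread& t : threads) t.join();  // wait for every chunk
}

// Example use: advance each position by its own velocity.
// Element i only reads and writes index i, so no locking is needed.
void integrate(std::vector<float>& pos, const std::vector<float>& vel, float dt) {
    assert(pos.size() == vel.size());
    parallel_for_chunks(pos.size(), 4, [&](std::size_t i) { pos[i] += vel[i] * dt; });
}
```

Creating threads on every call is expensive, which is one reason a task library with a persistent thread pool is usually the better long-term choice.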

f@dzhttp://festini.device-zero.de

So, threads serve a couple of different purposes in different situations. First, even on single-core machines 2-4 threads are a good idea when used right. In this case the goal is to let some work keep getting done while one thread is blocked on IO. The oldest example of this is spawning a thread for networking, so that the main thread just QUEUES a network message to send, but the network thread sends it. The network thread also listens for incoming messages, but notifies the main thread only when a complete and valid message has been received. Another great example is having the AI logic run in background threads. For this kind of threading you usually have almost every thread locked down while UPDATING your game state in your game loop (i.e. during the "Tick()" part of the game engine) ... but when the update phase is over, all the different threads can go to work, doing screen rendering, autosave writing, AI processing, etc. Many of those tasks block on IO, so it can help dramatically.
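The "main thread only queues, network thread sends" arrangement could be sketched like this. MessageQueue and run_fake_network are hypothetical names, the "network" here just collects strings instead of writing to a socket, and the code assumes C++17 for std::optional.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical thread-safe queue: the main thread pushes, the network thread pops.
class MessageQueue {
public:
    void push(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(msg));
        }
        ready_.notify_one();
    }
    // Tells the consumer that no more messages will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until a message is available; returns nullopt once closed and drained.
    std::optional<std::string> wait_pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) return std::nullopt;
        std::string msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> queue_;
    bool closed_ = false;
};

// Stand-in for the network thread: a real one would write to a socket.
std::vector<std::string> run_fake_network(const std::vector<std::string>& outgoing) {
    MessageQueue queue;
    std::vector<std::string> sent;
    std::thread network([&] {
        while (auto msg = queue.wait_pop()) sent.push_back(*msg);
    });
    for (const auto& m : outgoing) queue.push(m);  // main thread only queues
    queue.close();
    network.join();  // after join it is safe to read 'sent'
    assert(sent.size() == outgoing.size());
    return sent;
}
```

The main thread never blocks on the "send" itself; it only holds the lock for the duration of a push.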

Now the other, primary purpose of multiple threads nowadays is to use multiple cores, which A) gives even more benefit, but B) has a lot more things to consider. Since the different cores have separate caches, performance is somewhat lower when they all have to read the same data than when they can work on different data sets. But those different data sets can be subsections/partitions/tiles of a greater common set of data. For instance, if you are procedurally generating a map, then depending on your algorithm you could possibly "tile" your world and let each thread work on separate tiles. In that case life is easy: you write 1 function doing 1 job, and you just parallelize it. Most good threading is a bit harder to optimize. But a general rule of thumb is to first try using a separate thread for each logically distinct primary function or code path (up to about 2-3 per core depending on architecture), and then profile the code during testing to see where the "blocking" occurs. If certain threads block each other a lot, just consolidate those threads ... it is FAR easier to have 18 logical threads in a game that you adjust down to 6 or 8 during tuning ... than the other way around. Splitting something into threads is almost as hard as writing it ... putting 2 functions into 1 thread is a 1 minute operation.
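For the tiled map case, a minimal sketch using std::async might look like the following. The generate_tile function and its hash constants are invented purely to have something deterministic to compute; the point is only that each task writes its own tile and nothing else.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

using Tile = std::vector<std::uint8_t>;

// Hypothetical generator: fills one tile from its coordinates only, so tiles
// never read or write each other's data.
Tile generate_tile(int tile_x, int tile_y, int size) {
    Tile tile(static_cast<std::size_t>(size) * size);
    for (int y = 0; y < size; ++y) {
        for (int x = 0; x < size; ++x) {
            const std::uint32_t wx = static_cast<std::uint32_t>(tile_x * size + x);
            const std::uint32_t wy = static_cast<std::uint32_t>(tile_y * size + y);
            tile[static_cast<std::size_t>(y) * size + x] =
                static_cast<std::uint8_t>((wx * 73856093u) ^ (wy * 19349663u));
        }
    }
    return tile;
}

// Launches one task per tile and collects the results in row-major order.
std::vector<Tile> generate_world(int tiles_x, int tiles_y, int size) {
    assert(tiles_x > 0 && tiles_y > 0 && size > 0);
    std::vector<std::future<Tile>> jobs;
    for (int ty = 0; ty < tiles_y; ++ty)
        for (int tx = 0; tx < tiles_x; ++tx)
            jobs.push_back(std::async(std::launch::async, generate_tile, tx, ty, size));
    std::vector<Tile> world;
    for (auto& job : jobs) world.push_back(job.get());  // get() waits for that tile
    return world;
}
```

One task per tile is fine for a handful of tiles. With hundreds of tiles you would want to batch them so the number of running threads stays near the core count.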

Simple / Standard Thread Ideas:

* Networking (totally separate from all others, and interacts via Message/Event queue).

* UI Receiving (accepting UI events and turning them into commands - can be queued or unqueued)

* Main Game / Update Loop (i.e. your game logic, physics engine, etc.: everything that WRITES to the main game entities). This is often more than one function, run sequentially rather than in separate threads .. UNLESS you have the ability to "partition" your data set (for example, in a turn-based sci-fi game you could process each solar system independently and use multiple threads to do so).

* AI player decision making

* File loading / saving

There are many other ways to use threads ... but the above apply to most games.

1) Don't assume the other cores are idle. The OS may be using one or more of them in the background, and that already improves your performance by not making the core you're using wait.

2) THREAD LIGHTLY. Threads open up the way to all manner of possible problems, and implemented naively they can and will harm performance rather than helping it.

3) General concurrency is a better perspective for taking advantage of multiple cores. Threading is not the only means of implementing concurrency.

4) Do not prematurely optimize. If you have no performance issues then don't waste time fixing what isn't broken, especially at the cost of consuming additional system resources.

5) In simple games, audio processing is a good first target for threading, since it generally does not need to interact much with the main program. You can simply send messages to the audio thread to control it like a jukebox.

6) Network updating can be used with threading, but it's not completely necessary. If implemented incorrectly this can cause excess work with no performance gain, so again, make an informed decision rather than just threading because you can.

Threading is one of those areas where you really want to spend the time to learn about all the features, quirks, and potential pitfalls before using it in a project. It often creates a lot of code work and can so very easily backfire either by causing hellaciously difficult bugs or just by doing everything correctly but ending up harming performance rather than helping it due to resource contention. I'd recommend picking a thread library, getting familiar with all of the available features, then using that as your jumping off point for studying articles about good and bad uses of threading, paying close attention to the reasons why the bad things are bad.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

OK, except that ai(), render() and physics() all need to access the same game state data, and all of them need that data to be stable and coherent. How would you resolve that?

I don't know :) I guess it's something for you to figure out. What I said wasn't the answer, nor did I say exactly how it should be done. It was a hasty example just to show the idea behind threading and why it may be used, not necessarily how to do it. Sadly I don't know, I haven't done it yet... Though once my engine reaches the need for multi-threading... I can't wait to jump in ^_^

Yeah, without that your example is not very 'usable' ;-) Though you're right, I can try to figure some things out here (and maybe add some explanation).

For example, it seems rendering can be parallelised because it only reads the game state, so you could just make a binary copy of it, render the copy, and update the original state at the same time.

But AI movement and physics object movement would be harder, because in general movements may interact and collide. Even if you resolve all the detailed collisions with locks, I think there is still the problem of race conditions: the outcome of an otherwise deterministic frame update would differ depending on how the threads race over the entities in time. I am not sure whether such races are acceptable to some extent or should be avoided altogether (I would prefer to avoid them).

So it is a mess. Maybe it could be parallelised in some other way: if some lengthy routines can be found that are independent of each other in time and in the memory they touch, they could be run in parallel. What is lengthy and what can be done in parallel depends on practical knowledge of the game's internals, and it is made harder by the fact that you need to parallelise at a small time granularity (a frame is about 20 milliseconds). So I do not know whether there is a general answer.
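The copy-then-render idea from this post could look roughly like the sketch below. Entity, GameState and the functions are hypothetical, and render() just sums positions as a stand-in for drawing. It only shows the safe case: the renderer reads a private copy while update() is the single writer of the live state.

```cpp
#include <cassert>
#include <thread>
#include <vector>

struct Entity { float x = 0.0f; float vx = 0.0f; };
using GameState = std::vector<Entity>;

// Stand-in for drawing: only reads the snapshot it was given.
float render(const GameState& snapshot) {
    float sum = 0.0f;
    for (const Entity& e : snapshot) sum += e.x;
    return sum;
}

// The only writer of the live state.
void update(GameState& state, float dt) {
    assert(dt >= 0.0f);
    for (Entity& e : state) e.x += e.vx * dt;
}

// One frame: copy the state, then render the copy while the original advances.
float run_frame(GameState& state, float dt) {
    const GameState snapshot = state;  // copy taken before any thread starts
    float drawn = 0.0f;
    std::thread renderer([&] { drawn = render(snapshot); });
    update(state, dt);                 // runs concurrently with render()
    renderer.join();                   // after join, 'drawn' is safe to read
    return drawn;
}
```

The cost is one full copy of the state per frame, which is the copying overhead mentioned earlier in the thread. It does nothing for the harder problem of AI and physics writing to the same entities.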

Some multi-threading tips

One of the biggest issues in multi-threading is actually benefiting from it, so keep things REALLY simple. The reason is that you want very few places where concurrency issues can occur.

So:

1. An example:

I create a class that's specifically for one purpose only: for exactly one (1) thread to work on. That means that if I have 4 worker threads, I need 4 objects.

The threads work on those objects, and once they're done I get some result which I can use in whichever way I like.

An example here is particles. On the one side I have the particle class, and on the other I have the renderable particles list.

I can send a job to a thread to output a mesh of renderable points, which on the rendering thread really only requires one thing: a boolean variable telling the rendering thread whether or not the particles were updated. If they were, I upload them to the graphics API. Keeping it simple and all that.

Or even simpler, have the physics thread do the particle updates, and use the same method to "send" them to the rendering thread.

To accomplish this I'd use a mutex to prevent a small number of variables from being written to while I try to read them. These variables can be, for example:

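struct renderable_particles_t
{
    particle_vertex_t* particles;  // renderable points produced by the worker
    int renderCount;               // how many of them to draw
    bool updated;                  // true when new data is waiting for upload
};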

That's all I need to be able to upload the particle data to the GPU, and so the solution is probably sound :)
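One possible way to wire the mutex around that struct is sketched below. This is a variation, not the poster's code: a std::vector replaces the raw pointer and count, the mutex lives inside the struct, and publish and take_if_updated are hypothetical names.

```cpp
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

struct particle_vertex_t { float x, y, z; };

// Variation on the struct above: a std::vector replaces the raw pointer and count.
struct renderable_particles_t {
    std::mutex mutex;
    std::vector<particle_vertex_t> particles;
    bool updated = false;
};

// Worker side: 'fresh' was built outside the lock, so the lock is only held
// for a swap and a flag write.
void publish(renderable_particles_t& shared, std::vector<particle_vertex_t> fresh) {
    std::lock_guard<std::mutex> lock(shared.mutex);
    shared.particles.swap(fresh);
    shared.updated = true;
}

// Render side: returns true and fills 'out' only when new data is waiting.
// 'out' is what the upload to the graphics API would read from.
bool take_if_updated(renderable_particles_t& shared, std::vector<particle_vertex_t>& out) {
    std::lock_guard<std::mutex> lock(shared.mutex);
    if (!shared.updated) return false;
    out.swap(shared.particles);
    shared.updated = false;
    return true;
}
```

Because both sides only swap buffers while holding the lock, neither thread ever waits on the other for longer than a few pointer exchanges.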

Note that this "update" needs to happen in synch with the camera, so that the particles flow perfectly. The same goes for all other synchronizations that are to be rendered.

2. The things that must be threaded

Typically sound and networking. Both are so much simpler when threaded: you can use blocking sockets, and you can just let the sound system do its thing in peace.

You will need to synchronize your interactions with the threads, but that is easy enough. For sounds you only really send information one way, since you want to play sounds or music.

For networking it may be a little bit more complicated. I don't know what people usually do, but I use queues for both read and write. That's 2 queues, but only one synchronization window. Once you hold the lock, send as much as you can, read as much as you can, then release the lock. What you've read is now in the read queue on your side, and you can do whatever you want with it, while the write queue is for the network system to work with.
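A minimal sketch of the "two queues, one synchronization window" approach, with the class name NetMailbox and its methods made up for illustration. The game thread calls exchange() once per frame; the network thread uses the other two methods.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mailbox shared between the game thread and the network thread.
class NetMailbox {
public:
    // Game thread, once per frame: hand over everything queued for sending and
    // collect everything received, all inside a single lock.
    void exchange(std::vector<std::string>& outgoing, std::vector<std::string>& incoming) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto& m : outgoing) to_send_.push_back(std::move(m));
        outgoing.clear();
        for (auto& m : received_) incoming.push_back(std::move(m));
        received_.clear();
    }
    // Network thread: store a complete message that arrived from the socket.
    void deliver(std::string msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        received_.push_back(std::move(msg));
    }
    // Network thread: take everything that is waiting to be sent.
    std::vector<std::string> take_outgoing() {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::exchange(to_send_, {});
    }
private:
    std::mutex mutex_;
    std::vector<std::string> to_send_;
    std::vector<std::string> received_;
};
```

The game thread touches shared data exactly once per frame, which keeps the number of places where concurrency bugs can hide small.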

3. There are a few instances where you don't have to lock, but make sure it's not something important

Imagine that your player can crouch, and that the crouching flag is only written to on the "physics" thread (let's just call it that). Well, here's the trick:

Only set it ONCE. As in, do every test you have to do to figure out if the player is crouching, but at the end, set it once. If you do that, the rendering thread can happily read from the variable, and even if it just missed the change, at 60fps no one will care or ever notice that there was a 1 frame lag

The reason we can do this with many such things, such as crouching, is that it's not something that changes often. It relies heavily on human reaction times

Note that this opens up a can of worms called cache incoherency, which could reduce your performance if you go overboard with it

It can help you wrap your head around threading, because as long as only one thread writes the variable and the others only read it, everything is good

The problem is that if you spread the variables you are reading from other threads all around memory, the CPU will spend time synchronizing caches between cores. You get around this issue by collecting the data you need and placing it in a "tight" container, so that the memory is linear and close together. Then you synchronize only once each "round". This is essentially what is being done in #2, and it's almost always the best solution.

I'm not saying #3 is useless, and I have absolutely no clue how bad things get when you read from multiple locations each frame.. I'd better read up on it
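One caveat for C++ specifically: reading a plain bool in one thread while another thread writes it is formally a data race, even with a single writer. A std::atomic<bool> with relaxed ordering keeps the "no lock, set it once" idea and makes it well defined. The sketch below assumes that substitution; PlayerFlags and the crouch rules are invented for the example.

```cpp
#include <atomic>

struct PlayerFlags {
    std::atomic<bool> crouching{false};
};

// Physics side: run every test on a local variable, then publish exactly once.
void physics_update(PlayerFlags& flags, bool crouch_key_down, bool has_headroom,
                    bool was_crouching) {
    bool crouch = crouch_key_down;
    if (was_crouching && !has_headroom) crouch = true;  // cannot stand up under a low ceiling
    flags.crouching.store(crouch, std::memory_order_relaxed);
}

// Render side: reads whatever value was published most recently, without a lock.
bool render_sees_crouch(const PlayerFlags& flags) {
    return flags.crouching.load(std::memory_order_relaxed);
}
```

Relaxed ordering is only appropriate because nothing else depends on the flag. If the renderer also had to see other data written before the flag, it would need release/acquire ordering or a lock.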

4. Lockless, double-checking, degree of parallelism, keeping it real

Lockless: The absolute nightmare, but everyone needs to know about these..

Double-checking: The absolute don't-do-this of multi-threading

Degree of parallelism: Some things are easy to implement multi-threaded, some aren't.. Personally I think really hard before I throw threads at the problem

Finally, keeping it real:

If your game feels laggy or stutters, it's very likely that you are doing work you don't "have" to do NOW or immediately, instead of skipping it

There are many, many ways to avoid doing work in games, and they all contribute to your game feeling smooth and awesome :)

Some examples:

1. Doing something every Nth frame

2. Doing something every Nth * depth frame (the further away it is, the less frequently we update it)

3. Simply checking if we are close to running out of time, typically 16.7ms (0.0167 seconds), and if we are, return immediately

4. Making sure that the graphics API is able to run through a frame without waiting for something to finish

The last one is probably surprising to many, but very important!
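The first three ideas in that list could be sketched as small helpers. All names here are hypothetical, and the depth rule (interval grows linearly with depth) is just one assumed way to read "Nth * depth".

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// True on every nth frame; 'offset' lets different systems land on different frames.
bool is_due(unsigned long frame, unsigned n, unsigned offset = 0) {
    return n != 0 && (frame + offset) % n == 0;
}

// Farther objects get a longer interval: depth 0 updates every base_n frames,
// depth 1 every 2 * base_n frames, and so on.
unsigned interval_for_depth(unsigned base_n, unsigned depth) {
    return base_n * (depth + 1);
}

// Tracks whether the frame budget (about 16.7 ms at 60 fps) has been used up.
class FrameBudget {
public:
    explicit FrameBudget(std::chrono::microseconds budget)
        : deadline_(Clock::now() + budget) {}
    bool exhausted() const { return Clock::now() >= deadline_; }
private:
    Clock::time_point deadline_;
};

// Runs optional jobs until they run out or the budget does.
// Returns how many jobs were completed this frame.
template <typename Job>
int run_optional_work(int job_count, const FrameBudget& budget, Job job) {
    int done = 0;
    while (done < job_count && !budget.exhausted()) job(done++);
    return done;
}
```

A caller would typically remember the returned count and resume from that job on the next frame, so deferred work is delayed rather than dropped.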

Here is a real example of avoiding work:

I have a giant list of particles, but I measured that only 1/3 of all the particles were in view of the camera at the time of rendering. So I took the dot product between direction(player, particle) and the camera look vector, and never added particles that weren't "in front of the camera" to the list. A very cheap calculation that helps me avoid rendering 66% of the particles in the simulation.
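That dot product test could be written roughly as follows. Vec3 and the function names are hypothetical, and the sketch assumes the player position and the camera eye position are the same point.

```cpp
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Keeps only particles in the half-space in front of the camera.
// No normalization is needed: only the sign of the dot product matters.
std::vector<Vec3> cull_behind_camera(const std::vector<Vec3>& particles, Vec3 eye, Vec3 look) {
    std::vector<Vec3> visible;
    for (const Vec3& p : particles)
        if (dot(sub(p, eye), look) > 0.0f) visible.push_back(p);
    return visible;
}
```

This keeps everything in front of the camera plane, which is more than the actual view frustum, so it is a cheap first filter rather than exact visibility.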

Finally:

When it comes to multi-threading - if there's an easier way, the easier way is the better way

And don't take my examples as being good.. it's just where I'm at for the moment

This topic is closed to new replies.
