Multithreading games


I'm just starting to program a game and was thinking about making it multithreaded. Anyone with multithreaded game programming experience wanna give me some advice on how to implement it? If the workload were divided equally between threads, would performance double on dual-core machines?

Quote:
Original post by etsuja
I'm just starting to program a game and was thinking about making it multithreaded. Anyone with multithreaded game programming experience wanna give me some advice on how to implement it? If the workload were divided equally between threads, would performance double on dual-core machines?


No. Some part of a program is inherently serial, and thus won't benefit from threading. See Amdahl's law for details.
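To put a number on that, here is a minimal sketch of Amdahl's law. The function name and the 60% figure are made up for illustration; they are not measurements from any real game.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Hypothetical frame where 60% of the work parallelises and 40% stays serial.
print(round(amdahl_speedup(0.6, 2), 2))  # 1.43
# Only a perfectly parallel workload would actually double on two cores.
print(round(amdahl_speedup(1.0, 2), 2))  # 2.0
```

Thread management and synchronization overhead come on top of this, so real gains tend to be lower than the bound.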

There are featured articles on Gamasutra that deal with this (they might need a free registration):
Multithreaded game engine architecture
Threading 3D game engine basics.

HTH,

IMHO unless you actually need[1] it you should avoid having multiple threads as much as possible. They'll add considerable complexity, and a whole range of bugs, most of which are hard to reproduce and test. If you're just doing this for fun then multithreading is a whole heap full of headaches that you can do without (especially if you've not even finished a game yet).

[1] I.e. you know your target hardware will have multiple processors and you're already CPU limited.

Quote:
Original post by OrangyTang
IMHO unless you actually need[1]

[1] I.e. you know your target hardware will have multiple processors and you're already CPU limited.

Heh. I've said exactly the same thing, almost word-for-word. But there are some exceptions, most notably background resource loading. Think GTA, Dungeon Siege, and any other game with a seamless world and no loading screens. You can't do that with one thread. Or if your game involves a CPU-intensive simulation and you want to maintain a decent framerate, multithreading may be beneficial.
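A minimal sketch of the background-loading case, assuming a single worker thread and two queues. `BackgroundLoader`, `load_asset` and the asset name are hypothetical names for illustration, not any engine's API.

```python
import queue
import threading

def load_asset(name):
    # Stand-in for slow disk I/O; a real loader would read and decode a file.
    return f"data for {name}".encode()

class BackgroundLoader:
    def __init__(self):
        self._requests = queue.Queue()   # main thread -> worker
        self._finished = queue.Queue()   # worker -> main thread
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def request(self, name):
        self._requests.put(name)

    def poll(self):
        """Call once per frame from the main thread; never blocks."""
        done = []
        while True:
            try:
                done.append(self._finished.get_nowait())
            except queue.Empty:
                return done

    def shutdown(self):
        self._requests.put(None)         # sentinel: stop after pending requests
        self._worker.join()

    def _run(self):
        while True:
            name = self._requests.get()
            if name is None:
                return
            self._finished.put((name, load_asset(name)))

loader = BackgroundLoader()
loader.request("city_block_7")
loader.shutdown()                        # joined here only to make the demo deterministic
print(loader.poll())
```

The only shared state is the two queues, which is why this kind of thread is comparatively easy to reason about.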

If your game has none of these requirements, please don't adopt a solution in search of a problem.

Quote:
Original post by drakostar
Heh. I've said exactly the same thing, almost word-for-word. But there are some exceptions, most notably background resource loading. Think GTA, Dungeon Siege, and any other game with a seamless world and no loading screens. You can't do that with one thread. Or if your game involves a CPU-intensive simulation and you want to maintain a decent framerate, multithreading may be beneficial.

If your game has none of these requirements, please don't adopt a solution in search of a problem.

Aye, background loading fits into the 'need' category, as it's pretty much the only way to do it. Depending on the libraries you're using, sound and networking may also require additional threads, but they're usually quite easy to deal with as they only interact with a small section of your code.

I think what I'll do is try to have the physics and maybe some AI running on separate threads.

Quote:
Original post by etsuja
I think what I'll do is try to have the physics and maybe some AI running on separate threads.


The AI would most likely depend on the physical representation of the world. How would you handle that? You could lock all physics objects before modifying them, but this could introduce very heavy slowdowns unless done very well (then it would only introduce big slowdowns).

You should also be aware of a consistency problem. Let's say the physics engine is updating frame n's content to what frame n+1 should look like: sometimes the physics engine will modify an object first, and at other times the AI will read it first. So some of the AI (in some cases even the same agent) would depend on the n'th frame's geometry, while other parts would depend on the (n+1)'th frame's geometry. In extreme cases the AI could even be split across more than two frames' content. This problem could be eliminated by making sure that the AI only touches a region once the physics engine has updated it, but then the AI engine is going to spend a lot of time waiting and has to "follow" the physics engine. Wasn't this the kind of stuff we were trying to get away from?

You could also choose to copy the whole physical representation before making any changes to it, so that the physics engine and the AI engine work with separate data. This would eliminate almost all shared data, but it would be expensive memory-wise and could also result in the AI working on earlier frames than the physics engine.

This was just to show you that it might not be as simple as it seems, now imagine if you also have to consider the graphics engine working on a copy of the same content and maybe even other engines too.
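The "copy the whole representation" option can be sketched roughly like this. `WorldSnapshot` and the toy physics and AI functions are invented for illustration, and the sketch assumes the state is small enough to rebuild every frame.

```python
import threading
from dataclasses import dataclass

@dataclass(frozen=True)
class WorldSnapshot:
    frame: int
    positions: tuple  # immutable, so a reader never sees a half-updated frame

def physics_step(prev):
    # Toy "physics": move every object one unit along x and build frame n+1.
    return WorldSnapshot(prev.frame + 1, tuple(x + 1.0 for x in prev.positions))

def physics_job(snapshot, out):
    out.append(physics_step(snapshot))

def ai_job(snapshot, decisions):
    # The AI only reads the snapshot it was handed, i.e. frame n.
    decisions.append((snapshot.frame, max(snapshot.positions)))

def run_frames(start, frames):
    front, decisions = start, []
    for _ in range(frames):
        out = []
        workers = [threading.Thread(target=physics_job, args=(front, out)),
                   threading.Thread(target=ai_job, args=(front, decisions))]
        for w in workers:
            w.start()
        for w in workers:
            w.join()          # sync point: both are done with frame n
        front = out[0]        # publish frame n+1 for the next iteration
    return front, decisions

final, decisions = run_frames(WorldSnapshot(0, (0.0, 5.0)), frames=3)
print(final.frame, decisions)
```

No locks are needed because nothing is ever modified in place, but the costs are exactly the ones described above: an extra copy of the world, and an AI that always reasons about the previous frame.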

Quote:
Original post by etsuja
I think what I'll do is try to have the physics and maybe some AI running on separate threads.


These are the hardest things to separate into threads, as your rendering will be constantly asking for position data that the physics code will be updating, and so will the AI.

I don't think I personally could come up with a multithreaded solution for physics, AI and rendering that would be more efficient than running them all in the same thread...

I'm currently working on a game as well (surprise surprise :) ) and the same question about multithreading came to my mind too.
I can see from your posts (and I get the point why) that it is highly undesirable to have, but wouldn't network connections be a suitable application for threading?

Quote:
Original post by ChristianJames
but wouldn't network connections be a suitable application for threading?

Usually no. Ask yourself what the network communication affects. Game logic, right? So you need to poll for incoming network data, add it to whatever else is going on in the game world, then send out your own changes for each cycle of game logic (and by this I mean things like updating a player's position, checking for collisions, etc). Asynchronous I/O should do everything you need.
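As a rough illustration of polling from the game loop without a network thread, here is a sketch that uses a zero-timeout `select` as a simple stand-in for asynchronous I/O. `poll_network` and `max_chunks` are hypothetical names, and the socket is assumed to be already connected.

```python
import select
import socket

def poll_network(sock, max_chunks=32):
    """Drain whatever has already arrived without ever blocking the game loop."""
    chunks = []
    while len(chunks) < max_chunks:
        readable, _, _ = select.select([sock], [], [], 0)  # timeout 0: just check
        if not readable:
            break
        data = sock.recv(4096)
        if not data:            # empty read means the peer closed the connection
            break
        chunks.append(data)
    return chunks

# Intended use, once per cycle of game logic:
#   for chunk in poll_network(sock):
#       apply_to_game_world(chunk)   # hypothetical game-side handler
```

Incoming data is then applied at a known point in the loop, so the game logic never sees the world change underneath it.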

I don't mean to come across as a dick but get out of the 90's people. :) There are plenty of standard techniques to solve the problems you're all talking about. All software is on the road to multi-threading. Don't fear it, embrace it. Hell people are now using the spare cycles on the GPU to do physics calculations! Do some reading, play around with things. You don't just need a bunch of confusing mutexes everywhere anymore. There's data buffering, message passing, etc.

Now that being said, if you're just starting out learning how to code or how to create a game engine then by all means go single threaded. There's enough there to learn by itself. But multi-threading really isn't that much of a pain. And you're going to need it to compete. There's a reason why most chips coming out nowadays are multi-core. If you don't follow the trend then your ST apps will always be slower than someone who does MT.

I would say that the engine and the UI should be in different threads. Music probably too. I would separate all game code that does not need to be 1 to 1 synchronous with each other (by which I mean they do not mutually affect one another on a cycle to cycle basis).

For instance, most real humans are going to take a few game cycles just to click a mouse button. Does it really matter that your UI might not be 100% synced up with the game engine? By separating the threads, you can keep the UI responsive when the engine goes belly up doing some intense calculations (never underestimate the feeling of safety a properly responsive mouse pointer gives. Microsoft learned this a while ago).

Likewise with music. If the music is a half second off from the action, it's hardly going to matter in 99% of cases (music videos being an obvious exception). But it will be noticeable if the music stutters or hangs during a computationally heavy part of the game.

In general, if something needs to seem instantly responsive to the player and doesn't need to be instantly responsive to the engine, I'd put it in a different thread.

That's just my 2 cents, take it as you will.

Guest Anonymous Poster
AI can often be separated as well; it generally doesn't need up-to-date information anyway.
Physics can be multithreaded (using a physics engine that takes advantage of it is probably preferable, though, since it's not trivial).
For networking it's common to use a separate thread, but generally only for the accept call.

It is also possible to split calculations within the game loop over multiple threads.

For example:

             /---------physics----\
mainthread---                      ---------mainthread
             \---------physics----/

Basically you do a normal serial game loop but split to 2 or more worker threads for physics or AI or whatever. Each of those grabs a small portion of the work, performs it and grabs another chunk. Once both are finished you proceed with the loop normally.

Just because the renderer relies on the physics doesn't mean that some physics calculations can't be done in parallel with other physics calculations.
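The split-and-rejoin loop described above could be sketched like this, using a standard thread pool. The body representation (a position/velocity pair), the chunk size and the function names are assumptions for illustration only. In CPython the global interpreter lock means pure-Python math will not actually run faster this way, so treat it as a picture of the structure rather than of the speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def integrate(body, dt):
    position, velocity = body
    return (position + velocity * dt, velocity)

def physics_chunk(bodies, dt):
    return [integrate(body, dt) for body in bodies]

def parallel_physics(pool, bodies, dt, chunk_size=2):
    chunks = [bodies[i:i + chunk_size] for i in range(0, len(bodies), chunk_size)]
    # map() hands chunks to idle workers and yields results in submission order;
    # consuming the results is the join point before the loop carries on.
    results = pool.map(physics_chunk, chunks, [dt] * len(chunks))
    return [body for chunk in results for body in chunk]

with ThreadPoolExecutor(max_workers=2) as pool:
    bodies = [(0.0, 1.0), (10.0, -2.0), (5.0, 0.5)]
    bodies = parallel_physics(pool, bodies, dt=0.5)   # fork, then join
    print(bodies)                                     # serial part of the loop resumes here
```

Each chunk only touches its own bodies, which is what makes this safe without locks; bodies that interact with each other would need a more careful split.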

It is important to note that multiple threads will ALWAYS reduce performance if you run them on a single core (due to overhead related to managing the threads, distributing the workload, and synchronizing memory access). However, properly optimized, you can get 40-50% better performance even in a game.

Quote:
Original post by Mike Bossy
If you don't follow the trend then your ST apps will always be slower than someone who does MT.

Uh, this is an era where a 2GHz processor is considered low-end, and the bottleneck is almost always the graphics hardware. Again, you're talking about a solution in search of a problem. What's the problem you're trying to solve with multithreading? The fact that it's cool or "the trend" is a pretty weak reason. I've written plenty of multithreaded apps, not to mention a microkernel OS. None of them have been games.

Quote:
Original post by Anonymous Poster
it is important to note that multiple threads will ALWAYS reduce performance if you run them on a single core. (due to overhead related to managing the threads, distributing the workload, and synchronizing memory access), however properly optimized you can get 40-50% better performance even in a game.


Well, this could be pretty easily solved by having a toggle option for multithreading.

Quote:
Original post by drakostar
Uh, this is an era where a 2GHz processor is considered low-end, and the bottleneck is almost always the graphics hardware. Again, you're talking about a solution in search of a problem. What's the problem you're trying to solve with multithreading? The fact that it's cool or "the trend" is a pretty weak reason. I've written plenty of multithreaded apps, not to mention a microkernel OS. None of them have been games.



What's the slowest dual-core CPU on the market (selling at Newegg, anyway)? The slowest Intel I could find was the Intel Pentium D 805 and the slowest AMD was the Athlon 64 X2 3800+. I'm not sure how the actual clock speeds compare, not sure how AMD rates their X2s, but I'm guessing the Intel is what it says at 2.6GHz. If a single-core CPU with the same clock speed as these can run the game at a minimum of 30fps then I might ditch the idea of multithreading. 2.6GHz is a pretty hefty minimum spec for a game, but by the time I finish the game it gives people a lot of time to upgrade. Only problem with seeing how it runs on these clock speeds is that I'd have to have the game completed :/

One more thing... I did ask for people that knew about multithreading for games instead of other apps.

Anyone know how the clock speeds work on X2s? Are they actually what they say they are now or would they be considered faster?

It's been a pain in the ass trying to find out the X2's actual speed compared to Intel's. As far as I could find, it looks like the X2 does 2 instructions per cycle while the Pentium D only does 1, but I'm not completely sure.

Quote:
Original post by etsuja
As far as I could find, it looks like the X2 does 2 instructions per cycle while the Pentium D only does 1, but I'm not completely sure.

The reality of the matter is far more subtle and complex than that. I need to head off to bed, but I'll be happy to explain in detail tomorrow. Alternatively, if someone else wants to jump in, that's fine too.

I'd be happy to hear from someone who knows a lot about the internal workings of a CPU and can explain the differences between some of the same-era Intel and AMD CPUs.

Quote:
Original post by drakostar
Uh, this is an era where a 2GHz processor is considered low-end, and the bottleneck is almost always the graphics hardware. Again, you're talking about a solution in search of a problem. What's the problem you're trying to solve with multithreading? The fact that it's cool or "the trend" is a pretty weak reason. I've written plenty of multithreaded apps, not to mention a microkernel OS. None of them have been games.


So you think that the 360 and PS3 have more than one core because it's cool? Graphics are just one part of the equation in any serious game. Give me a 2GHz processor for Physics calculations alone and I'll peg the thing. Let alone AI that's more than a simple state machine, Networking that does a good job at handling lag, or how about some DSP so I can have real time voice recognition going on to control some of the action. Show me any RTS and I'll show you enough pathfinding cycles to make your head spin. Games aren't just a couple of fancy pixels bouncing around any more. At least not if you want them to sell. CPU bound is just as common as GPU bound if not more so. The only difference is that instead of having the ability to "turn down the AI" or "turn down the physics" like most games have for graphics they just make the cuts across the board before shipping. After all no one wants to have a slider in the options menu that says make my CPU opponents dumber because my computer sucks.

If you've written plenty of multithreaded apps then you should know that approaching things from a multithreaded angle can actually help simplify your design as well. Being forced into thinking of systems as standalone entities makes OOP design an actual practice instead of a hope.

Guest Anonymous Poster
Quote:
Original post by etsuja
Quote:
Original post by Anonymous Poster
it is important to note that multiple threads will ALWAYS reduce performance if you run them on a single core. (due to overhead related to managing the threads, distributing the workload, and synchronizing memory access), however properly optimized you can get 40-50% better performance even in a game.


Well, this could be pretty easily solved by having a toggle option for multithreading.

Quote:
Original post by drakostar
Uh, this is an era where a 2GHz processor is considered low-end, and the bottleneck is almost always the graphics hardware. Again, you're talking about a solution in search of a problem. What's the problem you're trying to solve with multithreading? The fact that it's cool or "the trend" is a pretty weak reason. I've written plenty of multithreaded apps, not to mention a microkernel OS. None of them have been games.



What's the slowest dual-core CPU on the market (selling at Newegg, anyway)? The slowest Intel I could find was the Intel Pentium D 805 and the slowest AMD was the Athlon 64 X2 3800+. I'm not sure how the actual clock speeds compare, not sure how AMD rates their X2s, but I'm guessing the Intel is what it says at 2.6GHz. If a single-core CPU with the same clock speed as these can run the game at a minimum of 30fps then I might ditch the idea of multithreading. 2.6GHz is a pretty hefty minimum spec for a game, but by the time I finish the game it gives people a lot of time to upgrade. Only problem with seeing how it runs on these clock speeds is that I'd have to have the game completed :/

One more thing... I did ask for people that knew about multithreading for games instead of other apps.

Anyone know how the clock speeds work on X2s? Are they actually what they say they are now or would they be considered faster?




If I remember correctly, the Athlon 64 X2 3800+ is basically two Athlon 64 3200+ cores on one die.

It should be about equal to a 2.8-3.2GHz Pentium D. AMD basically adds ~20% to their "rating" if it's a dual-core CPU. It's a pretty fair rating for average computer usage: an X2 3800+ and a single-core 3800+ are roughly equal for the average user (and have roughly the same price). If you do a lot of multitasking or have software that is heavily optimized for SMP, the dual-core versions are quite a bit better, though for non-multithreaded games the single-core versions are better.


AMD's slowest dual-core CPU beats the crap out of most Pentium Ds even with its lower clock speed.

Intel's Core 2 Duo is even better, even though it only runs at 2.0-2.2GHz.

The clock frequency has very little to do with performance when you compare different architectures, especially if you compare with the P4, which has an extremely poor performance/cycle ratio (even the P3 did more work each cycle than the P4).

Quote:
Original post by OrangyTang
[1] I.e. you know your target hardware will have multiple processors and you're already CPU limited.

In future, it's going to become more common. In a few years, dual (or even quad) core chips will become the norm in PCs.
PS3 developers already have to learn how to use 8 cores, though I'm sure if you ask one about how easy that is they will agree that it is a bug-fueled headache...

Guest Anonymous Poster
Quote:
Original post by etsuja
It's been a pain in the ass trying to find out the X2's actual speed compared to Intel's. As far as I could find, it looks like the X2 does 2 instructions per cycle while the Pentium D only does 1, but I'm not completely sure.


Not quite true.
The AMD64 and AMD Opteron can handle 3 x86 instructions per cycle;
the P4 can handle 3 uOps per cycle, and one x86 instruction may or may not require multiple uOps.

The P4 Prescott may be able to handle 4 uOps (not sure).

This means that for simple instructions (addition, subtraction) the P4 can do as much work as the AMD64 in a single cycle, while for more complex ones such as multiplication and division it falls behind.

The AMD64 can handle the x87 instructions FADD and FMUL in 4 cycles each, while the P4 needs 5 cycles for FADD and 7 cycles for FMUL (the Prescott may need fewer).

There are a lot of other differences as well, and the P4 is faster in some areas.

There are also issues such as memory access, cache size and speed, inter-CPU communication, etc.

AMD's HyperTransport technology helps quite a lot as well.

The AMD64 also has a very fast schedule-execute loop for integers, reducing the cost of a cache miss.

Theoretical performance and actual performance don't always match either; the best way to get a decent picture of real-world performance is to look at relevant real-world benchmarks.

Guest Anonymous Poster

Multithreading does make sense in the following parts of a game engine:

1. Input (mouse) - this lets the mouse run smoothly even at low framerates
2. Input (keyboard) - so no key presses are missed (already done in the API)
3. Sound (playing) - to have seamless sound at low framerates (done in all sound APIs)
4. Networking - so no received packets are missed (done in almost every network API)
5. Precaching/unloading sections of the map


Threads for rendering/physics/AI:
Whether you can do this strongly depends on your engine's design.
Many engines do the following in a scenegraph:
1. calc the reacting AI (guys do what they planned to do)
2. react to input (calc new player position, etc.)
3. calc the physics (if the player runs through a wall, reset him, etc.)
4. build a render tree (for sorting by renderstates, thus minimizing state changes)
5. render
6. calc planning AI (what's the new player position, what to do next)

Important things are:
- AI shouldn't be done while calculating physics
- rendering cannot be done while physics is reading positions etc.
- in general: if one thing uses the scenegraph, others cannot access it.


But there are some things you can do at the same time:

You can separate the rendering (even on single-CPU systems):
When you go through your scenegraph and calc your visible set of things, just throw them in a list. Then start a thread that takes this list, sorts it by renderstates and renders everything. Both threads (the original and the rendering one) have to "meet" again before the new visible-items list is passed. E.g. in DirectX the draw calls have to wait until the graphics card says "ok, done" and then return, doing nothing in between, so even on a single-CPU system this threading can improve performance. But since OpenGL's render calls don't have that "overhead", the speed gain is much less to none.
Remember: the renderer doesn't only need to know about the visible things but also about objects that are between your view frustum and a light source, thus throwing visible shadows...

Separating the AI (on a single CPU there's no gain, but only a small speed loss):
You can do this similarly to the rendering separation.
For each AI guy, throw all relevant things into a list.
For each AI, sort this list (calc importance of possible actions etc.) and calc the AI.
You can start with this right after you have calculated the actual physics.
"Relevant things" are (e.g.):
player position (e.g. the waypoint node)
waypoints (attention: if your AI uses waypoints, there is no need to know about level geometry!)
This thread has to meet the main thread when the main thread wants to see what your AI guys want to do next.


Some others try to separate more things into threads, but I've never seen a speed gain in those implementations. Mostly the code for keeping threads in sync slows everything down, since the mutexes cause the threads to wait and do nothing (and that adds up). In the concepts above, the threads are easy to keep in sync, since there are fixed points where they have to meet the other threads. Debugging isn't that hard either, since every thread only does a fixed number of things (make use of log files for each thread to find errors!).

Ok, then I guess I'll go by the closest performance of the processors running single-threaded apps. For AMD the single-core processor that comes closest to performing the same as the X2 3800+ would be the Athlon 64 3400+ ClawHammer. The dual-core Intel chip that has the smallest performance difference is the Pentium D 960, and the closest performing Intel single core is the Pentium 4 660. So if I can get a minimum of 30 fps on those single cores then I'll drop multithreading. All this is from Tom's Hardware's CPU chart for Quake IV. The 3800+ got 77 fps while the 3400+ got the same. The Pentium D 960 got 74 fps and the Pentium 4 660 got 75 fps. This was all using an X1900 XT though. I don't know, but I'm guessing any gamer will have something at the same level as the X1900 XT when my game gets finished, and they'd most likely have better CPUs too. So if I'm making my game to be along the lines of Quake IV graphics-wise I should get quite a bit more than 30 fps. So overall I guess multithreading wouldn't really make a noticeable difference. So maybe I will ditch multithreading.

I'm starting to believe this "dual core is all a marketing ploy" thing. At least for gamers anyway. Sucks for me... I just ordered an X2 4400+. I could have got an FX-55 for only 4 more dollars :( Oh well though, it's only a 7 fps difference and they're both above 60 fps :) Though I did see a benchmark of Quake IV where there was a 23 fps gain with SMP turned on. So there is some use for multithreading in games. Imagine if the FPS was 10 with multithreading turned off for a game. Anyway, what I'm getting at is multithreading isn't viable for what I'm working on. Although I do plan on having streaming content.

[Edited by - etsuja on September 22, 2006 2:36:38 AM]

There is of course more than one way to crack a nut. When the PS3 came out, I came up with an engine design that would take advantage of the PS3's multiple cores. My design idea has been mentioned in some of Sony's technical conferences so it's got some merit. It is only an idea; although I've been thinking about it quite a bit, I've not had the opportunity to try it out (bills do need to be paid!). It's a highly scalable idea that would work on one, two, four or more processors without having to change the code. Basically it's this:

Rather than think about a problem as a linear set of steps, think about the problem as a set of discrete chunks. Each chunk of work is placed in a FIFO queue. Whenever a CPU core has finished doing a chunk of work, it takes the first item in the FIFO queue and starts to execute it.

There is a lot of extra stuff to this: having chunks dependent on other chunks, locking memory, waiting on memory, having chunks create other chunks, etc...
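A very small sketch of the chunk-queue idea, assuming worker threads stand in for CPU cores. `JobSystem` and the toy jobs are hypothetical, and the harder parts listed above (dependencies between chunks, memory locking) are deliberately left out; only "chunks create other chunks" is shown.

```python
import queue
import threading

class JobSystem:
    def __init__(self, workers):
        self._jobs = queue.Queue()             # FIFO by default
        for _ in range(workers):               # same code for 1, 2, 4... workers
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, fn, *args):
        self._jobs.put((fn, args))

    def wait_idle(self):
        # Returns once every submitted chunk, including chunks created by
        # other chunks, has been marked done.
        self._jobs.join()

    def _worker(self):
        while True:
            fn, args = self._jobs.get()        # take the first chunk in the queue
            try:
                fn(*args)
            finally:
                self._jobs.task_done()

results = []
results_lock = threading.Lock()

def square(n):
    with results_lock:
        results.append(n * n)

def spawn_squares(jobs, count):
    # A chunk that creates further chunks before it finishes.
    for n in range(count):
        jobs.submit(square, n)

jobs = JobSystem(workers=4)
jobs.submit(spawn_squares, jobs, 5)
jobs.wait_idle()
print(sorted(results))
```

Changing `workers` is the only thing that varies between a one-core and a many-core run, which is the scalability property the design is after.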

Skizz
