Cost of creating threads

1,173

Author

October 21, 2011 01:32 PM

I'm trying to decide between a couple different threading architectures for my game engine. I want to get the threading integrated right from the start so that I can build everything in a threaded environment, and not have major headaches later when I try to add threads to a single threaded engine.

My first idea is just to use threads everywhere. Let's say I have an init function in some class that, in single-threaded form, would look like this (pseudocode):

Init()
{
    load file from disk or zip file
    process file
    do unrelated variable initialization
    do creation of scene nodes
}

The idea would be to create threads as they are needed, for example:

Init()
{
    Thread(load file from disk)
    initialize the unrelated variables here
    Thread(initialize the scene nodes)
    WaitForThread(load file from disk)
    now process the loaded data
}

The thing about that one is, I don't know if creating threads all the time would incur too much overhead. The other idea I had was to have a task manager. It would create as many threads as there are processor cores and update an equal number of tasks on each core. So my previous code example would just replace the threads with processes, which may or may not be updated on a separate core, depending on how many processes are running at once.
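For reference, the first idea (one thread per job) might look roughly like this in C++11. This is only a sketch: `loadFile`, `createSceneNodes`, and `InitResult` are hypothetical stand-ins for the engine's real work, and it assumes `std::thread` is available.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the engine's real work.
std::string loadFile(const std::string& path) { return "contents of " + path; }
std::vector<int> createSceneNodes() { return {1, 2, 3}; }

struct InitResult {
    std::string fileData;
    std::vector<int> sceneNodes;
    int unrelated = 0;
};

InitResult init(const std::string& path)
{
    assert(!path.empty());
    InitResult r;

    // Each job writes to its own field, so no locking is needed here.
    std::thread loader([&r, &path] { r.fileData = loadFile(path); });
    std::thread nodes([&r] { r.sceneNodes = createSceneNodes(); });

    // Unrelated initialization runs on the calling thread in the meantime.
    r.unrelated = 42;

    // WaitForThread(...): join before touching the results.
    loader.join();
    nodes.join();

    // Now process the loaded data.
    r.fileData += " (processed)";
    return r;
}
```

Calling `init("level.dat")` creates and destroys two threads per call, which is exactly the cost in question.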

Which one sounds better?

http://www.kongregate.com/games/3DModelerMan/replicator#tipjar

samoth

9,833

October 21, 2011 02:03 PM

Creating threads does have measurable overhead, but it is not quite as bad as people often believe (this varies from platform to platform, but in general it's moderately cheap).
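If you want numbers for your own platform, a rough micro-benchmark is easy to write. The sketch below assumes C++11 `std::thread` and `std::chrono`; the function name `averageSpawnCostMicros` is made up for illustration, and the result will vary with OS, load, and compiler settings.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Average wall-clock cost, in microseconds, of creating and joining
// one thread that does no work.
double averageSpawnCostMicros(int iterations)
{
    assert(iterations > 0);
    using Clock = std::chrono::steady_clock;

    const auto start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::thread t([] {});  // empty job: we only measure create + join
        t.join();
    }
    const std::chrono::duration<double, std::micro> elapsed = Clock::now() - start;

    return elapsed.count() / iterations;
}
```

Comparing that figure with the run time of a typical task tells you whether spawning per task is affordable.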

There are other good reasons why you would not want to create threads on demand like this, however. You absolutely want precise, reliable control over how many threads are running and how many threads do what.

Imagine spawning a thread every time you want to load something from disk. This is great because it does not block the main thread while the hard disk loads and while the thread decompresses the data and so on.
And now imagine how the drive's head is feeling while you have 20 threads competing for hard disk time. The OS will be able to schedule some of that, but it will still end in a LOT of seeking and thus lost performance (and in wearing out your poor hard drive).

Something similar can be said about CPU cores, context switches, and caches (or TLBs). Ideally you want to have N threads running on an N-core machine, no more and no less. Fewer means you give away computing power that you could use, and more means you have context switches and cache effects.
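One way to get N at run time in C++11 is `std::thread::hardware_concurrency()`. A minimal sketch, assuming a fallback of 2 when the value is unknown (the name `workerCount` and the fallback value are assumptions, not something from this thread):

```cpp
#include <cassert>
#include <thread>

// Number of worker threads to create: one per hardware thread if the
// runtime can tell us, otherwise the caller-supplied fallback.
unsigned workerCount(unsigned fallback = 2)
{
    assert(fallback > 0);
    // hardware_concurrency() is only a hint and may return 0 if unknown.
    const unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}
```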

And the winner is: IO completion ports (if you're under Windows). A strategy that works very well is to create "some number" (10 or so) threads and block them all on an IOCP. Then give them work to do by posting on the completion port. The OS makes sure that N threads are running (waking another if one blocks in IO, for example), and keeps caches warm by reusing threads LIFO.

Unfortunately, there is no such thing under Linux.

Kobo

128

October 21, 2011 04:58 PM

You should look into threadpools. You make a bunch of threads ahead of time and just assign ones that aren't busy to tasks when they're needed and let them sit idle the rest of the time.

DracoLacertae

518

October 21, 2011 06:18 PM

[quote name='Kobo']
You should look into threadpools. You make a bunch of threads ahead of time and just assign ones that aren't busy to tasks when they're needed and let them sit idle the rest of the time.
[/quote]

I second the suggestion of threadpools. In other (non-game) projects, I've found them extremely useful. The threads can share a common queue of tasks. In more complex implementations you can prioritize tasks, so it's not strictly FIFO. You can make this really complicated or really simple. I have a simple threadpool that is basically FIFO: all new tasks go at the end of the queue, unless a task is flagged as 'urgent', in which case it cuts in at the front. I can see a problem if many 'urgent' tasks arrive at once (the 'new' urgent ones will trump the 'old' urgent ones), but if the thread pool is so bogged down with urgent tasks that it can never get to the mundane ones, I have a bigger problem to look at.
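A pool along those lines can be sketched in standard C++11. This is not the pool described above, just a minimal illustration under stated assumptions: the class name `SimpleThreadPool` and its `post` method are invented here, tasks are plain `std::function<void()>`, and urgent tasks are pushed to the front of a `std::deque`, so newer urgent tasks jump ahead of older ones.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class SimpleThreadPool {
public:
    explicit SimpleThreadPool(unsigned threadCount)
    {
        assert(threadCount > 0);
        for (unsigned i = 0; i < threadCount; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Lets the workers drain the queue, then joins them.
    ~SimpleThreadPool()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& w : workers_) w.join();
    }

    SimpleThreadPool(const SimpleThreadPool&) = delete;
    SimpleThreadPool& operator=(const SimpleThreadPool&) = delete;

    // Normal tasks go to the back (FIFO); urgent tasks cut in at the front.
    void post(std::function<void()> task, bool urgent = false)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (urgent) tasks_.push_front(std::move(task));
            else        tasks_.push_back(std::move(task));
        }
        wake_.notify_one();
    }

private:
    void run()
    {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

Usage would be something like `pool.post([] { /* work */ });` for ordinary work and `pool.post(job, true);` for an urgent job. A real pool would also want a way to wait on individual tasks and to handle exceptions thrown by them.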

3DModelerMan

1,173

Author

October 21, 2011 08:19 PM

That sounds good. I think I'll go with thread pools. Is boost::thread going into C++0x? If I just typedef boost::thread as std::thread, then I should be fine and have no problems when C++0x comes out, right?

c_olin

197

October 21, 2011 08:32 PM

As an added bonus, lambda functions work great for implementing thread pools:

[source lang="cpp"]
// Queue some parallelizable calls.
// (ThreadPool and ThreadResult are custom classes, not part of the standard library.)
ThreadResult<> physicsResult = ThreadPool::queue([this, &deltaTime]() { this->physics.update(deltaTime); });
ThreadResult<> sceneResult = ThreadPool::queue([this, &deltaTime]() { this->scene.update(deltaTime); });

// Do some work in main thread while we wait.
logic.update(deltaTime);

// Wait for parallel calls to return.
physicsResult.waitUntilDone();
sceneResult.waitUntilDone();
[/source]

Notice how I can easily add arguments from the current scope to the threaded calls. Also return values can be handy:

[source lang="cpp"]
ThreadResult<int> returnResult = ThreadPool::queue<int>([&]() { return thisReturnsAnInt(); });

// Do stuff...

int result = returnResult.waitForResult();
[/source]
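The implementation of `ThreadPool` and `ThreadResult` isn't shown here, but the C++11 standard library has a roughly comparable pattern in `std::async` and `std::future`. A minimal sketch, with the caveat that `std::async` does not necessarily use a thread pool; `thisReturnsAnInt` is given a placeholder body and `computeInBackground` is a made-up name:

```cpp
#include <cassert>
#include <future>

// Placeholder body for the function named in the snippet above.
int thisReturnsAnInt() { return 7; }

int computeInBackground()
{
    // std::launch::async asks for the call to run on another thread.
    std::future<int> pending =
        std::async(std::launch::async, [] { return thisReturnsAnInt(); });

    // Do stuff on the calling thread...

    assert(pending.valid());
    return pending.get();  // blocks until the result is ready
}
```

A pool-based version could wrap each task in a `std::packaged_task` and hand its `std::future` back to the caller, which gives the same wait-for-result interface.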

3DModelerMan

1,173

Author

October 22, 2011 02:44 AM

Thanks for the responses. I know what I'm going to do now (thread pools). I just need to read up on them and lambda functions. Thanks.
