Multithreading considerations

Started by
10 comments, last by VanillaSnake21 15 years, 3 months ago
Hi, I'm trying to multithread my small game framework, but I want the framework to function well on single-core/non-HT CPUs as well. So my first question is: should I disable multithreading if the user does not have a multicore/HT processor? And let's say the user does have a multicore + HT processor: how do I determine the number of threads that will give optimal performance? Is it even possible to determine the number of threads at runtime, or is this usually hardcoded into the application? My last question is a little more specific: if all of the above is true, how can I determine how many cores the processor has, whether it supports HT, etc.? Are there any libraries/API functions that do that? Thanks

You didn't come into this world. You came out of it, like a wave from the ocean. You are not a stranger here. -Alan Watts

Very likely, "trying to multithread my small game framework" will get you nothing but pain.

Threads are not something you can pour into your soup like that; you have to plan for threads from the beginning. Threads also aren't magic; at best they are bad mojo. They don't magically make everything better and faster, but they can very easily make your program crash where it worked perfectly before. Threads are great when used with prudence, but only then.

Getting the number of processors is something you have to query from the operating system. Under Linux you can get it via sysconf(_SC_NPROCESSORS_ONLN); under Windows you can use either GetSystemInfo or GetLogicalProcessorInformation. The latter is only available from Windows XP SP3 onward.
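As a rough sketch of the queries mentioned above, in C++: the function name logical_processor_count is made up for illustration, and the std::thread::hardware_concurrency fallback is a C++11 addition that is not mentioned in this thread. It reports logical processors, so a hyperthreaded core counts twice.

```cpp
#include <thread>

#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#endif

// Hypothetical helper: number of logical processors, never less than 1.
unsigned logical_processor_count() {
#if defined(_WIN32)
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    unsigned n = static_cast<unsigned>(info.dwNumberOfProcessors);
#else
    // Processors currently online, or -1 if the query is unsupported.
    long online = sysconf(_SC_NPROCESSORS_ONLN);
    unsigned n = online > 0 ? static_cast<unsigned>(online) : 0;
#endif
    // Fallback (assumption: a C++11 standard library is available).
    if (n == 0) n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
}
```

Treating "unknown" as 1 means the program quietly degrades to a single worker instead of failing.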

You will usually not want to hardcode a number of threads, except if you know that they'll block anyway. More threads fighting over CPUs means more context switches, and that means less performance. Too few threads means cores being unused, which again means less performance.

Before you even think about adding threads, make sure you spend at least a week reading about synchronisation, or you will be very unhappy.
Also, do not make the mistake of trying to multithread your OpenGL or DirectX calls. Again, this will make your life very unhappy.
I think your best bet is a Job Queue/Worker Thread setup. This should work well on a single-core machine (1 worker thread), and scales up easily on multi-core machines (1 worker thread per core). With multiple cores you can probably get away with a few more worker threads than cores.

Implementing it will get tricky; adding threads to an existing program is just asking for trouble.
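To make the idea concrete, here is a minimal sketch of a job queue with worker threads, assuming C++11 threads. The class name JobQueue and its members are invented for this example; it has no priorities and no way to wait for an individual job.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical job queue: N workers pull std::function jobs from one queue.
class JobQueue {
public:
    explicit JobQueue(unsigned worker_count) {
        if (worker_count == 0) worker_count = 1;  // single-core fallback
        for (unsigned i = 0; i < worker_count; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Drains the jobs that are still queued, then joins all workers.
    ~JobQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& worker : workers_) worker.join();
    }

    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can proceed
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

Usage would look something like JobQueue queue(core_count); queue.submit([] { /* play a sound */ }); where core_count comes from whatever detection you use.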

Regards
elFarto

Quote:Original post by VanillaSnake21
how can I determine how many cores the processor supports or if it supports HT etc.

In Java you can call Runtime.getRuntime().availableProcessors().
Actually I planned multithreading from the beginning. The reason I didn't write my framework with it in mind was that I wanted to understand the difficulties of transforming a non-multithreaded app into a multithreaded one.
@elFarto that sounds interesting, any sources that show how to implement this?
@samoth I've read a section on multithreading in an OS book and also in The Art of C++, and it doesn't look too complicated: call getmutex() ... releasemutex() around critical sections. I deliberately didn't expand my framework, to keep the multithreading simple, so I think it would be as trivial as rewriting a couple of functions. Unless I'm missing something. By the way, here's a basic layout of my framework:

I make a class that has:
Initilize()
Logic()
Render()
Shutdown()

functions;

I pass it to another class via SetApplication(GameClass* class);
that main class calls these functions in a loop.

Very simple and straightforward. The way I planned to thread it was making Logic() execute while Render() is executing. Is this a valid approach?

EDIT: BTW what about using CPUID asm instruction to get the processor config?


Here is a good place to start.

I personally would divide up the work as finely as possible: play a sound, process a message, parse a network packet, render a frame. Physics updates are a tricky one. Ideally you'll want to split up each object into an individual job, but you'll start running into race conditions unless you double-buffer each object's state.

You'll probably also want to implement job priorities. HIGH, NORMAL, LOW should be enough. Obviously HIGH for audio and rendering, LOW for clean-up operations etc, NORMAL for everything else.
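A small sketch of how the three priority levels could be ordered, assuming a std::priority_queue with a sequence number to keep first-in-first-out order within a level. Job, JobOrder and PriorityJobList are names made up for this example, and the locking needed to share the list between threads is left out.

```cpp
#include <cstdint>
#include <queue>
#include <string>
#include <utility>
#include <vector>

enum class Priority { Low = 0, Normal = 1, High = 2 };

// Hypothetical job record; a real one would carry a callable, not a name.
struct Job {
    Priority priority;
    std::uint64_t sequence;  // submission order, breaks ties within a level
    std::string name;
};

// Returns true when a should run after b.
struct JobOrder {
    bool operator()(const Job& a, const Job& b) const {
        if (a.priority != b.priority) return a.priority < b.priority;
        return a.sequence > b.sequence;  // earlier submission runs first
    }
};

class PriorityJobList {
public:
    void push(Priority priority, std::string name) {
        jobs_.push(Job{priority, next_sequence_++, std::move(name)});
    }

    bool empty() const { return jobs_.empty(); }

    // Removes and returns the next job to run; the list must not be empty.
    Job pop() {
        Job next = jobs_.top();
        jobs_.pop();
        return next;
    }

private:
    std::priority_queue<Job, std::vector<Job>, JobOrder> jobs_;
    std::uint64_t next_sequence_ = 0;
};
```

One thing to watch with strict priorities is starvation: if HIGH jobs keep arriving, LOW jobs never run.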

Mutexes aren't the only method for controlling concurrency. Atomic operations are another very good method. You should also look at read-write locks.
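For the atomic case, a minimal sketch assuming C++11's std::atomic (the function name count_events is invented here). Several threads bump one counter without any mutex:

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical example: each thread records events_per_thread events
// in a shared counter using an atomic increment instead of a lock.
int count_events(unsigned thread_count, int events_per_thread) {
    std::atomic<int> total{0};
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < thread_count; ++t) {
        threads.emplace_back([&total, events_per_thread] {
            for (int i = 0; i < events_per_thread; ++i)
                total.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (std::thread& thread : threads) thread.join();
    return total.load();
}
```

With a plain int and ++ instead of fetch_add this would be a data race and could lose increments. Atomics only cover single values like this; anything involving several fields that must change together still needs a lock or a different design.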

Regards
elFarto
Locking has to be done carefully. If done incorrectly, you can get crashes, deadlocks etc. Even done correctly, locking can cause trouble. Locking can make code behave in a serial fashion, which tends to perform worse than the equivalent single threaded code (due to context switching and locking overhead).
Quote:Original post by VanillaSnake21

Unless I'm missing something. By the way here's a basic layout of my framework:

I make a class that has:
Initilize()
Logic()
Render()
Shutdown()


Except that those are sequential operations. Render cannot occur before Logic completes.

Quote:Very simple and straightforward. The way I planned to thread it was making Logic() execute while Render() is executing. Is this a valid approach?
Yes, but there will be no concurrency.

Since Render needs to wait on Logic, there will only ever be one active thread; the others will be waiting.

And there's another problem. What if logic runs 10 times as fast as render? What if renderer takes only 5% of logic? Again, serial dependency here means that you need to stall one of them.

The "proper" way to multithread the above would be to distribute logic over n threads, so that each processes some subset of the data. In addition, logic is multi-buffered: a thread can read any value from the old state, but can only write into the section of the new state it was assigned.

When all subtasks complete, mark that state as the old state and render it. Meanwhile, the update will be working from this "old state" into a fresh "new state".
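A sketch of one such update step, assuming C++11 threads and a toy Entity with just a position and a velocity (Entity, update_range and update_all are names made up here). Each thread reads the old state freely but writes only its own slice of the new state, so no locks are needed:

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

struct Entity {
    float position;
    float velocity;
};

// Advances entities [begin, end) from old_state into new_state.
void update_range(const std::vector<Entity>& old_state,
                  std::vector<Entity>& new_state,
                  std::size_t begin, std::size_t end, float dt) {
    for (std::size_t i = begin; i < end; ++i) {
        new_state[i].velocity = old_state[i].velocity;
        new_state[i].position = old_state[i].position + old_state[i].velocity * dt;
    }
}

// Splits one T(n) -> T(n+1) step over n_threads contiguous slices.
void update_all(const std::vector<Entity>& old_state,
                std::vector<Entity>& new_state,
                unsigned n_threads, float dt) {
    if (n_threads == 0) n_threads = 1;
    const std::size_t count = old_state.size();
    new_state.resize(count);  // sized before any thread starts writing
    const std::size_t chunk = (count + n_threads - 1) / n_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        const std::size_t begin = std::min(count, t * chunk);
        const std::size_t end = std::min(count, begin + chunk);
        if (begin == end) break;  // fewer entities than threads
        workers.emplace_back([&old_state, &new_state, begin, end, dt] {
            update_range(old_state, new_state, begin, end, dt);
        });
    }
    for (std::thread& worker : workers) worker.join();
}
```

Spawning threads every frame is only for brevity; in practice the slices would be handed to long-lived workers. The code works for any n, including 1.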

The above is optimal since it maintains the sequential nature of Logic (updates from T(n) to T(n+1)), and the renderer can always run in the same thread whenever needed (it's usually not viable to render from an arbitrary thread, since rendering is an inherently sequential operation, and some APIs may require it to be performed from the thread that allocated the rendering resources).

The biggest advantage of the multi-buffered approach is that the renderer always renders the most recent complete (but old) state while logic continues to progress. No locks are required, only an atomic counter which determines when a state is complete and which one it is.
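One possible reading of the atomic counter, sketched with two buffers and C++11 atomics. World and BufferedWorld are hypothetical names. The sketch assumes the caller does not start the next update while the renderer may still be reading the buffer that is about to be overwritten; lifting that restriction would need a third buffer.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

struct World {
    std::vector<float> positions;
};

// Hypothetical double buffer: workers fill the hidden buffer, and the
// last subtask to finish publishes it as the latest complete state.
class BufferedWorld {
public:
    explicit BufferedWorld(std::size_t entity_count) {
        buffers_[0].positions.assign(entity_count, 0.0f);
        buffers_[1].positions.assign(entity_count, 0.0f);
    }

    // Most recent complete state; this is what the renderer reads.
    const World& latest() const {
        return buffers_[latest_.load(std::memory_order_acquire)];
    }

    // Call once, before handing out subtasks; returns the buffer to write.
    World& begin_update(int subtask_count) {
        remaining_.store(subtask_count, std::memory_order_relaxed);
        return buffers_[1 - latest_.load(std::memory_order_relaxed)];
    }

    // Each subtask calls this when its slice is written.
    // Returns true for the call that published the new state.
    bool finish_subtask() {
        if (remaining_.fetch_sub(1, std::memory_order_acq_rel) != 1)
            return false;  // other subtasks are still running
        latest_.store(1 - latest_.load(std::memory_order_relaxed),
                      std::memory_order_release);
        return true;
    }

private:
    World buffers_[2];
    std::atomic<int> latest_{0};     // index of the complete state
    std::atomic<int> remaining_{0};  // subtasks still running
};
```

The renderer only ever calls latest(), so it never sees a half-written state.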

This is by far the most trivial way to separate logic from rendering.

In the case of a single-threaded design, there will always be one logic update followed by one render pass.

Quote:EDIT: BTW what about using CPUID asm instruction to get the processor config?


This isn't the primary problem. Your framework will spawn n threads, where n might match the number of cores. But automatic detection is a non-issue: what matters is designing the application so that it depends on n (an integer), whatever it may be.
@Antheus & elFarto
Are you guys talking about the same approach, a thread pool? (I need a name for it to look up more info)
EDIT: If not, which one is better (Thread Pool vs Antheus' approach) in my scenario?

Quote:
This isn't primary problem. Your framework will spawn n threads, where n might match the number of cores. But automatic detection is a non-issue - what matters is to design application in such a way to depend on n (an integer number), whatever it may be.


So basically I don't need to worry if my multithreaded app will run on a machine that doesn't support MT or HT? So for example I let n = 4, and the client machine is dual core w/no HT, I should still leave it at 4 threads? Won't that degrade the performance?


Quote:Original post by VanillaSnake21
Actually I planned multithreading from beginning. The reason I didn't write my framework with it in mind, was because I wanted to undersand the difficulties of transforming a non-multithread app into a multithreaded one.
The difficulty is that it requires a new application design.

It's much better to write a multithreaded application and assume the majority of people are using a multicore CPU anyway. (Considering the cost of an E1400 or of AMD CPUs, there is no reason why they shouldn't be.)

Well-written multithreaded applications scale well on CPUs with few cores anyway, so why worry. The programmer just shouldn't abuse the CPU too much, and should use only what the application needs.

