Alpha_ProgDes

Making programs/threads that take advantage of Hyper-Threading or multiprocessor computers


**Moderator: if this belongs in the Lounge, please move it immediately. Thank you.** I'm a little clueless and curious about how a program is supposed to take advantage of a multiprocessor machine automatically. I always thought that if a program spawned more than one thread and a second processor was present, that processor would automatically take over one of the threads. Or that if two programs are running at the same time (which is the case 99% of the time), each processor would execute one of them. Could someone drop a little pseudocode or knowledge on how we as programmers are supposed to take (full) advantage of this technology, or how we as consumers should know when to buy such a thing? Thanks.

As far as I'm aware, Windows at least will automatically schedule threads to take advantage of whatever hardware is available, so all we have to do is use threads where they make sense and Windows will do the rest.

In most programs only one thread performs CPU-dependent operations, while other threads are used for networking, I/O, etc., purely for convenience and/or user experience. Naturally these programs won't benefit from hyperthreading. Only programs that deal with parallelizable problems can benefit from it. For instance, if you create two threads, one for physics and one for AI, and both do expensive calculations on the CPU and don't depend on each other, then you'll be taking advantage of hyperthreading. You don't need to call any special instructions. You just need two threads that both do heavy CPU calculations at the same time and don't have to wait on each other.
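To make the physics/AI layout concrete, here is a minimal Python sketch of two workers that share no data and are only joined at the end of a frame. The function names (`update_physics`, `update_ai`, `run_frame`) and the toy workloads are made up for illustration. One caveat: in standard CPython the global interpreter lock keeps pure-Python CPU-bound threads from running at the same instant, so this shows the structure only. The same two-thread layout written in C or C++ is what the OS would spread across two logical processors.

```python
import threading


def update_physics(positions, velocities, dt):
    """Advance every body by one time step (stand-in for real physics work)."""
    return [p + v * dt for p, v in zip(positions, velocities)]


def update_ai(threat_levels):
    """Return the index of the most threatening target (stand-in for real AI work)."""
    return max(range(len(threat_levels)), key=threat_levels.__getitem__)


def run_frame(positions, velocities, threat_levels, dt=0.1):
    results = {}

    # Each worker writes only to its own key, so neither waits on the other.
    def physics_job():
        results["physics"] = update_physics(positions, velocities, dt)

    def ai_job():
        results["ai"] = update_ai(threat_levels)

    workers = [threading.Thread(target=physics_job),
               threading.Thread(target=ai_job)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # the frame is finished only when both threads are done
    return results
```

Note that nothing here asks for a particular processor. The threads are simply started, and placement is left to the scheduler.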

Using multiple worker threads will slow the program down on single-CPU (or non-HT-enabled) machines. So if you're going to use them, make sure they can be turned off or disabled.

I think Quake3 uses it, not much else though.

Mark

Does Hyper-Threading work for all types of operations? That is, in my program I spend a lot of time running code in the FPU and/or SSE2/3DNow! instructions. Will these operations also see a speed increase?

Also, say that my program has a fairly sensible way to break down into two threads. How much of a performance gain can I expect to get? A rough range would be nice. If it isn't much, I just won't bother, but if it is significant, I might put the extra time in.

Thanks

Dwiel

If your code has two groups of tasks which don't need access to the same data and take roughly equal time to execute, you could hopefully get close to a 100% speedup.

But in a game, I think it's very unlikely that the two sets of tasks would be equal time or non-conflicting.

If you start doing significant amounts of locking and waiting it will quickly negate the advantage, or if one thread spends most of its time idle, the advantage will go as well.
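One standard way to put rough numbers on this is Amdahl's law, which is not named in the post but models the same effect: whatever fraction of the frame is serial (locking, waiting, one thread idling) caps the overall gain. The sketch below assumes you have estimated that fraction by profiling. The function name is just illustrative.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Best-case speedup when only `parallel_fraction` of the work can be split.

    parallel_fraction: share of single-threaded run time that parallelises (0..1)
    processors:        number of processors the parallel part is spread over
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Compare a fully parallel workload with ones that spend time serialised.
    for fraction in (1.0, 0.8, 0.5):
        print(fraction, round(amdahl_speedup(fraction, 2), 2))
```

With two processors, a workload that is entirely parallel doubles in speed, while one that spends half its time serialised gains only about a third. This is the "quickly negate the advantage" effect described above.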

Add to that the extra effort of implementing and debugging a multithreaded application, and I suspect it won't be worth it.

You need to do some profiling of your two sets of tasks, ideally on a frame-by-frame basis as well as a longer term basis to judge whether they can be usefully parallelised.

Mark

Apart from Hyper-Threading, which will clearly be more prevalent in the mid-term future, most desktops only have one processor and are therefore not going to be able to take advantage of parallelised tasks. However, if you program for servers or mainframes (e.g. the backend of all those MMORPGWTFBBQs that people seem to want) then multiple processors will more than likely be available.

When you are at your prompt and you enter "Generate_probable_primes | Print_if_really_prime", your computer creates two processes: an instance of Generate_probable_primes and an instance of Print_if_really_prime. On most Unices, these can be automatically delegated to separate processors. They do rely on each other insofar as Generate_probable_primes passes its stdout to Print_if_really_prime's stdin, but otherwise they don't rely on each other very much and are therefore going to benefit from being separate processes.
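The same two-process pipeline can be sketched from Python with the standard `subprocess` module. The two stage scripts below are hypothetical stand-ins for Generate_probable_primes and Print_if_really_prime (the real programs are not shown in the thread). Each runs as its own OS process, so the scheduler is free to place them on different processors.

```python
import subprocess
import sys

# Stage 1 (stand-in for Generate_probable_primes): emit odd candidates.
GENERATE = "for n in range(3, 40, 2): print(n)"

# Stage 2 (stand-in for Print_if_really_prime): keep only the real primes.
FILTER = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    n = int(line)\n"
    "    if n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1)):\n"
    "        print(n)\n"
)


def run_pipeline():
    # Two separate processes, joined only by a pipe (stdout -> stdin).
    producer = subprocess.Popen([sys.executable, "-c", GENERATE],
                                stdout=subprocess.PIPE)
    consumer = subprocess.Popen([sys.executable, "-c", FILTER],
                                stdin=producer.stdout,
                                stdout=subprocess.PIPE, text=True)
    # Drop our copy of the pipe so the producer notices if the consumer exits.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return [int(token) for token in output.split()]
```

The only coupling between the stages is the pipe, which is why this kind of work splits across processors so easily.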

You can find more information on how Unix handles it in Advanced Programming in the UNIX Environment.

Design a flexible codepath system which will enable you to spawn some more worker threads if the machine is either a Hyper-Threading or a multi-CPU system.
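A minimal sketch of such a codepath switch, assuming the decision is based on the number of logical processors the OS reports. `choose_worker_count` and `run_tasks` are hypothetical names. `os.cpu_count()` counts logical processors, so a Hyper-Threading CPU shows up as two per core.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def choose_worker_count(logical_cpus=None, reserve_for_main=1):
    """Number of extra worker threads to spawn; 0 means stay single-threaded."""
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1  # cpu_count() can return None
    return max(0, logical_cpus - reserve_for_main)


def run_tasks(tasks, extra_workers):
    """Run zero-argument callables, with or without worker threads."""
    if extra_workers == 0:
        # Single-CPU path: no threads, so no threading overhead.
        return [task() for task in tasks]
    with ThreadPoolExecutor(max_workers=extra_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

Because the single-threaded path is still there, the threaded path can also be switched off by the user, as suggested earlier in the thread.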

The maximum extra work you get out of a second CPU is about 80%. That means that if your game is running at 30 fps on a single-CPU machine and you add a second unit, you'll probably get a speed boost up to 54 fps at most.
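The arithmetic behind the 54 fps figure, written out as a tiny function. The 80% is the poster's rule of thumb and is treated here as an assumed input, not a measured constant. The result is an upper bound that only applies when the game is limited by the CPU.

```python
def fps_with_second_cpu(base_fps, second_cpu_gain=0.8):
    """Upper-bound frame rate after adding a second CPU.

    second_cpu_gain is the extra throughput the second CPU contributes,
    as a fraction of the first one (0.8 = the 80% rule of thumb above).
    """
    return base_fps * (1.0 + second_cpu_gain)


if __name__ == "__main__":
    print(fps_with_second_cpu(30))  # 30 fps scaled by 1.8
```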

Quote:
Original post by Prozak
Design a flexible codepath system which will enable you to spawn some more worker threads if the machine is either a Hyper-Threading or a multi-CPU system.

The maximum extra work you get out of a second CPU is about 80%. That means that if your game is running at 30 fps on a single-CPU machine and you add a second unit, you'll probably get a speed boost up to 54 fps at most.
For home computing, sure.

On the other hand, there is a reason why the Earth Simulator has 5140 processors.

OP: if you have some serious number crunching to do or are running a server that serves many people, then multi-processor systems are very important. Consider telephone companies, banks, or geological departments at research universities. They have a lot of information to process.

However, for your videogames and whatnot, you likely don't need multiple processors since you're the only user and all tasks your computer is doing in a single application regard what you are doing. Hence they are going to be connected somehow, hence the 80% that Prozak mentions.

For playing media files, the bottleneck will be the memory and not your single ALU, so forget about 'what if I want to play two movies at once and junk'.

Quote:
Original post by flangazor
OP: if you have some serious number crunching to do or are running a server that serves many people, then multi-processor systems are very important. Consider telephone companies, banks, or geological departments at research universities. They have a lot of information to process.

However, for your videogames and whatnot, you likely don't need multiple processors since you're the only user and all tasks your computer is doing in a single application regard what you are doing. Hence they are going to be connected somehow, hence the 80% that Prozak mentions.

For playing media files, the bottleneck will be the memory and not your single ALU, so forget about 'what if I want to play two movies at once and junk'.


Yeah, I am doing heavy computing on a Hyper-Threading machine. I actually have a network of 4 computers working on the problem, which was surprisingly easy, but I think I might be able to gain some extra speed on my HT machine by finding a way to use threads. Although it might be worth more of my time to convert to SSE2/3DNow! instructions, given the large amount of number crunching I am doing...

Sorry to slightly derail the thread...

Dwiel

I would only consider threads for allowing a message pump to continue while you work on a reasonably large task and preserve "user experience" (god, what a wank word).

I would try to cut it up into processes instead of threads. The tradeoffs are that you will not be using the same stack (cannot share memory as easily) but that also means you will have your own stack that doesn't need to wait on other threads to stop poking into your memory.

If you want to start and end tasks quickly, threads do that with less overhead. However, it sounds like your processes will all be continuously running.

Edit: some of the above is incorrect and discussed below. It is kept here to maintain continuity for readers. This note is here to prevent multiple "omg you are wrong!" posts.

[Edited by - flangazor on November 25, 2004 7:37:22 PM]

flangazor: I'm fairly certain that threads have their own stack under both Windows and *n?x. I don't know about *n?x, but in Windows it is a nightmare to share data between processes. There are a few mechanisms for doing so, but they are extremely inefficient relative to the things you can do with threads sharing process memory.
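A small Python sketch of the distinction being made: each thread keeps its own local variables (its own stack), while module-level data is shared by every thread in the process and therefore needs a lock. The names `worker` and `run` are illustrative only.

```python
import threading

shared_counter = 0              # process memory: visible to every thread
counter_lock = threading.Lock()


def worker(iterations, local_totals, slot):
    global shared_counter
    local_total = 0             # local variable: private to this thread
    for _ in range(iterations):
        local_total += 1
        with counter_lock:      # serialise updates to the shared value
            shared_counter += 1
    local_totals[slot] = local_total


def run(threads=4, iterations=1000):
    """Start the workers and return (shared total, per-thread totals)."""
    global shared_counter
    shared_counter = 0
    local_totals = [0] * threads
    pool = [threading.Thread(target=worker, args=(iterations, local_totals, i))
            for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return shared_counter, local_totals
```

Sharing is as cheap as touching a variable, which is the convenience being pointed out here. The lock is the price, and it is where the waiting described earlier in the thread comes from.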

Perhaps you are thinking about microthreads (also called fibers on windows), which use cooperative multitasking?

Guest Anonymous Poster
Quote:
Original post by flangazor
I would try to cut it up into processes instead of threads.

In general, this is very bad advice. Under Windows, processes cost a heck of a lot more than threads.

A process is a container for threads. A process cannot execute any code itself; that's what the threads do.

One plus is that each process has a separate address space, which can be very desirable if you are hitting the ~2 GB to 3 GB virtual memory limit. Under WinXP 64, you can get 4 GB of virtual address space for free when running 32-bit apps (when indicating you are large-address aware), as the kernel is above the 4 GB line.

Hyperthreading isn't real multiprocessing. It speeds up certain things but it's not the same as having two CPU cores. Hyperthreading works to fill holes left by unoptimized code by eliminating the cost of context-switching. If you are already taking full advantage of instruction-level parallelism then HT won't help, or will make things worse due to poor scheduling.

Quote:
However, for your videogames and whatnot, you likely don't need multiple processors since you're the only user and all tasks your computer is doing in a single application regard what you are doing.
Hmm... most people need a GPU, don't they? DSP chip for sound? What about the Playstation 3 which will be massively parallel? Or the Xbox 2?

Your processor has multiple units for doing calculations, some working with integer data in general-purpose registers, others working with floating-point data in SSE registers, and so on.
Because in most cases there are more of these units than a single thread can keep busy, even when the processor does optimizations like out-of-order execution, a second thread can use the idle ones in parallel. (That's probably what 'igni ferroque' meant by 'to fill holes left by unoptimized code'.)
HT only helps with tasks that are CPU-bound. A second thread that is responsible for networking sleeps most of the time, and you won't benefit from HT there.
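To check whether a given thread is CPU-bound or mostly asleep, you can compare the CPU time it actually consumes. This is a rough sketch using `time.thread_time()` from the Python standard library (Python 3.7+, and not available on every platform). `crunch_numbers` and `wait_for_network` are made-up stand-ins for a compute thread and a networking thread.

```python
import threading
import time


def cpu_seconds_used(job):
    """Run job() in its own thread and return the CPU time that thread used."""
    used = []

    def wrapper():
        start = time.thread_time()  # CPU time of the current thread only
        job()
        used.append(time.thread_time() - start)

    t = threading.Thread(target=wrapper)
    t.start()
    t.join()
    return used[0]


def crunch_numbers():
    # CPU-bound stand-in: keeps the execution units busy the whole time.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def wait_for_network():
    # Stand-in for blocking on a socket: the thread sleeps and uses no CPU.
    time.sleep(0.2)
```

A thread like `wait_for_network` leaves nothing for a second logical processor to speed up, whereas two threads like `crunch_numbers` are the case HT is aimed at.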

Quote:
Original post by Anonymous Poster
Quote:
Original post by flangazor
I would try to cut it up into processes instead of threads.

In general, this is very bad advice. Under Windows, processes cost a heck of a lot more than threads.
They cost more in the sense that they take more time to start, need more memory for overhead, and make context switching more of a bother. However, if you are writing a distributed application, then processes are the way to go. You can't have separate threads of the same process running on different machines. It isn't scalable.

Also, in real-time systems (a lot of consumer products), separate processors being delegated to perform specific tasks is how things are done. You don't have a main program that spawns threads for your anti-lock brakes and the air conditioner.

Maybe I misread the OP's intention. They were thinking of the computer on their desk and I was thinking of the computers that serve the OP's webpages, rendered the special effects in their favourite movies (or even the whole movie if it's Pixar), calculated the weather patterns to tell them where hurricanes are going, perform telephone exchange switching, or even the several processors in their car handling the anti-lock brakes and shifting gears for them. Those are all pretty important to consumers.
Quote:
Hmm... most people need a GPU, don't they? DSP chip for sound? What about the Playstation 3 which will be massively parallel? Or the Xbox 2?
No. Most people do not need a GPU. Most computers are for business and government work. For example, the US government is the biggest customer of Microsoft products (who therefore gave them much of their monopoly). Computers for most businesses do not need stencil buffers and they don't need DSPs for audio.

Even if you were right, and most computers needed GPUs and DSPs since all computers are home computers for gaming, this only supports what I'm on about. There is going to be little speedup from threads because everything that is an obvious task for a thread has been removed from software and many successful companies make hardware to perform that task (you only have one frame buffer (generally) and one audio being output at once and one ethernet connection at once). Bringing us back to general purpose processors giving limited returns on home computers.

(FWIW, I consider embedded systems to be part of the computer industry and by extension, computers. Our potential disagreement may begin here.)

[Edited by - flangazor on November 25, 2004 7:19:52 PM]

Quote:
No. Most people do not need a GPU. Most computers are for business and government work. For example, the US government is the biggest customer of Microsoft products (who therefore gave them much of their monopoly). Computers for most businesses do not need stencil buffers and they don't need DSPs for audio.
You stated that video games don't need multiple processors. My reply was poorly worded, but that statement is wrong. The Playstation 3 and Xbox 2 are examples of gaming systems that will use multiple general purpose processors. The Xbox 2 will consist of at least two dual-core PowerPC CPUs. Perhaps you meant "multiple threads," which would be more accurate.

Quote:
Could someone drop a little pseudocode or knowledge on how we as programmers are supposed to take (full) advantage of this technology or how we as consumers should know when to buy such a thing?
There are lots of resources available discussing parallel programming and the advantages and disadvantages of threads vs. processes. Spend some time with Google.

[Edited by - igni ferroque on November 25, 2004 7:19:12 PM]
