Hyperthreading

There isn't anything you need to know about hyperthreading, except that it's good for multithreaded apps.
So keep using threads where it's useful; threading is what hyperthreading handles.

quote:
Original post by tok_junior
There isn't anything you need to know about hyperthreading, except that it's good for multithreaded apps.
So keep using threads where it's useful; threading is what hyperthreading handles.



So, to increase the performance of a program you simply split it in as many threads as possible?

There are way more issues with threading and hyperthreading than simply splitting an app into multiple threads.

You don't code for hyperthreading. You code for threads in a way that makes sense. If it doesn't make sense for your program to be multithreaded, making it multithreaded and running it on a hyperthreaded processor won't help.

quote:
Original post by Raab314159 ...split it in as many threads as possible?

And even if your program could be easily multithreaded, you don't want to split it up as much as possible. Hyperthreading, if I remember right, can only do a few things at the same time, maybe even only two, so with any more threads than that you wouldn't be increasing your performance anyway. A program running 32 threads on one single hyperthreaded processor, when it could be doing the same thing in 2 threads, is going to be absolutely insane.


int Agony() { return *((int*)0); }    Mwahaha... >8)

Guest Anonymous Poster
quote:
Original post by Raab314159
So, to increase the performance of a program you simply split it in as many threads as possible?

BIG NO! You want to keep the thread count down to achieve high performance. Here's why:

If a CPU runs more than one thread (which is always the case on any modern OS) it will of course need to switch between the threads. This is called context-switching.

A context switch is not free. For an application it means the OS needs to switch to kernel mode, save the CPU's state for the currently executing thread, find the next thread to schedule, restore that thread's CPU state, schedule that thread to run, and go back to user mode. The cost is on the order of several thousand CPU cycles.

With only a few executing threads, most of them not CPU-bound, this cost isn't a problem on a GHz machine. But if your application creates 20 threads on a single-CPU machine and each of them is generating prime numbers, you will start to notice the overhead of context switching.

Now, the cost of the actual context switch is only one of the problems. Cache misses may be worse. Each thread caches data in the L1 and L2 caches. With 20 active threads, say that thread 1 stores data in the cache, then thread 2 gets scheduled and stores more data. After some time the L1 cache will be full and thread 1's data will be evicted from the cache. Then when thread 1 gets scheduled to run again, it won't benefit from the L1 cache; instead it has to fetch its data from the L2 cache or probably even RAM.

Now you have designed an application that doesn't benefit from cache memory. This phenomenon is called "thrashing" the caches.


The optimal solution is 1 thread per CPU. However, that's often not practical. A typical example in a GUI application is a background thread for calculations, so that the UI isn't blocked. If you have 2 CPUs (which is almost what hyperthreading gives you), the UI thread and the background thread are likely to run on separate CPUs with this approach.
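
To make the "1 thread per CPU" rule concrete, here is a minimal C++ sketch (not from the original post) that sizes a pool of CPU-bound workers to the number of logical processors. The names is_prime, worker_count and count_primes are hypothetical, and the prime counting is only a stand-in for any CPU-heavy job. On a hyperthreaded machine std::thread::hardware_concurrency() normally reports the logical processors, but the standard only calls its result a hint.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Trial-division primality test; deliberately CPU-bound.
bool is_prime(unsigned n) {
    if (n < 2) return false;
    for (unsigned d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Hypothetical helper: one worker per logical CPU, falling back to 1
// because hardware_concurrency() may return 0 when the count is unknown.
unsigned worker_count() {
    return std::max(1u, std::thread::hardware_concurrency());
}

// Counts the primes in [0, limit) using `workers` threads.
std::size_t count_primes(unsigned limit, unsigned workers) {
    assert(workers > 0);
    std::vector<std::size_t> partial(workers, 0);
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&partial, limit, workers, w] {
            // Strided split: worker w tests w, w + workers, w + 2*workers, ...
            // The count is kept in a local and written once, so the threads
            // do not keep fighting over the same cache line.
            std::size_t local = 0;
            for (unsigned n = w; n < limit; n += workers)
                if (is_prime(n)) ++local;
            partial[w] = local;  // each thread writes only its own slot
        });
    }
    for (auto& t : pool) t.join();
    std::size_t total = 0;
    for (std::size_t c : partial) total += c;
    return total;
}
```

Calling count_primes(limit, worker_count()) gives one busy thread per logical CPU. Passing a much larger worker count produces the same answer but only adds the context-switch and cache pressure described above.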

A legacy server application (such as a web server or an e-mail server) might create one thread per connected client. This is a design people are generally moving away from, as it performs badly. On Windows you want to use I/O completion ports, or the .NET class library's asynchronous functions. Other platforms have similar solutions.


Performance aside, by randomly creating threads without really knowing why, you're likely to introduce bad design and hard-to-find bugs.

One problem is synchronization. Your threads are likely to share some data, and you will need to synchronize access to this data. You need a good understanding of this to make it efficient and to avoid deadlocks and starvation.
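
As an illustration of the synchronization point, the following is a small C++ sketch (not from the original post). Account, transfer and run_transfers are hypothetical names. Two threads move money in opposite directions, which is the classic setup for a deadlock if each thread takes the two locks in a different order. std::lock acquires both mutexes with a deadlock-avoidance algorithm.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical shared state: a balance guarded by its own mutex.
struct Account {
    std::mutex lock;
    long balance = 0;
};

// Moves `amount` from one account to the other.
void transfer(Account& from, Account& to, long amount) {
    assert(amount >= 0);
    if (&from == &to) return;  // never try to lock the same mutex twice
    // std::lock takes both mutexes without risking deadlock, whatever
    // order other threads ask for them in.
    std::lock(from.lock, to.lock);
    std::lock_guard<std::mutex> hold_from(from.lock, std::adopt_lock);
    std::lock_guard<std::mutex> hold_to(to.lock, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}

// Hypothetical stress run: two threads transfer in opposite directions.
long run_transfers(int rounds) {
    Account a, b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;  // the total is conserved if access was synchronized
}
```

With a C++17 compiler, std::scoped_lock guard(from.lock, to.lock); does the same job in one line. Without any locking, the two threads would race on the balances and the total could drift.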

There are a few old, but still good, MSDN articles around multithreading:

http://msdn.microsoft.com/library/techart/msdn_threads.htm

http://msdn.microsoft.com/library/en-us/dndllpro/html/msdn_threads.asp

quote:
With only a few executing threads, mostly not CPU-bound, this cost isn't a problem on a GHz machine. But if your application creates 20 threads on a single CPU machine and each of them is generating prime numbers, you will start to notice the overhead of context switching.


Hmm, I have >250 threads and 30 processes running on my machine right now, so I guess the kernel is doing some context switching as I write this message. I'm not sure whether any of them are generating prime numbers, but that's not the point. The point is that Windows (any version >=95) is a preemptive multitasking OS: even if your application runs in a single thread, you'll still have context switches. Of course creating more threads doesn't help, but it won't add much more overhead either.

That said, if you (the OP) want to speed up your program, randomly splitting it up into threads is NOT the way to go. If you read up on concurrent programming and multithreading (etc), you'll learn when to use threads and when not to use threads. Until then, stick to one thread!

[edited by - amag on March 26, 2004 6:31:55 PM]

If your app is written correctly, it will only be placed in the scheduling queue when and if it has work to do (i.e. Event Driven, No Polling!).
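
A portable way to picture "event driven, no polling" is a queue whose consumer sleeps on a condition variable until work arrives. This is only a sketch in standard C++, not the Windows message loop the thread is talking about. WorkQueue and sum_via_queue are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical work queue: consumers block in pop() instead of polling.
class WorkQueue {
public:
    void push(int item) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            items_.push(item);
        }
        ready_.notify_one();  // wake one sleeping consumer
    }

    // Blocks until an item is available. While it waits, the thread is not
    // runnable, so it costs no CPU time.
    int pop() {
        std::unique_lock<std::mutex> guard(mutex_);
        ready_.wait(guard, [this] { return !items_.empty(); });
        int item = items_.front();
        items_.pop();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> items_;
};

// Hypothetical demo: a consumer thread sums `count` items sent by the caller.
int sum_via_queue(int count) {
    assert(count >= 0);
    WorkQueue queue;
    int sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < count; ++i) sum += queue.pop();
    });
    for (int i = 1; i <= count; ++i) queue.push(i);
    consumer.join();  // after join, reading sum is safe
    return sum;
}
```

The polling alternative would be a loop that keeps checking whether the queue is empty, which keeps the thread in the scheduler's run queue even when there is nothing to do.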

The timer interrupt still fires all the time, and Windows checks from 50 to 100 times per second to see if it should switch threads, but it won't switch threads if it doesn't need to.

quote:
Original post by amag
Hmm, I have >250 threads and 30 processes running on my machine right now...

And probably none of them are doing any intensive computations, or at most one or two of them are. However, I am guessing that the intention of the original question was to speed up a general computationally complex algorithm. If you had 30 threads, all of which wanted as much CPU time as possible, then the context switching would become significant and would degrade your performance. If you have 30 threads that each need only around 2% of the CPU's power, then you're not going to have a problem at all. Context switching has another 40% of the CPU to work with. And most threads use much less than 2% of the CPU these days. But attempting to speed up processor-hungry algorithms by splitting them into more threads than you have separate (possibly simulated) processors will fail in almost all, if not all, cases.

At the moment I have 309 threads, but I certainly do not have 309 threads actually running. Instead, there are simply 309 threads that were created and not yet destroyed. But applications, and their threads, spend most of their time idle, meaning they told the OS not to run their code until they receive a message (at least that's how Windows works). So a word processor will do nothing until the user presses a key or uses the mouse, and if you Alt+Tab out of a game, its code will stop running until you switch back to it, if it's decently programmed.

That's why, in Windows XP, you can see nearly all of the processes are using 0% of the CPU.

~CGameProgrammer( );

Hyperthreading info (more than you can poke a stick at):

http://www.intel.com/cd/ids/developer/asmo-na/eng/microprocessors/ia32/pentium4/hyperthreading/index.htm

Some of the papers suggest its use almost as a form of micro-optimization. I still haven't played with the technology enough to make any kind of comment.

Guest Anonymous Poster
quote:
Original post by Magmai Kai Holmlor
If your app is written correctly, it will only be placed in the scheduling queue when and if it has work to do (i.e. Event Driven, No Polling!).

This is true for a client-side application with one or a few threads, but not for a server-side application designed to serve thousands of users per second.

If you write a server-side application and create one thread per user, you will be able to handle significantly fewer users than with async I/O. Async I/O and I/O completion ports are the way to go (refer to the MSDN articles about server programming, or Jeffrey Richter's books dealing with this).
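
The sketch below is NOT I/O completion ports and does no real asynchronous I/O. It is only a portable C++ illustration of the underlying idea that the number of threads should be decoupled from the number of clients. handle_request and serve_all are hypothetical names, and "handling" a client is reduced to a trivial computation.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for handling one client's request.
int handle_request(int client_id) { return client_id * 2; }

// Serves `clients` requests with a fixed pool of `workers` threads,
// instead of creating one thread per client.
std::vector<int> serve_all(std::size_t clients, unsigned workers) {
    assert(workers > 0);
    std::vector<int> replies(clients, 0);
    std::atomic<std::size_t> next{0};  // index of the next unserved client
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            // Each worker claims the next unserved client until none remain.
            // fetch_add hands out every index exactly once, so no two
            // workers ever write the same reply slot.
            for (std::size_t i = next.fetch_add(1); i < clients;
                 i = next.fetch_add(1)) {
                replies[i] = handle_request(static_cast<int>(i));
            }
        });
    }
    for (auto& t : pool) t.join();
    return replies;
}
```

Here serve_all(1000, 4) answers a thousand simulated clients with four threads. A real server would combine a small pool like this with the platform's asynchronous I/O facility, so that the workers never block while waiting on a socket.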
